KeyError on open_world_eval.py - custom dataset

Hi Orr,

I'm trying to run the code on my custom data. Now the training starts but it seems like it stops by the end of the first epoch when it's evaluating the results, but I'm not sure why this is the case. Do you have any ideas about what could be causing this issue?

What I'm running:

python -u main_open_world.py \
    --output_dir "${EXP_DIR}/t1" --dataset fathomnet --PREV_INTRODUCED_CLS 0 --CUR_INTRODUCED_CLS 10\
    --train_set 'task1_train_ft' --test_set 'all_test' --epochs 1\
    --model_type 'prob' --obj_loss_coef 8e-4 --obj_temp 1.3\
    --wandb_name "${WANDB_NAME}_t1" --exemplar_replay_selection --exemplar_replay_max_length 850\
    --exemplar_replay_dir ${WANDB_NAME} --exemplar_replay_cur_file "task1_train_ft.txt"\
    ${PY_ARGS}

The error:

Epoch: [0] Total time: 0:34:28 (2.0770 s / it)
Averaged stats: lr: 0.000200  class_error: 94.23  grad_norm: 79.76  loss: 17.0468 (22.1405)  loss_bbox: 0.5209 (0.8674)  loss_bbox_0: 0.5404 (0.8677)  loss_bbox_1: 0.5110 (0.8647)  loss_bbox_2: 0.4962 (0.8638)  loss_bbox_3: 0.5127 (0.8650)  loss_bbox_4: 0.5242 (0.8643)  loss_ce: 0.9692 (1.1414)  loss_ce_0: 0.9991 (1.1387)  loss_ce_1: 0.9678 (1.1351)  loss_ce_2: 0.9384 (1.1382)  loss_ce_3: 0.9560 (1.1357)  loss_ce_4: 0.9526 (1.1354)  loss_giou: 1.2438 (1.5470)  loss_giou_0: 1.2195 (1.5511)  loss_giou_1: 1.2046 (1.5455)  loss_giou_2: 1.2071 (1.5467)  loss_giou_3: 1.1704 (1.5451)  loss_giou_4: 1.1755 (1.5464)  loss_obj_ll: 0.0963 (0.1336)  loss_obj_ll_0: 0.1126 (0.1440)  loss_obj_ll_1: 0.1067 (0.1403)  loss_obj_ll_2: 0.1058 (0.1419)  loss_obj_ll_3: 0.1018 (0.1404)  loss_obj_ll_4: 0.1040 (0.1409)  cardinality_error_unscaled: 2.4000 (3.4721)  cardinality_error_0_unscaled: 2.5000 (3.4544)  cardinality_error_1_unscaled: 2.6000 (3.4567)  cardinality_error_2_unscaled: 2.6000 (3.4612)  cardinality_error_3_unscaled: 2.6000 (3.4592)  cardinality_error_4_unscaled: 2.5000 (3.4591)  class_error_unscaled: 95.8333 (98.9971)  loss_bbox_unscaled: 0.1042 (0.1735)  loss_bbox_0_unscaled: 0.1081 (0.1735)  loss_bbox_1_unscaled: 0.1022 (0.1729)  loss_bbox_2_unscaled: 0.0992 (0.1728)  loss_bbox_3_unscaled: 0.1025 (0.1730)  loss_bbox_4_unscaled: 0.1048 (0.1729)  loss_ce_unscaled: 0.4846 (0.5707)  loss_ce_0_unscaled: 0.4996 (0.5693)  loss_ce_1_unscaled: 0.4839 (0.5676)  loss_ce_2_unscaled: 0.4692 (0.5691)  loss_ce_3_unscaled: 0.4780 (0.5679)  loss_ce_4_unscaled: 0.4763 (0.5677)  loss_giou_unscaled: 0.6219 (0.7735)  loss_giou_0_unscaled: 0.6098 (0.7756)  loss_giou_1_unscaled: 0.6023 (0.7727)  loss_giou_2_unscaled: 0.6036 (0.7733)  loss_giou_3_unscaled: 0.5852 (0.7726)  loss_giou_4_unscaled: 0.5877 (0.7732)  loss_obj_ll_unscaled: 120.4147 (166.9937)  loss_obj_ll_0_unscaled: 140.7218 (179.9712)  loss_obj_ll_1_unscaled: 133.4184 (175.3988)  loss_obj_ll_2_unscaled: 132.2273 (177.4261)  loss_obj_ll_3_unscaled: 127.2509 (175.5592)  loss_obj_ll_4_unscaled: 129.9692 (176.1760)
testing data details
21
20
('Urchin', 'Fish', 'Sea star', 'Anemone', 'Sea cucumber', 'Sea pen', 'Sea fan', 'Worm', 'Crab', 'Gastropod')
('Urchin', 'Fish', 'Sea star', 'Anemone', 'Sea cucumber', 'Sea pen', 'Sea fan', 'Worm', 'Crab', 'Gastropod', 'Shrimp', 'Soft coral', 'Glass sponge', 'Feather star', 'Eel', 'Squat lobster', 'Barnacle', 'Stony coral', 'Black coral', 'Sea spider', 'unknown')
/home/sabrina/code/PROB/models/prob_deformable_detr.py:537: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  topk_boxes = topk_indexes // out_logits.shape[2]
Test:  [  0/167]  eta: 0:02:11    time: 0.7896  data: 0.4684  max mem: 22891
/home/sabrina/code/PROB/models/prob_deformable_detr.py:537: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  topk_boxes = topk_indexes // out_logits.shape[2]
Test:  [ 10/167]  eta: 0:00:57    time: 0.3635  data: 0.0485  max mem: 22891
Test:  [ 20/167]  eta: 0:00:50    time: 0.3197  data: 0.0066  max mem: 22891
Test:  [ 30/167]  eta: 0:00:46    time: 0.3211  data: 0.0065  max mem: 22891
Test:  [ 40/167]  eta: 0:00:42    time: 0.3221  data: 0.0063  max mem: 22891
Test:  [ 50/167]  eta: 0:00:38    time: 0.3234  data: 0.0064  max mem: 22891
Test:  [ 60/167]  eta: 0:00:35    time: 0.3250  data: 0.0065  max mem: 22891
Test:  [ 70/167]  eta: 0:00:31    time: 0.3213  data: 0.0062  max mem: 22891
Test:  [ 80/167]  eta: 0:00:28    time: 0.3114  data: 0.0060  max mem: 22891
Test:  [ 90/167]  eta: 0:00:24    time: 0.3013  data: 0.0059  max mem: 22891
Test:  [100/167]  eta: 0:00:21    time: 0.3026  data: 0.0058  max mem: 22891
Test:  [110/167]  eta: 0:00:18    time: 0.3170  data: 0.0060  max mem: 22891
Test:  [120/167]  eta: 0:00:15    time: 0.3320  data: 0.0065  max mem: 22891
Test:  [130/167]  eta: 0:00:11    time: 0.3400  data: 0.0068  max mem: 22891
Test:  [140/167]  eta: 0:00:08    time: 0.3419  data: 0.0067  max mem: 22891
Test:  [150/167]  eta: 0:00:05    time: 0.3419  data: 0.0067  max mem: 22891
Test:  [160/167]  eta: 0:00:02    time: 0.3455  data: 0.0068  max mem: 22891
Test:  [166/167]  eta: 0:00:00    time: 0.3433  data: 0.0066  max mem: 22891
Test: Total time: 0:00:54 (0.3289 s / it)
Urchin has 5789 predictions.
wandb: Waiting for W&B process to finish... (failed 1). Press Control-C to abort syncing.
Traceback (most recent call last):
  File "/home/sabrina/code/PROB/main_open_world.py", line 475, in <module>
    main(args)
  File "/home/sabrina/code/PROB/main_open_world.py", line 343, in main
    test_stats, coco_evaluator = evaluate(
  File "/home/sabrina/mambaforge/envs/prob/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/sabrina/code/PROB/engine.py", line 151, in evaluate
    coco_evaluator.accumulate()
  File "/home/sabrina/code/PROB/datasets/open_world_eval.py", line 136, in accumulate
    self.num_unk, self.tp_plus_fp_closed_set, self.fp_open_set = voc_eval(lines_by_class, \
  File "/home/sabrina/code/PROB/datasets/open_world_eval.py", line 410, in voc_eval
    R = class_recs[image_ids[d]]
KeyError: '3895_'

orrzohar / PROB

KeyError on open_world_eval.py - custom dataset #33