egeozsoy / 4D-OR

Official code of the paper 4D-OR: Semantic Scene Graphs for OR Domain Modeling accepted at MICCAI 2022. This repo includes both the dataset and our code.

Error: missing '2_GT_True.npz' when inferring on the test set #2

Closed: PJLallen closed this issue 1 year ago

PJLallen commented 1 year ago

Hi, I tried to run inference on the test set using 3DSSG with `python -m scene_graph_prediction.main --config no_gt_image.json`, but the following error is displayed:

```
Original Traceback (most recent call last):
  File "/home/allen/anaconda3/envs/4dor/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/allen/anaconda3/envs/4dor/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/allen/anaconda3/envs/4dor/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/allen/MI-projects/4D-OR/scene_graph_prediction/scene_graph_helpers/dataset/or_dataset.py", line 88, in __getitem__
    human_name_to_3D_joints = np.load(str(OR_4D_DATA_ROOT_PATH / 'human_name_to_3D_joints' / f'{take_idx}_GT_True.npz'), allow_pickle=True)[
  File "/home/allen/anaconda3/envs/4dor/lib/python3.7/site-packages/numpy/lib/npyio.py", line 417, in load
    fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: '/4D-OR/human_name_to_3D_joints/2_GT_True.npz'
```

I checked the dataset and found that there is indeed no '2_GT_True.npz' or '6_GT_True.npz'. How can I get the inference results? Thank you.
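
A quick standalone check (not repo code) for which takes ship a `*_GT_True.npz` file; the data root below is just the path from the traceback above, and the take index range is an assumption:

```python
# Standalone check: which takes have a ground-truth 3D joints file?
from pathlib import Path

data_root = Path('/4D-OR')  # path taken from the traceback above; adjust to your setup
for take_idx in range(1, 11):  # assumed take indices; takes 2 and 6 are the hidden test takes
    gt_file = data_root / 'human_name_to_3D_joints' / f'{take_idx}_GT_True.npz'
    print(take_idx, 'found' if gt_file.exists() else 'missing')
```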

egeozsoy commented 1 year ago

The test set is hidden, and we do not provide labels for it. Instead, you need to run inference and use our website https://bit.ly/4D-OR_evaluator to evaluate scene graph generation results on the test set. Hope this helps. Feel free to ask if anything is unclear.

PJLallen commented 1 year ago

Because of the above error, I am not able to run inference with your model, so I cannot evaluate further online. I'm just curious why the model needs 'GT_True.npz' during the inference phase?

egeozsoy commented 1 year ago

You are right, thanks for bringing this to our attention. That file is indeed not actually used, but the current implementation required it. I have made a quick fix; can you pull the newest version of the repository and see if it works for you now?
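
The spirit of such a fix might be a guard like the sketch below (this is not the actual commit; the `.npz` key and the dict unpacking are assumptions), so the loader simply skips the ground-truth joints when the file is absent, as it is for the test takes:

```python
# Sketch of the idea, not the actual repo fix: load the ground-truth 3D joints
# only if the file exists; otherwise fall back to an empty dict so inference on
# the hidden test takes (which ship no GT joints) does not crash.
from pathlib import Path
import numpy as np

def load_human_joints(data_root: Path, take_idx: int) -> dict:
    joints_path = data_root / 'human_name_to_3D_joints' / f'{take_idx}_GT_True.npz'
    if not joints_path.exists():
        return {}
    # 'arr_0' and the .item() unpacking are assumptions about how the dict was saved
    return np.load(str(joints_path), allow_pickle=True)['arr_0'].item()
```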

PJLallen commented 1 year ago

It can be run after deleting lines 88 and 89 of dataset_utils.py:

```python
if value in human_name_to_3D_joints[pcd_idx]:
    instance_label_to_hand_locations[key] = human_name_to_3D_joints[pcd_idx][value][8:10]
```

Thanks.

BTW, I'm also wondering whether the online evaluator uses the same metric as the 'classification_report' used for eval, because 'classification_report' needs the number of 'rel_gts' to match the number of 'rel_preds' exactly during evaluation.

egeozsoy commented 1 year ago

Thanks for the bug fix suggestion.

Your point regarding evaluation is indeed correct, as we state in our readme:

> You can evaluate on the test set as well by using https://bit.ly/4D-OR_evaluator and uploading your inferred predictions. Be aware that compared to the evaluation in the paper, this evaluation does not require human poses to be available, and therefore can slightly overestimate the results. We get a macro 0.76 instead of 0.75 on the test set.

The results can deviate, but only very slightly. Starting from our most recent work (https://arxiv.org/abs/2303.13293), we also report the results obtained from this tool, so they should be comparable.

PJLallen commented 1 year ago

Thanks for your kind feedback.

Actually, I want to know whether https://bit.ly/4D-OR_evaluator computes its metric in the same way as the one used for eval:

```python
cls_report = classification_report(all_rel_gts, all_rel_preds, labels=list(range(len(self.relationNames))), target_names=self.relationNames)
```

I found that if the model does not use 'GT_True.npz' in "evaluate", the number of 'all_rel_preds' no longer matches the number of 'all_rel_gts', which causes an error when computing 'classification_report'. But the same predictions run successfully on https://bit.ly/4D-OR_evaluator.
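
For what it's worth, `sklearn.metrics.classification_report` rejects mismatched lengths before computing anything, which is why the local script errors out while the evaluator does not. A minimal illustration with made-up relation names:

```python
# Minimal illustration: classification_report requires y_true and y_pred of
# equal length; a mismatch raises a ValueError before any metric is computed.
from sklearn.metrics import classification_report

rel_names = ['none', 'holding', 'cutting']  # hypothetical relation names
y_true = [1, 2, 0, 1]
y_pred = [1, 2, 0]  # one prediction short
try:
    classification_report(y_true, y_pred, labels=[0, 1, 2], target_names=rel_names)
except ValueError as err:
    print(err)  # inconsistent numbers of samples

y_pred_aligned = [1, 2, 0, 0]  # once aligned to the ground-truth length, it works
print(classification_report(y_true, y_pred_aligned, labels=[0, 1, 2], target_names=rel_names))
```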

egeozsoy commented 1 year ago

Indeed the evaluator deals with this differently. The logic looks like the following:

1. Create a template preds variable, initialized with 'none' and with the length of the ground truth.
2. For each individual triplet in the predictions uploaded by the user, do one of the following: if its subject and object match a ground-truth entry, write its predicate into that position of the template; otherwise, discard it.

This way, we guarantee that all_rel_gts and all_rel_preds have the same length. Essentially, predictions that did not match the subject and object in the ground truth are thrown out.
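
In code, that alignment could look roughly like this (a sketch of the described logic, not the evaluator's source; the (subject, predicate, object) tuple format is an assumption):

```python
# Sketch of the alignment described above, not the evaluator's actual code:
# start from a 'none' template with the length of the ground truth, copy in each
# uploaded prediction whose (subject, object) pair matches a ground-truth entry,
# and drop everything else.
def align_predictions(gt_triplets, pred_triplets, none_label='none'):
    aligned_preds = [none_label] * len(gt_triplets)
    gt_index = {(sub, obj): i for i, (sub, _, obj) in enumerate(gt_triplets)}
    for sub, rel, obj in pred_triplets:
        i = gt_index.get((sub, obj))
        if i is not None:
            aligned_preds[i] = rel  # matched subject/object pair
        # unmatched predictions are thrown out
    gt_rels = [rel for _, rel, _ in gt_triplets]
    return gt_rels, aligned_preds
```

With gt_rels and aligned_preds of equal length, a classification report over the relation names then goes through without the mismatch error.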

PJLallen commented 1 year ago

I've got it, thank you so much!