theneotopia closed 2 years ago
Hi, did you manage to resolve this? Any help is much appreciated! Thanks
Hi there,
It's been a while since this issue came up. If my memory is correct, the bug is triggered by the following snippet:
https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch/blob/4b6b71a90d4198d9dae574d42b062a5e534da291/maskrcnn_benchmark/data/datasets/visual_genome.py#L177-L183
Several images have only 1 bounding box, and the code above removes them when making predictions.
https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch/blob/d0ffa40d92133d7d865e531146de82c8c8a344c0/maskrcnn_benchmark/engine/inference.py#L40-L45
When performing `all_gather` to obtain predictions from multiple GPUs, those empty predictions cause the keys (`img_id`) and values (`result`) in the `multi_gpu_predictions` dictionary to be completely mismatched, starting from the `img_id` of an empty prediction.
Thus, during evaluation, "Num of GT boxes is not matching with num of pred boxes in SGCLS" is reported because the prediction does not even belong to that image.
Setting `SYNC_GATHER=False` should solve this problem. Hope this helps ^_^ (I checked my git log; the commit message only includes the solution, not the cause, so I hope my analysis is not wrong : ) )
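To make the key/value shift described above concrete, here is a minimal toy sketch in plain Python. It is not the repository's code: the box counts, `predict`, and `mismatched_ids` are hypothetical names made up for illustration, and it only assumes that single-box images yield an empty result that gets dropped before ids and results are paired.

```python
# Toy illustration only: the ids, box counts, and helper names are hypothetical.
gt_num_boxes = {0: 3, 1: 1, 2: 4, 3: 2}  # ground-truth box count per img_id


def predict(img_id):
    """Stand-in model: one predicted box per GT box, or None for single-box images."""
    n = gt_num_boxes[img_id]
    return [f"img{img_id}_box{k}" for k in range(n)] if n > 1 else None


def mismatched_ids(predictions):
    """Return img_ids whose predicted box count differs from the GT box count."""
    return [i for i, boxes in predictions.items() if len(boxes) != gt_num_boxes[i]]


image_ids = [0, 1, 2, 3]
results = [predict(i) for i in image_ids]

# Shifted pairing: empty results are dropped first, then zipped against all ids,
# so every result after the first empty one lands on the wrong img_id.
non_empty = [r for r in results if r is not None]
shifted = dict(zip(image_ids, non_empty))

# Aligned pairing: each result stays attached to its own img_id before filtering.
aligned = {i: r for i, r in zip(image_ids, results) if r is not None}

print(mismatched_ids(shifted))  # [1, 2]
print(mismatched_ids(aligned))  # []
```

In the shifted case, every image from the first empty prediction onward is compared against another image's boxes, which is the kind of situation where a GT/pred box count check would fail.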
Sounds good, thanks a lot.
-- Thanks and Warm Regards
Vardaan Pahuja
❓ Questions and Help
Hi @KaihuaTang, thanks for your excellent work. There is one question I'd like to ask you. When evaluating SGCLS performance, the message "Num of GT boxes is not matching with num of pred boxes in SGCLS" is logged frequently, and all reported results are 0. Sometimes training gets stuck after PRE_VAL. This abnormal behavior ONLY occurs in multi-GPU training (I use 4 2080Ti GPUs with a total batch size of 16). Everything is fine when I use ONLY 1 GPU for training and evaluation. Any idea about this?