Hi, I would first like to thank you for your contribution. However, I think there are some issues with the evaluation.
For "SGDet" mode, you use "use_gt_filter" to filter out those boxes with less IoU overlaps . However, we have no ground truths during test. This is unfair to calculate the recall. And I have no found other codes use this method (https://github.com/rowanz/neural-motifs/blob/master/lib/evaluation/sg_eval.py, https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch/blob/master/maskrcnn_benchmark/data/datasets/evaluation/vg/sgg_eval.py).
Could you release the checkpoint? Also, the code cannot reproduce the results reported in the paper. I suspect the evaluation is unfair and incorrect, and that the true performance is lower than reported. I have run experiments with your code and fixed some bugs, but that did not resolve the concerns above.