NVIDIA / ContrastiveLosses4VRD

Implementation for the CVPR2019 paper "Graphical Contrastive Losses for Scene Graph Generation"

[Question] SGDet vs SGCls (VG) #20

Open sharifza opened 4 years ago

sharifza commented 4 years ago

I have a question. I don't understand why, on Visual Genome, SGDet shows such a small improvement over Neural Motifs whereas SGCls shows a much larger one. Isn't the only difference in the region proposal network?

sharifza commented 4 years ago

Now I understand that your reported numbers are in fact not comparable to Neural Motifs. I consider this some sort of [unintended?] mistake in reporting the results.

In NM (and most previous works), SGCls is defined as a setting where bounding boxes are given but edges are not, and we evaluate the quality of the "detected" and "classified" edges. In your work, you have changed the definition of SGCls to a setting where both bounding boxes and edges are given, and the goal is only to evaluate the quality of "classifying" the edges. While I understand your motivation behind this change (given the name "Scene Graph Classification"), putting both under the same title in the table will totally mislead the community.
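To make the difference concrete, here is a toy sketch of recall@K under the two definitions. This is not code from this repository or from Neural Motifs: the function `recall_at_k`, the scores, and the boxes are all made up for illustration. The only thing that changes between the two calls is which (subject, object) pairs are allowed to enter the ranking.

```python
from itertools import permutations


def recall_at_k(scores, gt_triplets, candidate_pairs, k):
    """Fraction of ground-truth triplets found among the top-k ranked triplets.

    scores: dict mapping (subj, obj, pred) -> float (hypothetical model output)
    gt_triplets: set of annotated (subj, obj, pred) triplets
    candidate_pairs: (subj, obj) pairs that are allowed to be ranked
    """
    allowed = set(candidate_pairs)
    ranked = sorted(
        (t for t in scores if (t[0], t[1]) in allowed),
        key=lambda t: scores[t],
        reverse=True,
    )
    top_k = set(ranked[:k])
    return len(top_k & gt_triplets) / len(gt_triplets)


# Toy image: three boxes and one annotated relation, box 0 "on" box 1.
boxes = [0, 1, 2]
gt = {(0, 1, "on")}
scores = {
    (0, 1, "on"): 0.6,
    (0, 1, "near"): 0.3,
    (2, 1, "on"): 0.9,    # confident prediction on a pair with no annotation
    (1, 2, "near"): 0.8,
    (0, 2, "near"): 0.7,
}

# Edges not given: every ordered pair of boxes competes in the ranking.
all_pairs = list(permutations(boxes, 2))
# Edges given: only the annotated pairs are ranked.
gt_pairs = {(s, o) for s, o, _ in gt}

r_all = recall_at_k(scores, gt, all_pairs, k=1)
r_given = recall_at_k(scores, gt, gt_pairs, k=1)
print(r_all, r_given)
```

With the same scores, the ground-truth triplet is pushed out of the top 1 when all pairs compete, but is trivially first when only the annotated pair is ranked. That is why numbers from the two settings should not sit in the same column.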

bknyaz commented 4 years ago

@sharifza if you could share the code fixing the evaluation of the models in this repo, it would be great! I still see they rank triplets here https://github.com/NVIDIA/ContrastiveLosses4VRD/blob/master/lib/datasets_rel/task_evaluation_vg_and_vrd.py#L84, so I'm not sure where exactly their evaluation goes wrong.

sharifza commented 3 years ago

@bknyaz I avoided using this repository for my research. No one responded to my complaint for a year. The evaluation issue mentioned above affects the heart of this paper's contribution and calls the validity of everything into question. There are other repositories that I recommend you take a look at: Neural Motifs [PyTorch 0.3], Depth-VRD (Neural Motifs [PyTorch > 1.0]), and the recent benchmark by @kaihuatang. Kaihua also pointed out this issue here (Two Common Misunderstandings in SGG Metrics).

sigeek commented 3 years ago

The main problem is that the evaluation for VRD and VG is done in the same file even though the metrics are slightly different. The metrics used in VRD are the following:

The metrics used in VG are:

In PredDet, the (subject, object) pairs are given, as pointed out in this issue, whereas in PredCls and SGCls they are not. This is the problem with this implementation.

Hope this helps! 👍