alirezazareian / vspnet

Code for the CVPR 2020 oral paper: Weakly Supervised Visual Semantic Parsing
https://arxiv.org/abs/2001.02359

SGCls and PREDCls #15

Closed by yekeren 3 years ago

yekeren commented 3 years ago

Dear author, I would like to know the details of the evaluation. For SGCls and PREDCls, the model should know the box information for both subject and object. So how do you determine whether the 20x1536 GT-box features are associated with the subject or the object? It seems the 20x1536 GT-box features are just the same as the proposal features.

alirezazareian commented 3 years ago

Not sure if I understood your questions correctly. In SGCls, we treat GT boxes as proposals, and we get every possible pair and classify their predicates. So for every pair A, B, we both evaluate A as subject and B as object, as well as the other way around. At the end all the predicted triplets (e.g. A-x-B and B-y-A) are sorted and the top 100 are compared to GT triplets, and we count how many of the GT triplets are correctly recovered.
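The protocol described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual evaluation code: `recall_at_k`, `score_pair`, and `toy_scorer` are hypothetical names, and entities are identified by box id only, which simplifies away the object class labels that SGCls also has to predict.

```python
from itertools import permutations


def recall_at_k(boxes, score_pair, gt_triplets, k=100):
    """Recall of GT triplets among the top-k predicted triplets.

    boxes:       list of box ids (GT boxes treated as proposals).
    score_pair:  hypothetical callable (subj, obj) -> list of (predicate, score)
                 candidates for that ordered pair.
    gt_triplets: set of (subj, predicate, obj) tuples.
    """
    candidates = []
    # Ordered pairs: every (A, B) is scored, and so is (B, A).
    for subj, obj in permutations(boxes, 2):
        for predicate, score in score_pair(subj, obj):
            candidates.append((score, (subj, predicate, obj)))

    # Sort all predicted triplets together and keep the k best.
    candidates.sort(key=lambda c: c[0], reverse=True)
    top_k = {triplet for _, triplet in candidates[:k]}

    if not gt_triplets:
        return 0.0
    return len(top_k & gt_triplets) / len(gt_triplets)


# Toy scorer standing in for a model (assumed values, for illustration only).
_TOY_SCORES = {
    ("A", "B"): [("on", 0.9)],
    ("B", "A"): [("under", 0.8)],
}


def toy_scorer(subj, obj):
    return _TOY_SCORES.get((subj, obj), [("near", 0.1)])


gt = {("A", "on", "B"), ("B", "under", "A")}
print(recall_at_k(["A", "B", "C"], toy_scorer, gt, k=100))
```

Because both orderings of each pair are scored, the model never needs to be told which box is the subject and which is the object; the direction is part of what gets ranked.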

yekeren commented 3 years ago

Yes, this is what I understood about your implementation. However, it seems to me that in SGCls and PREDCls, "GT boxes" means GT box pairs. For example, SGCls means: given two boxes (the subject box and the object box), predict the subject-predicate-object text labels. PREDCls means: given (subject, subject-box, object, object-box), predict the predicate label. So the eval protocol is actually different (though I admit your scenario should be more challenging).

yekeren commented 3 years ago

Sorry, I think I now understand your point. Your eval protocol is correct.

For the eval protocol, the IMP paper says: "(SGCLS) task is to predict the predicate as well as the object categories of the subject and the object in every pairwise relationship given a set of localized objects."

Your description: "we also report SGCLS, which assumes ground truth bounding boxes are given at test time, instead of proposals. Another metric, PREDCLS assumes ground truth bounding [boxes] are given, and true object classes are given too." I now see that your evaluation is the same as IMP's.

Sorry for the misunderstanding.