I'm confused about the evaluation of predicate detection.
The paper says:
"our input is an image and set of localized objects. The task is to predict a set of possible predicates between pairs of objects"
But in predicate_detection.m, the predictions are made for each groundtruth relation. It takes a groundtruth (subject, object) pair as input and classifies the predicate.
If that is the case, the model has access to not only an image and set of localized objects, but also:
1. Whether or not a pair of objects has a relation
2. (If they do have a relation) The order of the two objects (which one is the subject and which one is the object)
Could you please clarify what the correct setup for the predicate detection task is? Thanks.
Yes, the model knows which one is the subject and which one is the object. The model can also predict that no relationship exists between the given subject-object pair.
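To make the setup discussed above concrete, here is a minimal Python sketch of how I read it: the evaluated pairs come directly from the ground truth, and the model only has to score predicates for each ordered (subject, object) pair. This is not a port of predicate_detection.m. The function `predicate_recall_at_k`, the `score_predicates` callback, and the toy numbers are all hypothetical, and ranking every (pair, predicate) candidate in one pool is an assumption. The sketch also does not model a "no relationship" output.

```python
from typing import Callable, Sequence, Tuple

# (subject_index, predicate_id, object_index); all names here are hypothetical.
Triple = Tuple[int, int, int]


def predicate_recall_at_k(
    gt_triples: Sequence[Triple],
    score_predicates: Callable[[int, int], Sequence[float]],
    k: int,
) -> float:
    """Recall@k for one image when ground-truth pairs are given as input."""
    # The pairs are read off the ground truth, so the model is implicitly told
    # which pairs are related and which object plays the subject role.
    pairs = sorted({(s, o) for s, _, o in gt_triples})

    # Score every predicate for every given ordered pair.
    candidates = []
    for s, o in pairs:
        for p, score in enumerate(score_predicates(s, o)):
            candidates.append((score, (s, p, o)))

    # Assumption: all candidates in the image are ranked together, top k kept.
    candidates.sort(key=lambda c: c[0], reverse=True)
    top = {triple for _, triple in candidates[:k]}

    hits = sum(1 for triple in gt_triples if triple in top)
    return hits / len(gt_triples) if gt_triples else 0.0


def toy_scores(s: int, o: int) -> Sequence[float]:
    # Made-up scores over 3 predicates, for illustration only.
    table = {(0, 1): [0.1, 0.9, 0.0], (1, 2): [0.8, 0.15, 0.05]}
    return table[(s, o)]


gt = [(0, 1, 1), (1, 2, 2)]
print(predicate_recall_at_k(gt, toy_scores, k=2))
```

Because the candidate pairs are taken from `gt`, a model evaluated this way never has to decide whether two objects are related or which one is the subject, which is exactly the extra information the question points out.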