The Evaluation of Predicate Detection

Hello,

I'm confused about the evaluation of predicate detection. The paper says:

"our input is an image and set of localized objects. The task is to predict a set of possible predicates between pairs of objects"

But in predicate_detection.m, predictions are made once per ground-truth relation: the script takes a ground-truth (subject, object) pair as input and classifies the predicate. If that is the case, the model has access not only to the image and the set of localized objects, but also to:

1. Whether or not a pair of objects has a relation
2. If the pair does have a relation, the order of the two objects (which one is the subject and which one is the object)
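
To make the difference concrete, here is a small Python sketch of the two candidate sets I have in mind. It is not the repository's MATLAB code: the function names and the toy annotations are hypothetical, and the first function only reflects my reading of predicate_detection.m.

```python
from itertools import permutations


def candidate_pairs_from_ground_truth(gt_relations):
    """One (subject, object) pair per annotated relation.

    This mirrors my reading of predicate_detection.m: the pairs to
    classify are taken directly from the ground-truth triples.
    """
    return [(subj, obj) for (subj, _predicate, obj) in gt_relations]


def candidate_pairs_from_objects(num_objects):
    """Every ordered pair of distinct objects.

    This is what is available when the input is only the image and
    the set of localized objects, as described in the paper.
    """
    return list(permutations(range(num_objects), 2))


# Toy image (hypothetical): 3 localized objects, 2 annotated relations.
gt_relations = [(0, "ride", 1), (0, "wear", 2)]

gt_pairs = candidate_pairs_from_ground_truth(gt_relations)
all_pairs = candidate_pairs_from_objects(3)

print(len(gt_pairs), len(all_pairs))  # 2 6
```

In the first setup the model never has to score pairs such as (1, 0) or (1, 2), so it is implicitly told which pairs are related and in which direction.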

Could you please clarify what the correct setup for the predicate detection task is? Thanks.

