GriffinLiang / vrd-dsr

Code for Visual Relationship Detection with Deep Structural Ranking (AAAI2018)

Question on the object detection phase #6

Closed yangshao closed 6 years ago

yangshao commented 6 years ago

The detection phase requires both bounding boxes and bounding-box labels. Are you using Fast R-CNN's pre-trained model as is, or did you re-train Fast R-CNN on the VRD and VG datasets? Also, since the ranking loss requires negative predicates, how do you sample them?

GriffinLiang commented 6 years ago

For the detection phase, it is better to start from a model pre-trained on the Pascal or COCO dataset. For the proposed ranking loss, we use all the negative predicates without sampling.
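To make "all the negative predicates without sampling" concrete, here is a minimal sketch of a hinge-style ranking loss that pairs each positive predicate with every other predicate in the vocabulary. It is an illustration only, not the repository's implementation: the function name `ranking_loss_all_negatives`, the fixed `margin`, and the toy scores are assumptions, and the paper's actual loss may weight or structure the margin differently.

```python
def ranking_loss_all_negatives(scores, positive_ids, margin=1.0):
    """Sum of hinge terms over every (positive, negative) predicate pair.

    scores       -- one score per predicate for a single object pair
    positive_ids -- indices of the annotated (positive) predicates
    margin       -- assumed constant margin (hypothetical choice)
    """
    positives = set(positive_ids)
    # Every predicate that is not annotated counts as a negative: no sampling.
    negatives = [k for k in range(len(scores)) if k not in positives]

    loss = 0.0
    for p in positives:
        for n in negatives:
            # Penalize a negative that scores within `margin` of a positive.
            loss += max(0.0, margin + scores[n] - scores[p])
    return loss


# Toy example: 4 predicates, predicate 0 is the annotated one.
scores = [2.0, 0.5, 1.8, -1.0]
loss = ranking_loss_all_negatives(scores, positive_ids=[0])
```

In this toy case only predicate 2 violates the margin, so it is the only negative that contributes to the loss.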

yangshao commented 6 years ago

@GriffinLiang If you use the pre-trained model, some of its labels will not occur in the target object vocabulary. How do you handle this?

GriffinLiang commented 6 years ago

Discard the last layer of the pre-trained model, initialize it randomly, and fine-tune the whole model on the target dataset.
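The step above can be sketched in a framework-agnostic way, using a plain dict in place of a real checkpoint. Everything here is hypothetical: the parameter names (`backbone.weight`, `cls_score.`), the helper `reinit_classifier`, the tiny feature size, and the initialization scale. The class counts assume 20 Pascal VOC classes and 100 VRD object categories, each plus one background class.

```python
import random


def reinit_classifier(pretrained, head_prefix, num_classes, feat_dim, seed=0):
    """Keep all pre-trained parameters except the classification head,
    which is replaced by a randomly initialized one sized for the new vocabulary."""
    rng = random.Random(seed)
    # Copy every parameter that does not belong to the old head.
    params = {k: v for k, v in pretrained.items() if not k.startswith(head_prefix)}
    # Fresh head: small Gaussian weights, zero biases (assumed init scheme).
    params[head_prefix + "weight"] = [
        [rng.gauss(0.0, 0.01) for _ in range(feat_dim)] for _ in range(num_classes)
    ]
    params[head_prefix + "bias"] = [0.0] * num_classes
    return params


# Toy "checkpoint" with a 21-way head (assumed: 20 classes + background).
pretrained = {
    "backbone.weight": [[0.1, 0.2], [0.3, 0.4]],
    "cls_score.weight": [[0.0, 0.0] for _ in range(21)],
    "cls_score.bias": [0.0] * 21,
}

# New 101-way head (assumed: 100 target object categories + background).
new_params = reinit_classifier(pretrained, "cls_score.", num_classes=101, feat_dim=2)
```

After this step, all parameters, including the retained backbone, would be updated during fine-tuning on the target dataset, as the reply describes.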

yangshao commented 6 years ago

@GriffinLiang I'm still confused about the data split for the relation detection task. During the training phase, are the gold bounding boxes and gold labels used? I ask because the training data is used for both object detection and relation detection.

GriffinLiang commented 6 years ago

Yes. The ground-truth boxes and labels are used for both object detection and relation detection.