fengyang0317 / unsupervised_captioning

Code for Unsupervised Image Captioning
MIT License
215 stars 51 forks

Mismatch in object_detection model #12

Closed anilkagak2 closed 5 years ago

anilkagak2 commented 5 years ago

The object detection model you are using to detect the potential objects in the image seems to be incorrect.

Your README points to the faster_rcnn_inception_resnet_v2_atrous_oidv2 detector at http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_oid_2018_01_28.tar.gz

The label map used by that detector is https://github.com/tensorflow/models/blob/r1.13.0/research/object_detection/data/oid_bbox_trainable_label_map.pbtxt, which has fewer than 600 labels. However, your input data (train.tfrec and the rest of the records) contain more labels than that. Any idea why there is a discrepancy?
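For anyone who wants to reproduce the check, here is a minimal sketch of how the label count can be compared against the class ids seen in the records. It assumes the usual `item { name: ... id: ... display_name: ... }` layout of the object_detection label map files. The helper names (`parse_label_map`, `ids_missing_from_label_map`) and the sample entries are hypothetical and only for illustration. Extracting the class ids from the tfrec files is left out, since the feature keys depend on how the records were written.

```python
import re

# Matches one "item { ... }" block of a label map .pbtxt file.
ITEM_RE = re.compile(r"item\s*\{(.*?)\}", re.DOTALL)
# Matches `key: "string"` or `key: 123` inside an item block.
FIELD_RE = re.compile(r'(\w+)\s*:\s*(?:"([^"]*)"|(\d+))')


def parse_label_map(text):
    """Return {id: label} for every item block found in the pbtxt text."""
    label_map = {}
    for body in ITEM_RE.findall(text):
        fields = {}
        for key, string_value, int_value in FIELD_RE.findall(body):
            fields[key] = string_value if int_value == "" else int(int_value)
        if "id" in fields:
            # Prefer the human-readable name when it is present.
            label_map[fields["id"]] = fields.get(
                "display_name", fields.get("name", "")
            )
    return label_map


def ids_missing_from_label_map(class_ids, label_map):
    """Return the sorted class ids that have no entry in the label map."""
    return sorted(set(class_ids) - set(label_map))


# Illustrative entries only, NOT copied from the real label map file.
SAMPLE = """
item {
  name: "/m/placeholder_a"
  id: 1
  display_name: "ExampleClassA"
}
item {
  name: "/m/placeholder_b"
  id: 2
  display_name: "ExampleClassB"
}
"""

if __name__ == "__main__":
    label_map = parse_label_map(SAMPLE)
    print(len(label_map), "labels in the label map")
    # Hypothetical class ids, standing in for ids read from the records.
    print(ids_missing_from_label_map([1, 2, 2, 600], label_map))
```

Running the same two helpers on the downloaded oid_bbox_trainable_label_map.pbtxt and on the ids read from train.tfrec would list exactly which ids fall outside the label map.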