Gitsamshi / WeakVRD-Captioning

Implementation of the paper "Improving Image Captioning with Better Use of Captions"

Question about the Data used #2

Closed tjuwyh closed 4 years ago

tjuwyh commented 4 years ago

Hi, thanks for your great work! I have a few questions.

  1. How did you get the object labels for the detected regions from the Bottom-Up model? They don't seem to be included in the official repo.
  2. How did you implement the weakly supervised multi-instance learning described in the paper? I couldn't figure out where the corresponding computation happens in this code. By the way, I'm looking forward to the release of the data, and I'd like to follow up on this work. Thanks a lot!
Gitsamshi commented 4 years ago

Thank you for asking. Please refer to the Google Drive link for the data and the trained model.

  1. The object labels come from SGAE's coco_img_sg folder; see the data/prepro_predicates file for more details.
  2. The module was adapted from neural-motif and is hard to merge into this repo because of the different running environment. I am planning to set up a separate repo for it; for now, you can use data/coco_cmb_vrg, which is the output of that module.
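For anyone else reading those files, here is a minimal sketch of pulling the object labels out of one coco_img_sg file, assuming SGAE's format of np.save'd dicts with an 'obj_attr' array (the key name and the example path are assumptions; data/prepro_predicates shows the fields this repo actually uses):

```python
import numpy as np

def load_object_labels(sg_path):
    """Read one per-image scene graph saved by SGAE with np.save."""
    sg = np.load(sg_path, allow_pickle=True, encoding='latin1').item()
    # 'obj_attr' is assumed to be an array with one row per detected
    # region; one of its columns holds the object-class index.
    return sg['obj_attr']

labels = load_object_labels('data/coco_img_sg/391895.npy')  # hypothetical image id
print(labels.shape)
```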

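Since the weakly supervised module itself is not in this repo yet, here is a minimal sketch of one standard multi-instance learning objective, noisy-OR pooling over candidate region pairs with a bag-level cross-entropy, in the spirit of what the paper describes. The shapes, names, and pooling choice are illustrative assumptions, not the authors' module:

```python
import torch
import torch.nn.functional as F

def mil_noisy_or_loss(pair_logits, bag_labels):
    """Bag-level BCE with noisy-OR pooling over candidate region pairs.

    pair_logits: (num_pairs, num_predicates) scores for every candidate
                 region pair in one image (the bag of instances).
    bag_labels:  (num_predicates,) binary image-level labels, e.g. which
                 predicates were mined from the ground-truth captions.
    """
    p = torch.sigmoid(pair_logits)               # per-instance probabilities
    # noisy-OR: the bag is positive if at least one pair expresses the
    # predicate, so P(bag) = 1 - prod over pairs of (1 - p).
    bag_prob = 1.0 - torch.prod(1.0 - p, dim=0)  # (num_predicates,)
    return F.binary_cross_entropy(bag_prob.clamp(1e-6, 1.0 - 1e-6), bag_labels)

# toy usage: 12 candidate pairs, 50 predicate classes
logits = torch.randn(12, 50, requires_grad=True)
labels = torch.zeros(50)
labels[3] = 1.0
loss = mil_noisy_or_loss(logits, labels)
loss.backward()
```

Max pooling over the pair dimension is a common alternative to noisy-OR when only the single strongest instance should carry the bag label.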
tjuwyh commented 4 years ago

Thanks for your reply!