YiwuZhong / Sub-GC

[ECCV 2020] Official code for "Comprehensive Image Captioning via Scene Graph Decomposition"
MIT License

About preprocess code #2

Open chengyj97 opened 3 years ago

chengyj97 commented 3 years ago

Is there any preprocessing code (such as sub-graph sampling or ground-truth sub-graph extraction) that you can share with us? Thanks a lot!!

1216143369 commented 3 years ago

Can you share the code of sub-graph sampling or sub-graph extraction? Thanks a lot!

YiwuZhong commented 3 years ago

Thanks for your interest in our work!

We'll try to get the preprocessing code released after CVPR.

JingyuLi-code commented 3 years ago

Thanks for your great work! In Table 3 of your paper, Sub-GC-oracle is an upper bound obtained by assuming an oracle ranking function, i.e., always selecting the maximum-scored sentence for each metric. How do you implement this in your code? Thanks a lot!

YiwuZhong commented 3 years ago

Please refer to the section "Top-1 Accuracy Evaluation" in the README, that is, "set --only_sent_eval to 1 and add --orcle_num 1000 in test.sh, and rerun the bash file".
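
For readers who want the idea behind the oracle upper bound in code form, here is a minimal sketch. It is not taken from the repository. It assumes each image already has one score per candidate sentence for each metric, and it averages the best score per image. The function name `oracle_upper_bound`, the input layout, and the `max_candidates` cap are all hypothetical. Whether the cap corresponds to the flag quoted above is an assumption. Per-image averaging is also a simplification, since some captioning metrics are computed at corpus level.

```python
from statistics import mean


def oracle_upper_bound(per_image_scores, max_candidates=None):
    """Average, over images, of the best candidate score for each metric.

    per_image_scores: list with one dict per image, mapping a metric name
        to the list of scores of that image's candidate sentences.
    max_candidates: hypothetical cap on how many candidates are considered
        per image (None means all of them).
    """
    metrics = list(per_image_scores[0].keys())
    result = {}
    for metric in metrics:
        best_per_image = []
        for image_scores in per_image_scores:
            candidates = image_scores[metric][:max_candidates]
            # Oracle ranking: always pick the highest-scoring sentence.
            best_per_image.append(max(candidates))
        result[metric] = mean(best_per_image)
    return result


# Toy scores for two images (made-up numbers, for illustration only).
toy_scores = [
    {"CIDEr": [0.5, 1.0, 0.25], "BLEU4": [0.25, 0.125, 0.5]},
    {"CIDEr": [1.5, 0.5], "BLEU4": [0.25, 0.75]},
]
oracle = oracle_upper_bound(toy_scores)
top1 = oracle_upper_bound(toy_scores, max_candidates=1)
```

Because the maximum is taken separately for each metric, the sentence selected for one metric may differ from the one selected for another, which is why the result is an upper bound rather than the score of a single output caption.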

JingyuLi-code commented 3 years ago

Yes, the upper bound is promising!!! Can you share the code for sub-graph sampling or sub-graph extraction now? Thanks a lot!

Zora-zjj commented 3 years ago

Hi, thanks for sharing. I ran into a problem when generating captions with this code:

Traceback:
  "../dataLoaders/dataloadertest.py", line 198, in get_batch
    data['att_masks'] = tmp_att_mask.view(-1, 2, tmp_this_minibatch, self.obj_num)
RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 2, 0, 37] because the unspecified dimension size -1 can be any value and is ambiguous

This error is raised every time the caption for image No. 100432 is generated. Looking forward to your reply!
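
The shape in the message, [-1, 2, 0, 37], shows that the third dimension (tmp_this_minibatch in the call) is 0 for this image. Why it is 0 cannot be determined from the traceback alone. The sketch below is a pure-Python illustration of why a -1 dimension cannot be inferred in that case. It is not code from the repository, and `infer_view_shape` is a hypothetical helper that only mimics the size-inference rule.

```python
from math import prod


def infer_view_shape(numel, shape):
    """Resolve a single -1 in `shape` for a tensor holding `numel` elements."""
    if list(shape).count(-1) != 1:
        raise ValueError("expected exactly one -1 in shape")
    # Product of the explicitly given dimensions.
    known = prod(d for d in shape if d != -1)
    if known == 0:
        # 0 * n == 0 for every n, so no single value for -1 can be chosen.
        raise ValueError("the -1 dimension is ambiguous when another dimension is 0")
    if numel % known != 0:
        raise ValueError("shape is incompatible with the number of elements")
    return tuple(numel // known if d == -1 else d for d in shape)


# A non-empty case resolves normally: 296 = 1 * 2 * 4 * 37.
ok_shape = infer_view_shape(296, (-1, 2, 4, 37))

# The failing case from the traceback: 0 elements, third dimension 0.
try:
    infer_view_shape(0, (-1, 2, 0, 37))
    failed = False
except ValueError:
    failed = True
```

One possible workaround, under the assumption that an empty mini-batch is expected for some images, is to check that the size is non-zero before reshaping and to skip or special-case such images.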

SikandarBakht commented 2 years ago

Hi, can we get the preprocessing code for sub-graph sampling? Also, I'd like some insights into getting custom images captioned with this code. I know there is an option for custom image captioning in the evaluation script but it does not have the accompanying code to produce it. Any help would be appreciated.

zaryabmakram commented 2 years ago

Hi @YiwuZhong, I really appreciate your work and that you made the code available. Could you kindly make the preprocessing scripts available as well? It would be great if you could provide some insights on how we could re-train the model on a custom dataset.

AleDella commented 1 year ago

Hi, is the code to produce scene graphs from a set of images available somewhere, in case we want to use your model on datasets other than COCO or Flickr? If the code is not yours, could you explain how you produced the scene graphs?

Thank you very much in advance!

YiwuZhong commented 1 year ago

Hi @AleDella, thanks for your interest in our work.

As mentioned in the Implementation Details section of the paper, we first used the Bottom-up object detector to detect objects in images and to extract region features. Using these region features as inputs, Motif-Net was trained to generate scene graphs from images. This is the model checkpoint of Motif-Net I trained and used to generate scene graphs. As a reference for saving scene graphs to local files, you might be able to use my script to replace the original file in the Motif-Net codebase, with some adaptation as needed.

PS: There is another codebase for the Bottom-up object detector that extracts region features (bottom-up-attention.pytorch). This is my work on scene graph generation with image captions as the only supervision (SGG_from_NLS).