X-PLUG / mPLUG-2

mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)
Apache License 2.0

How to perform CIDEr optimization? #4

Status: Closed (closed by KaiGod0730 1 year ago)

KaiGod0730 commented 1 year ago

Thank you for your excellent work! The paper mentions that CIDEr optimization was performed for 5 extra epochs in the video captioning task. How can I run CIDEr optimization using your code?

MAGAer13 commented 1 year ago

Please refer to the SCST captioning script in the mPLUG repo: https://github.com/X-PLUG/mPLUG/blob/main/caption_mplug_scst.py
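
For readers unfamiliar with the term: CIDEr optimization is commonly done with self-critical sequence training (SCST), which the `scst` in the linked script's name suggests. The sketch below is not taken from the mPLUG code; it only illustrates the usual SCST loss, assuming the per-token log-probabilities of a sampled caption and the CIDEr rewards of the sampled and greedy captions are already computed. The function name `scst_loss` and its arguments are hypothetical.

```python
def scst_loss(sample_logprobs, sample_rewards, greedy_rewards):
    """Self-critical policy-gradient loss, averaged over a batch.

    sample_logprobs: list of lists; per-token log-probabilities of each
                     sampled caption (hypothetical input format).
    sample_rewards:  list of floats; reward (e.g. CIDEr) of each sampled caption.
    greedy_rewards:  list of floats; reward of the greedy-decoded caption for
                     the same input, used as the baseline.
    """
    if not (len(sample_logprobs) == len(sample_rewards) == len(greedy_rewards)):
        raise ValueError("all inputs must have the same batch size")
    if not sample_logprobs:
        raise ValueError("batch must not be empty")

    total = 0.0
    for logps, r_sample, r_greedy in zip(sample_logprobs, sample_rewards, greedy_rewards):
        # Advantage: how much better the sampled caption scored than the greedy one.
        advantage = r_sample - r_greedy
        # REINFORCE term: minimizing this raises the probability of captions
        # that beat the baseline and lowers it for captions that do not.
        total += -advantage * sum(logps)
    return total / len(sample_logprobs)


if __name__ == "__main__":
    loss = scst_loss(
        sample_logprobs=[[-0.5, -0.5], [-1.0]],
        sample_rewards=[1.0, 0.2],
        greedy_rewards=[0.5, 0.6],
    )
    print(loss)
```

In an actual training loop the log-probabilities would be framework tensors so that the loss can be backpropagated, and the rewards would come from a CIDEr scorer run on the decoded captions; the linked script is the reference for how mPLUG does this.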

KaiGod0730 commented 1 year ago

In the mPLUG-2 repo, at line 105 of model_video_caption.py, I found that the 'MPLUG2' object has no attribute '_flatten_time'. How should '_flatten_time' be used? @MAGAer13

MAGAer13 commented 1 year ago

Hi, that is legacy code in the model. We do not use self.beam_search in the model.