alibaba / AliceMind

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab
Apache License 2.0
1.98k stars, 291 forks

Fine-tuning video captions #76

Open dvirla opened 1 year ago

dvirla commented 1 year ago

Thanks for your great work! I'm trying to modify the image training code for video-captioning fine-tuning, but some things are not quite clear to me, such as how to use the "answer" parameter in the MPLUG model. Could you please release a training framework for this task?

I'm using the vatex_video_caps_dataset class to load my dataset.

Thanks!

dvirla commented 1 year ago

I think I've figured it out: I modified the dataset and the training call to pass the ground-truth caption as the "answer". Is that the right way? If so, I can create a pull request to add this.
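
For anyone reading along, here is a rough sketch of the idea described above, not the actual AliceMind code. It assumes the dataset yields (video, caption) pairs and that the model accepts the captions through an `answer` keyword. The names `CaptionSample`, `VideoCaptionDataset`, `collate`, and `train_step` are hypothetical and exist only for illustration; the real `vatex_video_caps_dataset` class and MPLUG forward signature may differ.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class CaptionSample:
    video_path: str  # hypothetical field: location of the video clip
    caption: str     # ground-truth caption for that clip


class VideoCaptionDataset:
    """Hypothetical stand-in for a video-caption dataset.

    Each item is a (video, caption) pair, so the training loop can use
    the caption as the supervision target.
    """

    def __init__(self, samples: Sequence[CaptionSample],
                 load_video: Callable[[str], object]):
        self.samples = list(samples)
        self.load_video = load_video  # assumed user-supplied frame loader

    def __len__(self) -> int:
        return len(self.samples)

    def __getitem__(self, idx: int) -> Tuple[object, str]:
        sample = self.samples[idx]
        return self.load_video(sample.video_path), sample.caption


def collate(batch: Sequence[Tuple[object, str]]) -> Tuple[List[object], List[str]]:
    """Group a list of (video, caption) pairs into two parallel lists."""
    videos, captions = zip(*batch)
    return list(videos), list(captions)


def train_step(model: Callable, batch: Tuple[List[object], List[str]]):
    """Run one forward pass, passing the real captions as `answer`.

    `model` is any callable taking `(videos, answer=...)`; treating the
    MPLUG model this way is an assumption made for this sketch.
    """
    videos, captions = batch
    return model(videos, answer=captions)
```

In a real setup the captions would presumably be tokenized before being handed to the model, and the returned value would be a loss to backpropagate; both steps are left out here because they depend on the repository's own tokenizer and training loop.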