microsoft / GenerativeImage2Text

GIT: A Generative Image-to-text Transformer for Vision and Language
MIT License

fine tune git #37

Closed nazar-karpov closed 1 year ago

nazar-karpov commented 1 year ago

Can you give me some advice on fine-tuning the GIT model on my own dataset, and on whether fine-tuning makes sense at all for a video captioning task?

amsword commented 1 year ago

You can start from the following example and wrap it with a trainer, e.g. DeepSpeed or PyTorch DDP. Please also feel free to post any specific issues you run into.

https://github.com/microsoft/GenerativeImage2Text#Training
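
For readers who want a starting point, here is a minimal sketch of what "wrapping it with PyTorch DDP" could look like. It is not code from this repository: `ToyCaptioner`, the `features`/`target` batch keys, and the random data are hypothetical stand-ins, and the assumption that the model returns a dict of loss tensors that are summed before `backward()` should be checked against the linked training example. The loop only switches to `DistributedDataParallel` and `DistributedSampler` when a process group has already been initialized, so it also runs as a plain single-process script.

```python
import torch
import torch.distributed as dist
from torch import nn
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler


class ToyCaptioner(nn.Module):
    """Hypothetical stand-in for the real model; returns a dict of losses."""

    def __init__(self, dim=8):
        super().__init__()
        self.proj = nn.Linear(dim, 1)

    def forward(self, batch):
        pred = self.proj(batch["features"]).squeeze(-1)
        return {"loss": nn.functional.mse_loss(pred, batch["target"])}


def train(model, dataset, epochs=5, batch_size=16, lr=0.05):
    # Use DDP only if a process group was set up by the launcher script.
    distributed = dist.is_available() and dist.is_initialized()
    if distributed:
        model = DDP(model)  # gradients are averaged across ranks in backward()
    # Each rank sees a disjoint shard of the dataset when running distributed.
    sampler = DistributedSampler(dataset) if distributed else None
    loader = DataLoader(dataset, batch_size=batch_size,
                        sampler=sampler, shuffle=sampler is None)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)

    history = []  # mean loss per epoch
    for epoch in range(epochs):
        if sampler is not None:
            sampler.set_epoch(epoch)  # reshuffle shards differently each epoch
        total, steps = 0.0, 0
        for features, target in loader:
            loss_dict = model({"features": features, "target": target})
            loss = sum(loss_dict.values())  # assumed: losses arrive as a dict
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            total += loss.item()
            steps += 1
        history.append(total / steps)
    return history


# Synthetic, noise-free regression data in place of (frames, caption) pairs.
torch.manual_seed(0)
features = torch.randn(64, 8)
target = features @ torch.randn(8)
history = train(ToyCaptioner(), TensorDataset(features, target))
print(history)
```

To run this on several GPUs, one would typically call `dist.init_process_group(...)` at the top of the script, move the model and each batch to the local device, and launch with `torchrun`. Swapping in the real model and a dataset of video frames with captions is left to the reader.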