Luodian / Otter

🦦 Otter, a multi-modal model based on OpenFlamingo (an open-source reimplementation of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
https://otter-ntu.github.io/
MIT License

Recommended Batch Size for Fine-tuning #269

Closed da-yama82 closed 11 months ago

da-yama82 commented 11 months ago

Hi, thanks for your great work!

I have a question about the batch size used during fine-tuning.

I'm planning to fine-tune "OTTER-Image-MPT7B" on my custom in-context learning dataset. How much does the batch size affect accuracy, and could you recommend an appropriate value? The dataset consists of 20,000 sets, each containing 2 context images and 1 query image.
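One point worth keeping in mind when choosing a batch size: with multi-GPU training and gradient accumulation (both common for 7B-scale models), the quantity that actually influences optimization is the effective (global) batch size, not the per-device one. The sketch below is a generic illustration of that relationship, not Otter's actual training code; the function name and parameters are hypothetical.

```python
# Hypothetical helper (not part of the Otter codebase): the effective
# batch size seen by the optimizer is the per-device batch size times
# the number of gradient-accumulation steps times the number of GPUs.

def effective_batch_size(per_device: int, grad_accum: int, num_gpus: int) -> int:
    """Global batch size per optimizer step."""
    return per_device * grad_accum * num_gpus

# Example: batch size 4 per GPU, 4 accumulation steps, 2 GPUs
print(effective_batch_size(per_device=4, grad_accum=4, num_gpus=2))  # → 32
```

If a recommended batch size doesn't fit in GPU memory, raising the gradient-accumulation steps while lowering the per-device batch size keeps the effective batch size (and thus the optimization behavior) roughly the same.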