Fine tune vision model with multiple images

microsoft / Phi-3CookBook

This is a Phi-3 book for getting started with Phi-3. Phi-3, a family of open sourced AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and next size up across a variety of language, reasoning, coding, and math benchmarks.

MIT License

2.46k stars 245 forks source link

Fine tune vision model with multiple images #50

Closed pbarker closed 4 months ago

pbarker commented 4 months ago

Hello, is it possible to fine tune the vision model with multiple images?

leestott commented 4 months ago

yes see https://wandb.ai/byyoung3/mlnews3/reports/How-to-fine-tune-Phi-3-Vision-on-a-custom-dataset--Vmlldzo4MTEzMTg3

pbarker commented 4 months ago

Hey @leestott thanks for getting back so quick. I see how to do that with a single image, but not with multiple, are there any examples of how to do that?

leestott commented 4 months ago

Work in progress on samples

leestott commented 4 months ago

@pbarker please see https://github.com/microsoft/Phi-3CookBook/blob/main/md/04.Fine-tuning/FineTuning_Vision.md