Finetuning on specific datasets

NVlabs / EAGLE

EAGLE: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

https://arxiv.org/pdf/2408.15998

Apache License 2.0

543 stars 45 forks source link

Finetuning on specific datasets #10

Open HashmatShadab opened 2 months ago

HashmatShadab commented 2 months ago

Is there an option in the codebase to do the finetuning on only selected datasets mentioned in the readme?

flyinglynx commented 2 months ago

Yes, you can convert your dataset into LLaVA's format and update the data path in the script accordingly.

Essentially, you'll need to transform your annotations into a list of conversation data. For more details, please refer to the example annotation JSON file. If you run into any issues, feel free to reach out to us.

If you dataset is small, please consider using efficient tuning techniques like LoRA.

HashmatShadab commented 2 months ago

Thank you for explaining. I was specifically talking about using specific datasets for finetuning that are mentioned in the readme. So for that i can just update the json file