zjysteven / lmms-finetune

A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, qwen-vl, phi3-v, etc.
Apache License 2.0

bigger model support #28

Open gn64 opened 3 weeks ago

gn64 commented 3 weeks ago

Thank you for your excellent work. I believe llava-1.6 is currently supported only in its 7b/13b variants; do you have any plans to expand this to larger models (such as llava-hf/llava-v1.6-34b-hf, llava-hf/llama3-llava-next-8b-hf, or llava-hf/llava-next-72b-hf)?

zjysteven commented 3 weeks ago

It's definitely possible to support them.

We will include them asap. But again, be aware that for the bigger models (34b, 72b) we might not be able to actually run/test them ourselves.

shamanthak-hegde commented 3 weeks ago

Hi, just a continuation of the question above: is it possible to finetune llava-next-qwen-32b? If so, when can I expect it to be supported in this repo? Or, if you could point me in the direction of the changes that need to be made, I can do it myself. Thanks.

zjysteven commented 3 weeks ago

@shamanthak-hegde Do you mean llava next video qwen 34b? If so, I imagine it's almost the same as the 7b: you can simply add the model identifier in supported_models.py and that should be it. Everything else should already be implemented together with the 7b model.
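
For reference, a rough sketch of what that registration could look like (the actual structure of supported_models.py may differ; the dictionary names and model keys below are just for illustration, and you should double-check the exact HF repo id):

```python
# Rough sketch only -- not the repo's real code. The point is that a larger
# checkpoint can be registered under an existing model family so the loader,
# processor, and collator already implemented for the 7b model are reused.

# model id -> Hugging Face checkpoint path
MODEL_HF_PATH = {
    "llava-next-video-7b": "llava-hf/LLaVA-NeXT-Video-7B-hf",
    # new entry: larger checkpoint (verify the exact HF repo id)
    "llava-next-video-34b": "llava-hf/LLaVA-NeXT-Video-34B-hf",
}

# model id -> model family, so the 34b entry shares the 7b implementation
MODEL_FAMILIES = {
    "llava-next-video-7b": "llava-next-video",
    "llava-next-video-34b": "llava-next-video",
}
```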

shamanthak-hegde commented 3 weeks ago

Got it, that works. Thank you, I appreciate the quick reply.