haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
https://llava.hliu.cc
Apache License 2.0

[Feature request] Support Llama3 #1426

Open · thesby opened 5 months ago

thesby commented 5 months ago

feature

Could you please support Llama 3 in LLaVA?

awzhgw commented 5 months ago

+1

HarryHsing commented 5 months ago

+1

iMountTai commented 5 months ago

+1

GoGoJoestar commented 5 months ago

+1

manbehindthemadness commented 5 months ago

+1

dingtine commented 4 months ago

I have trained LLaVA with the Llama 3 model, but the generated results are not correct.

Namzakku commented 4 months ago

@dingtine Can you give more detail about the results? Also, which base model did you train on?

manbehindthemadness commented 4 months ago

@dingtine There is mention of an abnormality regarding the end/termination (EOS) token here: https://x.com/bartowski1182/status/1782206933719515467?s=46&t=iIhAbXdfE1VCk7vAgMnlRQ. As this came out just now, it might affect your results.
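
For context, the symptom was that Llama-3-Instruct checkpoints would not stop generating, because the chat template ends each turn with `<|eot_id|>` while the initial `generation_config.json` declared only `<|end_of_text|>` as EOS. Below is a minimal sketch of the workaround that circulated at the time, assuming the Hugging Face transformers API; the model ID and prompt are illustrative, and the checkpoint is gated on the Hub:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # illustrative; gated on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Write a one-sentence greeting."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Llama 3 Instruct ends each turn with <|eot_id|>, but early checkpoints only
# declared <|end_of_text|> as EOS, so decoding would run past the turn boundary.
# Passing both ids to generate() works around that.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]
output = model.generate(input_ids, max_new_tokens=256, eos_token_id=terminators)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```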

mmaaz60 commented 4 months ago

Hi @thesby, @awzhgw, @Namzakku, @manbehindthemadness,

I hope you are doing well. We have just released our project LLaVA++: Extending Visual Capabilities with LLaMA-3 and Phi-3, which features LLaMA-3 and Phi-3-Mini based LLaVA models. Please have a look at LLaVA++.

Further, as pointed out by @manbehindthemadness, the generation issues have been fixed by the recent updates to generation_config.json and tokenizer.json at meta-llama/Meta-Llama-3-8B-Instruct.
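
As a quick sanity check, a sketch assuming the transformers `GenerationConfig` API and authenticated access to the gated checkpoint; the expected ids reflect the updated config:

```python
from transformers import AutoTokenizer, GenerationConfig

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # gated; requires Hub access approval
gen_cfg = GenerationConfig.from_pretrained(model_id)

# After the fix, eos_token_id should list both stop tokens:
# 128001 = <|end_of_text|>, 128009 = <|eot_id|>
print(gen_cfg.eos_token_id)

tokenizer = AutoTokenizer.from_pretrained(model_id)
print(tokenizer.convert_tokens_to_ids("<|eot_id|>"))  # expected: 128009
```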

If you face any issues running or training the LLaMA-3 or Phi-3-Mini based LLaVA models, please let me know.

manbehindthemadness commented 4 months ago

Fantastic!