Closed: pedrocolon93 closed this 3 months ago
As a side note, this also needs flash attention installed: pip install flash-attn
As a second side note, the same applies to the -dpo model.
Fixed this by cloning the repo and adding -qwen- to the repo name. Otherwise it loads a different LLaVA architecture, which does not work.
If loading in 4 bit, the line kwargs["load_in_4bit"] = True in builder.py needs to be commented out.
You should change the model path from llava-next-interleave-7b to llava-next-interleave-qwen-7b and try again.
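Putting the fixes from this thread together, a minimal setup sketch might look like the following. The Hugging Face URL is assumed by combining the clone URL from the original post with the corrected -qwen- repo name; the pinned versions come from the comments in this thread:

```shell
# Clone the Qwen-based interleave checkpoint so the correct LLaVA
# architecture is picked up from the path name
git clone https://huggingface.co/lmms-lab/llava-next-interleave-qwen-7b

# Dependencies mentioned in this thread
pip install --upgrade gradio
pip install numpy==1.23.0
pip install flash-attn
```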
Thanks! And double-check the following: if loading in 4 bit, the line kwargs["load_in_4bit"] = True in builder.py needs to be commented out, and the flash-attn dependency needs to be added.
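The 4-bit fix above can be sketched as follows. This is an illustrative approximation, not the actual contents of builder.py: the function name build_load_kwargs and the surrounding kwargs are assumptions for illustration only.

```python
# Hypothetical sketch of the pattern described in the thread: the loader
# builds a kwargs dict that is later passed to from_pretrained(), and the
# workaround is to comment out the unconditional 4-bit flag.
def build_load_kwargs(load_4bit=False):
    kwargs = {"device_map": "auto"}  # assumed placeholder kwargs
    if load_4bit:
        # kwargs["load_in_4bit"] = True  # <- comment this line out, per the thread
        pass
    return kwargs

print(build_load_kwargs(load_4bit=True))
```

With the line commented out, the 4-bit flag is never set, which matches the workaround reported above.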
Hi there! I cloned the tensors here: git clone https://huggingface.co/lmms-lab/llava-next-interleave-7b and did the setup as in the README (which needs a gradio upgrade, pip install --upgrade gradio, and numpy==1.23.0). When I run inference in gradio with the examples, I get garbage output. Is there anything I am missing?