Great work!!! ... I just ran both Colab files. They run fine, but the models are not downloading and are not showing up in the Gradio drop-down either.
Thanks ❤ @AmmarFahmy please watch this https://www.youtube.com/watch?v=o7zQAa0NPds
Yes, great. It worked fine.
Thank you! Wondering if you would like to add 13B with 4-bit as well? We made it work with a T4 and 15 GB of CPU RAM on an HF Space, so theoretically it should work on Colab as well? 13B 4-bit seems to be both faster and more accurate than 7B 8-bit. (I will get some numbers later.)
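For reference, a minimal sketch of the 4-bit path through LLaVA's builder; `load_4bit` is part of `load_pretrained_model`'s signature (it also appears in the 7B snippet further down), and bitsandbytes is assumed to be installed:

```python
# Sketch: loading the 13B checkpoint in 4-bit via LLaVA's builder.
# Assumes the llava package and bitsandbytes are installed.
from llava.model.builder import load_pretrained_model

model_path = "liuhaotian/llava-v1.5-13b"
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path=model_path,
    model_base=None,
    model_name=model_path.split("/")[-1],
    load_4bit=True,  # quantize weights to 4-bit at load time
)
```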
Hi @haotian-liu 👋 The free Colab T4 has only 12.7 GB of CPU RAM 😭 so I can only fit a llava-v1.5-7b model resharded with `max_shard_size='5GB'`:
```python
from llava.model.builder import load_pretrained_model

model_path = "liuhaotian/llava-v1.5-7b"

# Load the full-precision checkpoint once...
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path=model_path,
    model_base=None,
    model_name=model_path.split("/")[-1],
    load_8bit=False,
    load_4bit=False,
)

# ...then re-save it in 5 GB shards so no single file blows past Colab's CPU RAM.
out_folder = "/content/model8"
model.save_pretrained(out_folder, max_shard_size='5GB', safe_serialization=False)
tokenizer.save_pretrained(out_folder)
```
https://huggingface.co/4bit/llava-v1.5-7b-5GB/tree/main
I will try the same thing with liuhaotian/llava-v1.5-13b
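For completeness, a hedged sketch of loading the resharded checkpoint published above back onto a free T4; the 8-bit flag here is an assumption about the intended usage, not something stated in this thread:

```python
# Sketch: loading the resharded 7B checkpoint from the repo linked above.
from llava.model.builder import load_pretrained_model

model_path = "4bit/llava-v1.5-7b-5GB"
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path=model_path,
    model_base=None,
    model_name=model_path.split("/")[-1],
    load_8bit=True,  # assumption: 8-bit so the 7B model fits the free T4
)
```

The point of the smaller shards is that each file is loaded one at a time, which keeps peak CPU RAM during loading under Colab's 12.7 GB limit.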
@camenduru Nah, it is possible. Not sure what is different, but I created a new one with 3 GB shards here: liuhaotian/llava-v1.5-13b-shard3gb.
Also, here is a minimal working example you can try: https://colab.research.google.com/drive/1aJBcR7aIV2i9EnKE5EA--XvwzIGiFugg?usp=sharing
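The resharding step behind that 3 GB-shard checkpoint is presumably the same recipe as the 7B snippet above; a sketch, assuming a machine with enough CPU RAM to hold the full-precision 13B weights once (the output folder name is illustrative):

```python
# Sketch: resharding the 13B checkpoint into 3 GB shards, mirroring the
# 7B snippet above. Requires enough CPU RAM for the full-precision model.
from llava.model.builder import load_pretrained_model

model_path = "liuhaotian/llava-v1.5-13b"
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path=model_path,
    model_base=None,
    model_name=model_path.split("/")[-1],
)
out_folder = "/content/llava-v1.5-13b-shard3gb"  # illustrative output path
model.save_pretrained(out_folder, max_shard_size="3GB", safe_serialization=False)
tokenizer.save_pretrained(out_folder)
```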
🥳 wow, cool 🔥 maybe it's the transformers version, idk 😋 I will add it to the repo, thanks ❤
@camenduru Thanks! Also featured the Colab link in our README :)
thanks ❤
Hi all! Thank you for doing such amazing things!
Thanks for the project ❤️ I made a Colab. 🥳 I hope you like it: https://github.com/camenduru/LLaVA-colab