haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
https://llava.hliu.cc
Apache License 2.0

🦒 colab #501

Closed by camenduru 11 months ago

camenduru commented 11 months ago

Thanks for the project ❤️ I made a colab. 🥳 I hope you like it. https://github.com/camenduru/LLaVA-colab

AmmarFahmy commented 11 months ago

Great work!!! I just ran both Colab files and they work fine, but the models are not downloading and are not showing up in the Gradio drop-down either.

camenduru commented 11 months ago

Thanks ❤ @AmmarFahmy please watch this https://www.youtube.com/watch?v=o7zQAa0NPds

AmmarFahmy commented 11 months ago

> Thanks ❤ @AmmarFahmy please watch this https://www.youtube.com/watch?v=o7zQAa0NPds

Yes great .. worked fine.

haotian-liu commented 11 months ago

Thank you! Wondering if you would like to add 13B with 4-bit as well? We made it work on a T4 with 15 GB CPU RAM on the HF Space, so in theory it should work on Colab too. 13B-4bit seems to be both faster and more accurate than 7B-8bit (will get some numbers later).

https://huggingface.co/spaces/badayvedat/LLaVA
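
For reference, the 4-bit path is a one-flag change in the repo's builder: `load_4bit=True` makes `load_pretrained_model` pass a 4-bit `BitsAndBytesConfig` to `from_pretrained`, so `bitsandbytes` must be installed. A minimal sketch:

```python
# Sketch: load the 13B checkpoint in 4-bit so it fits a 16 GB T4.
# Requires bitsandbytes; quantization happens at load time.
from llava.model.builder import load_pretrained_model

model_path = "liuhaotian/llava-v1.5-13b"
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path=model_path,
    model_base=None,
    model_name=model_path.split("/")[-1],
    load_8bit=False,
    load_4bit=True,  # quantize to 4-bit while loading
)
```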

camenduru commented 11 months ago

Hi @haotian-liu 👋 Colab Free's T4 instance has only 12.7 GB of CPU RAM 😭 so I can only fit llava-v1.5-7b after re-sharding it with max_shard_size='5GB':

```python
from llava.model.builder import load_pretrained_model

# Load the full-precision 7B checkpoint once...
model_path = "liuhaotian/llava-v1.5-7b"
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path=model_path,
    model_base=None,
    model_name=model_path.split("/")[-1],
    load_8bit=False,
    load_4bit=False,
)

# ...then re-save it in shards of at most 5 GB, so the largest
# single file fits within Colab's free-tier CPU RAM during loading.
out_folder = "/content/model8"
model.save_pretrained(out_folder, max_shard_size="5GB", safe_serialization=False)
tokenizer.save_pretrained(out_folder)
```

https://huggingface.co/4bit/llava-v1.5-7b-5GB/tree/main
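
A quick way to sanity-check the result is to list the shard sizes straight from the Hub; a small sketch using `huggingface_hub` (repo id as above):

```python
# Sketch: confirm no shard in the re-sharded repo exceeds the 5 GB cap.
from huggingface_hub import HfApi

info = HfApi().model_info("4bit/llava-v1.5-7b-5GB", files_metadata=True)
for f in info.siblings:
    if f.rfilename.startswith("pytorch_model"):  # .bin shards (safe_serialization=False)
        print(f"{f.rfilename}: {f.size / 1e9:.2f} GB")
```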

I will try the same thing with liuhaotian/llava-v1.5-13b

camenduru commented 11 months ago

Maybe not possible 😐

https://huggingface.co/4bit/llava-v1.5-13b-8bit/tree/main https://huggingface.co/4bit/llava-v1.5-13b-5GB/tree/main

*(screenshot, 2023-10-14)*

haotian-liu commented 11 months ago

@camenduru

Nah, it is possible. Not sure what was different, but I created a new one with 3 GB shards here: liuhaotian/llava-v1.5-13b-shard3gb.

Also, here is a minimal working example you can try: https://colab.research.google.com/drive/1aJBcR7aIV2i9EnKE5EA--XvwzIGiFugg?usp=sharing

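For anyone who can't open the notebook, here is a rough sketch of what such a minimal example looks like with the public LLaVA APIs; the conversation template, demo image URL, and output slicing follow the patterns in `llava/eval/run_llava.py`, but treat the details as assumptions since they shift slightly between repo versions:

```python
# Sketch: 13B with <=3 GB shards, loaded in 4-bit to stay inside a T4's VRAM
# and Colab's free-tier CPU RAM, then a single round of image Q&A.
from io import BytesIO

import requests
import torch
from PIL import Image

from llava.constants import DEFAULT_IMAGE_TOKEN, IMAGE_TOKEN_INDEX
from llava.conversation import conv_templates
from llava.mm_utils import tokenizer_image_token
from llava.model.builder import load_pretrained_model

model_path = "liuhaotian/llava-v1.5-13b-shard3gb"
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path=model_path,
    model_base=None,
    model_name="llava-v1.5-13b",  # must contain "llava" so the builder picks the LLaVA class
    load_8bit=False,
    load_4bit=True,
)

# Build a v1-style prompt with the image placeholder token.
conv = conv_templates["llava_v1"].copy()
conv.append_message(conv.roles[0], DEFAULT_IMAGE_TOKEN + "\nWhat is shown in this image?")
conv.append_message(conv.roles[1], None)
prompt = conv.get_prompt()

# Any test image works; this URL is the demo image from the README.
resp = requests.get("https://llava-vl.github.io/static/images/view.jpg")
image = Image.open(BytesIO(resp.content)).convert("RGB")
image_tensor = image_processor.preprocess(image, return_tensors="pt")["pixel_values"].half().to(model.device)

# Tokenize the prompt, replacing the placeholder with IMAGE_TOKEN_INDEX.
input_ids = tokenizer_image_token(prompt, tokenizer, IMAGE_TOKEN_INDEX, return_tensors="pt").unsqueeze(0).to(model.device)

with torch.inference_mode():
    output_ids = model.generate(
        input_ids, images=image_tensor, do_sample=False, max_new_tokens=256, use_cache=True
    )

# Slice off the echoed prompt before decoding, as run_llava did at the time.
print(tokenizer.batch_decode(output_ids[:, input_ids.shape[1]:], skip_special_tokens=True)[0].strip())
```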

camenduru commented 11 months ago

🥳 wow, cool 🔥 maybe it's the transformers version, idk 😋 I will add it to the repo, thanks ❤

haotian-liu commented 11 months ago

@camenduru Thanks! Also featured the Colab link in our README :)

camenduru commented 11 months ago

thanks ❤

nickkolok commented 4 months ago

Hi all! Thank you for doing such amazing things!

  1. Any chance of getting LLaVA 1.6 working on Colab?
  2. Would it be possible to add an "Undo" button, like the one at https://huggingface.co/spaces/merve/llava-next? That would be really helpful!
  3. How about context extension? Is that really possible?