haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
https://llava.hliu.cc
Apache License 2.0
19.12k stars 2.1k forks

[Usage] Unable to load LLaVA v1.6 models #1039

Open · levi opened this issue 7 months ago

levi commented 7 months ago

Describe the issue

Issue:

When trying to load liuhaotian/llava-v1.6-mistral-7b or liuhaotian/llava-v1.6-34b into my container:

MODEL_PATH = "liuhaotian/llava-v1.6-mistral-7b"
USE_8BIT = False
USE_4BIT = False
DEVICE = "cuda"

def download_llava_model():
    from llava.model.builder import load_pretrained_model
    from llava.mm_utils import get_model_name_from_path

    model_name = get_model_name_from_path(MODEL_PATH)
    load_pretrained_model(
        MODEL_PATH, None, model_name, USE_8BIT, USE_4BIT, device=DEVICE
    )

Seeing this error:

  File "/scripts/llava.py", line 23, in download_llava_model
    load_pretrained_model(
  File "/root/llava/llava/model/builder.py", line 151, in load_pretrained_model
    vision_tower.to(device=device, dtype=torch.float16)
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1145, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  [Previous line repeated 4 more times]
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 820, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1143, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NotImplementedError: Cannot copy out of meta tensor; no data!

Gutianpei commented 7 months ago

Same error. I think they have not updated the code for v1.6.

levi commented 7 months ago

For a brief moment I got it to build, but same error after reproing.

Gutianpei commented 7 months ago

For a brief moment I got it to build, but same error after reproing.

I got it working by updating pytorch/transformers to the latest versions and following issue #1036

levi commented 7 months ago

Tried various pytorch versions, no luck. I'm installing from an empty container image and pip installing the repo as described in the README.

ninatu commented 7 months ago

I have the same problem :(

levi commented 7 months ago

Bumping VRAM to 80GB resolved the issue for me. Possibly an OOM error in disguise?
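
If you want to rule out memory pressure before switching to a bigger instance, a quick check like this (just a sketch, assuming a CUDA build of PyTorch) prints what the card actually has free:

import torch

# Free vs. total memory on the current CUDA device, printed in GB.
free_bytes, total_bytes = torch.cuda.mem_get_info()
print(f"free: {free_bytes / 1e9:.1f} GB / total: {total_bytes / 1e9:.1f} GB")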

ninatu commented 7 months ago

@levi thanks! That helped!

Pirog17000 commented 7 months ago
  File "A:\Utilities\DescriptingLLaVa\LLaVA\BatchCaptionFolder.py", line 35, in <module>
    tokenizer, model, image_processor, context_len = load_pretrained_model(
  File "A:\Utilities\DescriptingLLaVa\LLaVA\llava\model\builder.py", line 108, in load_pretrained_model
    model = LlavaMistralForCausalLM.from_pretrained(
NameError: name 'LlavaMistralForCausalLM' is not defined

I have this error too, also launching the 1.6 mistral-7b, and have had no luck making it work. It happens during model load and initialization here:

from llava.model.builder import load_pretrained_model
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path, model_base, model_name, load_8bit=False, load_4bit=False, device_map='cuda:0', device='cuda:0')

Pirog17000 commented 7 months ago

Noticed a typo? [screenshot]

Pirog17000 commented 7 months ago

If I comment out the broken references in builder.py and replace them with direct imports as follows:

#from llava.model import *
from llava.model.language_model.llava_llama import LlavaLlamaForCausalLM
from llava.model.language_model.llava_mpt import LlavaMptForCausalLM as LlavaMPTForCausalLM
from llava.model.language_model.llava_mistral import LlavaMistralForCausalLM

then I got another issue:


  File "A:\Utilities\DescriptingLLaVa\LLaVA\llava\model\builder.py", line 23, in <module>
    from llava.model.language_model.llava_llama import LlavaLlamaForCausalLM
  File "A:\Utilities\DescriptingLLaVa\LLaVA\llava\model\language_model\llava_llama.py", line 21, in <module>
    from transformers import AutoConfig, AutoModelForCausalLM, \
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "C:\Users\Mike\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\utils\import_utils.py", line 1373, in __getattr__
    value = getattr(module, name)
  File "C:\Users\Mike\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\utils\import_utils.py", line 1372, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "C:\Users\Mike\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\utils\import_utils.py", line 1384, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its traceback):
Failed to import transformers.integrations.peft because of the following error (look up to see its traceback):
DLL load failed while importing libtriton: The specified module could not be found.

rossgreer commented 7 months ago

Bumping VRAM to 80GB resolved the issue for me. Possibly an OOM error in disguise?

How does one "bump VRAM to 80GB"?

haotian-liu commented 7 months ago

You can run inference with 4-bit quantization, which fits even the largest 34B variant on a 24GB GPU.
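
For reference, a minimal 4-bit loading sketch (argument names follow the snippets earlier in this thread; the model path is only an example):

from llava.model.builder import load_pretrained_model
from llava.mm_utils import get_model_name_from_path

model_path = "liuhaotian/llava-v1.6-34b"  # example; any v1.6 checkpoint works
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path,
    None,                                  # no model_base for a merged checkpoint
    get_model_name_from_path(model_path),
    load_8bit=False,
    load_4bit=True,                        # 4-bit quantization to fit ~24GB VRAM
    device_map="cuda:0",
    device="cuda:0",
)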

matankley commented 7 months ago

@haotian-liu I'm trying to run the 4-bit 34B on a 24GB GPU, but I'm pretty sure it offloads some of the weights to CPU because of low_cpu_mem_usage=True, which results in the above error: NotImplementedError: Cannot copy out of meta tensor; no data!

haotian-liu commented 7 months ago

@matankley

This is a demo loaded with 4-bit quantization on an A10G (24GB). Please check out the latest code base and retry; if it still does not work, please kindly share the commands you're using. Thank you.

tonywang10101 commented 5 months ago

For me, the vision_tower.is_loaded check wasn't behaving as anticipated. Manually calling vision_tower.load_model() resolved the issue.
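
Roughly what the manual workaround looks like (a sketch reusing the builder's own names; adjust the model path and device for your setup):

import torch
from llava.model.builder import load_pretrained_model
from llava.mm_utils import get_model_name_from_path

model_path = "liuhaotian/llava-v1.6-mistral-7b"  # example checkpoint
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path, None, get_model_name_from_path(model_path)
)

# Explicitly load the CLIP vision tower weights before moving them to the GPU,
# instead of relying solely on the is_loaded check.
vision_tower = model.get_vision_tower()
if not vision_tower.is_loaded:
    vision_tower.load_model()
vision_tower.to(device="cuda", dtype=torch.float16)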

hkfisherman commented 2 months ago

@haotian-liu Thanks for your great research! May I ask: the download is too slow for me here, so I want to save the model (15 .safetensors files) to Google Drive - [screenshot]

However, in Google Drive I am getting only 8 of them (20GB), which is half of the loaded model. Is there another save method for your LlavaLlamaForCausalLM?