microsoft / LLaVA-Med

Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabilities.

ValueError: weight is on the meta device, we need a `value` to put in on 0. #94

Open Mike-ihr opened 3 weeks ago

Mike-ihr commented 3 weeks ago

When I run cli.py, I get the following error:

Traceback (most recent call last):
  File "/media/ubuntu/data/jixin/anaconda3/envs/llava-med/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/media/ubuntu/data/jixin/anaconda3/envs/llava-med/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/media/ubuntu/data/jixin/LLaVA-Med/llava/serve/cli.py", line 125, in <module>
    main(args)
  File "/media/ubuntu/data/jixin/LLaVA-Med/llava/serve/cli.py", line 32, in main
    tokenizer, model, image_processor, context_len = load_pretrained_model(args.model_path, args.model_base, model_name, args.load_8bit, args.load_4bit, device=args.device)
  File "/media/ubuntu/data/jixin/LLaVA-Med/llava/model/builder.py", line 33, in load_pretrained_model
    model = LlavaMistralForCausalLM.from_pretrained(
  File "/media/ubuntu/data/jixin/anaconda3/envs/llava-med/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3773, in from_pretrained
    dispatch_model(model, **device_map_kwargs)
  File "/media/ubuntu/data/jixin/anaconda3/envs/llava-med/lib/python3.10/site-packages/accelerate/big_modeling.py", line 371, in dispatch_model
    attach_align_device_hook_on_blocks(
  File "/media/ubuntu/data/jixin/anaconda3/envs/llava-med/lib/python3.10/site-packages/accelerate/hooks.py", line 506, in attach_align_device_hook_on_blocks
    add_hook_to_module(module, hook)
  File "/media/ubuntu/data/jixin/anaconda3/envs/llava-med/lib/python3.10/site-packages/accelerate/hooks.py", line 155, in add_hook_to_module
    module = hook.init_hook(module)
  File "/media/ubuntu/data/jixin/anaconda3/envs/llava-med/lib/python3.10/site-packages/accelerate/hooks.py", line 253, in init_hook
    set_module_tensor_to_device(module, name, self.execution_device)
  File "/media/ubuntu/data/jixin/anaconda3/envs/llava-med/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 267, in set_module_tensor_to_device
    raise ValueError(f"{tensor_name} is on the meta device, we need a `value` to put in on {device}.")
ValueError: weight is on the meta device, we need a `value` to put in on 0.

I traced the error to builder.py. The call to LlavaMistralForCausalLM.from_pretrained fails inside the following branch:

    if 'llava' in model_name.lower():
        # Load LLaVA model
        if 'mistral' in model_name.lower():
            tokenizer = AutoTokenizer.from_pretrained(model_path)
            model = LlavaMistralForCausalLM.from_pretrained(
                model_path,
                low_cpu_mem_usage=False,
                use_flash_attention_2=False,
                **kwargs
            )
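To check whether a given model name actually reaches the branch quoted from builder.py, the two substring tests can be reproduced in isolation. This is only a minimal sketch: `loader_branch` and its return labels are hypothetical names made up for illustration and are not part of LLaVA-Med, and the model name in the usage line is an assumed example.

```python
def loader_branch(model_name: str) -> str:
    """Return a label for the branch that a model name would select.

    Hypothetical helper: it mirrors only the two case-insensitive substring
    checks quoted from builder.py and does not load any model.
    """
    name = model_name.lower()
    if 'llava' in name:
        if 'mistral' in name:
            # The branch that calls LlavaMistralForCausalLM.from_pretrained.
            return 'llava-mistral'
        # A LLaVA name without 'mistral'; that code path is not shown in the issue.
        return 'llava-other'
    # Names without 'llava' skip the outer branch entirely.
    return 'non-llava'


if __name__ == '__main__':
    # Assumed example name; substitute the name derived from your own model path.
    print(loader_branch('llava-med-v1.5-mistral-7b'))
```

If the helper reports `llava-mistral` for the name in use, the failing `from_pretrained` call in the traceback is the one in that branch.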

How can I solve this problem?