magic-research / PLLaVA

Official repository for the paper PLLaVA

Error with Flash attention2 while testing the 34b demo #27

Closed zhangchunjie1999 closed 6 months ago

zhangchunjie1999 commented 6 months ago

When I run the command to test the 34b model demo:

    bash scripts/demo.sh MODELS/pllava-34b MODELS/pllava-34b

the following error occurs:

    Traceback (most recent call last):
      File "/data/miniconda3/envs/pllava/lib/python3.10/runpy.py", line 196, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "/data/miniconda3/envs/pllava/lib/python3.10/runpy.py", line 86, in _run_code
        exec(code, run_globals)
      File "/group/40010/esmezhang/PLLaVA/tasks/eval/demo/pllava_demo.py", line 246, in <module>
        chat = init_model(args)
      File "/group/40010/esmezhang/PLLaVA/tasks/eval/demo/pllava_demo.py", line 29, in init_model
        model, processor = load_pllava(
      File "/group/40010/esmezhang/PLLaVA/tasks/eval/model_utils.py", line 53, in load_pllava
        model = PllavaForConditionalGeneration.from_pretrained(repo_id, config=config, torch_dtype=torch.bfloat16)
      File "/data/miniconda3/envs/pllava/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3552, in from_pretrained
        model = cls(config, *model_args, **model_kwargs)
      File "/group/40010/esmezhang/PLLaVA/models/pllava/modeling_pllava.py", line 295, in __init__
        self.language_model = AutoModelForCausalLM.from_config(config.text_config, torch_dtype=config.torch_dtype, attn_implementation="flash_attention_2")
      File "/data/miniconda3/envs/pllava/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 437, in from_config
        return model_class._from_config(config, **kwargs)
      File "/data/miniconda3/envs/pllava/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1385, in _from_config
        config = cls._autoset_attn_implementation(
      File "/data/miniconda3/envs/pllava/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1454, in _autoset_attn_implementation
        cls._check_and_enable_flash_attn_2(
      File "/data/miniconda3/envs/pllava/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1557, in _check_and_enable_flash_attn_2
        raise ImportError(f"{preface} Flash Attention 2 is not available. {install_message}")
    ImportError: FlashAttention2 has been toggled on, but it cannot be used due to the following error: Flash Attention 2 is not available. Please refer to the documentation of https://huggingface.co/docs/transformers/perf_infer_gpu_one#flashattention-2 to install Flash Attention 2.
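The traceback shows that modeling_pllava.py hard-codes `attn_implementation="flash_attention_2"`, so the model cannot load unless the flash-attn package works in this environment. A generic way to check whether the package is importable (a diagnostic sketch, not specific to PLLaVA; the pip command is the one documented by the flash-attn project):

```bash
# Diagnostic sketch: verify that the flash-attn package is importable in this env.
python -c "import flash_attn; print(flash_attn.__version__)"

# If the import fails, flash-attn is typically installed with
# (requires a CUDA toolchain compatible with the installed torch build):
pip install flash-attn --no-build-isolation
```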

Running `transformers-cli env` in the terminal gives the following output:


How can I fix it?

zhangchunjie1999 commented 6 months ago

The hardware is two 40GB A100 GPUs.

gaowei724 commented 6 months ago

> The hardware is two 40GB A100 GPUs.

Hi, how did you fix this?

zhangchunjie1999 commented 6 months ago

Got your email~ Best wishes!

gaowei724 commented 6 months ago

> Got your email~ Best wishes!

Hi, have you solved the problem? I encountered the same error when running demo.py. If you have solved it, could you please share your solution? Thank you.

zhangchunjie1999 commented 6 months ago

It's solved now. You need to modify demo.sh and specify which GPU(s) to use.
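For reference, a minimal sketch of what such a change could look like (assumptions: demo.sh launches the demo via `python -m tasks.eval.demo.pllava_demo`, as the traceback suggests, and "specify the GPU" means pinning the visible devices with the standard `CUDA_VISIBLE_DEVICES` variable; the exact launch line in the real script may differ):

```bash
# Sketch only: the actual contents of scripts/demo.sh may differ.
# Restrict the demo to the two 40GB A100s before launching it.
export CUDA_VISIBLE_DEVICES=0,1

# ...original launch command of demo.sh, e.g.
# python -m tasks.eval.demo.pllava_demo <original arguments>
```

Equivalently, the variable can be set on the command line without editing the script: `CUDA_VISIBLE_DEVICES=0,1 bash scripts/demo.sh MODELS/pllava-34b MODELS/pllava-34b`.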


shauryat97 commented 3 months ago

@zhangchunjie1999, what modifications did you make to the demo.sh file?

zhangchunjie1999 commented 3 months ago

Got your email~ Best wishes!