tastelikefeet opened 2 weeks ago
Qwen2-VL needs transformers>=4.45.0.dev0, but swift needs transformers<4.45.0. How can this be fixed?
After installing swift, pip install git+https://github.com/huggingface/transformers.git
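Concretely, the order matters: install swift first, then let the source install of transformers override swift's <4.45.0 pin. A sketch of the workaround (assuming the PyPI package name is ms-swift; the transformers commit you get is whatever main happens to be at install time):

```shell
# Install swift first so its pinned dependencies resolve,
# then upgrade transformers past the pin from source.
pip install ms-swift
pip install git+https://github.com/huggingface/transformers.git
# Sanity check: should print a 4.45.0.dev0-or-later version
python -c "import transformers; print(transformers.__version__)"
```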
Qwen2-VL seems not to be compatible with FlashAttention? When I add "--use_flash_attn True", I encounter this error (CUDA_LAUNCH_BLOCKING was enabled to print the exact trace):
[rank0]: File "***/lib/python3.10/site-packages/transformers/models/qwen2_vl/modeling_qwen2_vl.py", line 182, in apply_multimodal_rotary_pos_emb
[rank0]: cos = cos[position_ids]
[rank0]: RuntimeError: CUDA error: device-side assert triggered
[rank0]: Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
EDIT: There seems to be a problem with the multimodal rotary position embedding introduced by Qwen2-VL. Even with Flash Attention turned off, I still encounter this error.
I do think this may be a bug in the Qwen2 code. I tried a fix in modeling_qwen2_vl.py, and it works.
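The actual patch isn't reproduced in the thread, but as an illustration of what the assert means: a device-side assert on cos[position_ids] usually indicates that some entry of position_ids falls outside the rotary embedding table. A minimal pure-Python diagnostic in that spirit (find_bad_positions is a hypothetical helper, not from the thread or the library):

```python
def find_bad_positions(position_ids, table_len):
    """Return (path, value) pairs in a nested position_ids structure
    that would index out of bounds in an embedding table of table_len rows."""
    bad = []

    def walk(x, path=()):
        if isinstance(x, int):
            if not (0 <= x < table_len):
                bad.append((path, x))
        else:
            for i, item in enumerate(x):
                walk(item, path + (i,))

    walk(position_ids)
    return bad

# Example: a rotary table of length 8 and one out-of-range id at [1][1].
print(find_bad_positions([[0, 1, 2], [3, 9, 5]], table_len=8))  # → [((1, 1), 9)]
```

Running a check like this on the CPU side before the indexing happens pinpoints the offending position instead of the opaque CUDA assert.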
It did work, thanks.
An error occurs when I fine-tune with LoRA on a V100:
RuntimeError: CUDA error: too many resources requested for launch
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
Could you provide the complete modeling_qwen2_vl.py file? I encountered an error while fine-tuning qwen2-vl-2b-instruct.
File "./miniconda3/envs/swift/lib/python3.11/site-packages/transformers/models/qwen2_vl/modeling_qwen2_vl.py", line 296, in forward
[rank1]: attn_weights = torch.matmul(q, k.transpose(1, 2)) / math.sqrt(self.head_dim)
AttributeError: 'VisionAttention' object has no attribute 'head_dim'
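The AttributeError suggests a transformers version where the vision attention module never sets head_dim on itself, even though the forward pass divides by sqrt(self.head_dim). A minimal stand-in (hypothetical plain-Python class, not the actual transformers code; the 1280/16 dimensions are illustrative) showing the usual fix of deriving head_dim in __init__:

```python
class VisionAttention:
    """Illustrative stand-in: derive head_dim from dim and num_heads
    so later code like q @ k.T / sqrt(self.head_dim) can rely on it."""

    def __init__(self, dim: int, num_heads: int) -> None:
        assert dim % num_heads == 0, "dim must divide evenly across heads"
        self.num_heads = num_heads
        self.head_dim = dim // num_heads  # set it here instead of assuming it exists

attn = VisionAttention(dim=1280, num_heads=16)
print(attn.head_dim)  # → 80
```

Pinning transformers to a commit where the attribute is defined (or applying the one-line __init__ fix above) resolves this class of error.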
Added an example of single-card A10 fine-tuning: https://github.com/modelscope/ms-swift/blob/main/docs/source_en/Multi-Modal/qwen2-vl-best-practice.md#image-ocr-fine-tuning
When fine-tuning on an A40, GPU memory grows without bound until CUDA out of memory. This is the command I used:
CUDA_VISIBLE_DEVICES=3 swift sft \
  --model_type qwen2-vl-7b-instruct \
  --model_id_or_path qwen/Qwen2-VL-7B-Instruct \
  --sft_type lora \
  --dataset dataset.json
https://github.com/modelscope/ms-swift/issues/1860
You can save memory by lowering the SIZE_FACTOR (e.g. 8) and MAX_PIXELS (e.g. 602112) environment variables.
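For intuition: 602112 = 768 × 28 × 28, i.e. a cap of roughly 768 patches of 28×28 pixels per image, so larger inputs get scaled down before reaching the vision tower. A simplified sketch of that capping (not the library's actual resize code; the 28-pixel patch size and the rounding policy are assumptions for illustration):

```python
import math

PATCH = 28  # assumed Qwen2-VL patch size

def cap_pixels(height: int, width: int, max_pixels: int = 602112):
    """Scale (height, width) down so height*width <= max_pixels,
    then round each side down to a multiple of the patch size."""
    if height * width > max_pixels:
        scale = math.sqrt(max_pixels / (height * width))
        height = int(height * scale)
        width = int(width * scale)
    # keep at least one patch per side
    height = max(PATCH, height // PATCH * PATCH)
    width = max(PATCH, width // PATCH * PATCH)
    return height, width

h, w = cap_pixels(2000, 1500)
print(h * w <= 602112)  # → True
```

Lowering MAX_PIXELS shrinks every image's patch count, which directly bounds the activation memory of the vision encoder.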
Full-parameter fine-tuning & freeze_vit support reference:
🎉 Fine-tuning (VQA/OCR/Grounding/Video) for the Qwen2-VL-Chat series models is now supported; please check the documentation below for details:
English
https://github.com/modelscope/ms-swift/blob/main/docs/source_en/Multi-Modal/qwen2-vl-best-practice.md
Chinese
https://github.com/modelscope/ms-swift/blob/main/docs/source/Multi-Modal/qwen2-vl%E6%9C%80%E4%BD%B3%E5%AE%9E%E8%B7%B5.md