ms-swift: Use PEFT or Full-parameter to finetune 250+ LLMs or 35+ MLLMs. (Qwen2, GLM4, Internlm2, Yi, Llama3, Llava, MiniCPM-V, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)
Phi3-V Can Not Disable FlashAttention #1209
Closed
airkid closed 4 days ago
Setting `--use_flash_attn false` has no effect; FlashAttention stays enabled for Phi3-V:
https://github.com/modelscope/swift/blob/38e4d96bdab88f984b7f3bf8f94453f6ae63fac3/swift/llm/utils/model.py#L1255C24-L1255C59
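For context, a minimal sketch of the behavior the flag would be expected to have. The `"flash_attention_2"` and `"eager"` values are real `attn_implementation` options in the transformers `from_pretrained` API; the helper name and the wiring to swift's flag are assumptions for illustration, not the actual swift source.

```python
# Hypothetical sketch (not the swift source): how --use_flash_attn would be
# expected to select the transformers attention backend.
def resolve_attn_implementation(use_flash_attn: bool) -> str:
    # When the flag is false, fall back to the default "eager" attention
    # instead of forcing FlashAttention.
    return "flash_attention_2" if use_flash_attn else "eager"
```

The reported bug would correspond to the model-registration code ignoring the `False` branch and always requesting FlashAttention.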