InternLM / xtuner

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
https://xtuner.readthedocs.io/zh-cn/latest/
Apache License 2.0

chatglm3-6B fine-tuning error #796

Open poisonwine opened 5 months ago

poisonwine commented 5 months ago

Now that the new version of xtuner has added the dispatch mechanism, is fine-tuning chatglm3-6b no longer supported?

```
File "/mnt/afs/xtuner/xtuner/model/sft.py", line 93, in __init__
    dispatch_modules(self.llm, use_varlen_attn=use_varlen_attn)
File "/mnt/afs/xtuner/xtuner/model/modules/dispatch/__init__.py", line 266, in dispatch_modules
    check(type(model).__name__)
File "/mnt/afs/xtuner/xtuner/model/modules/dispatch/__init__.py", line 261, in check
    assert TRANSFORMERS_VERSION >= LOWEST_TRANSFORMERS_VERSION[
KeyError: 'ChatGLMForConditionalGeneration'
```
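For context, a minimal sketch of the failing call path, reconstructed from the traceback above. The standalone call and the Hub model ID `THUDM/chatglm3-6b` are assumptions; in the real run `dispatch_modules` is invoked from inside `xtuner/model/sft.py`:

```python
# Reproduction sketch (assumed, based on the traceback): passing a
# ChatGLM model through xtuner's dispatch raises KeyError because
# 'ChatGLMForConditionalGeneration' has no entry in
# LOWEST_TRANSFORMERS_VERSION.
from transformers import AutoModelForCausalLM
from xtuner.model.modules.dispatch import dispatch_modules

llm = AutoModelForCausalLM.from_pretrained(
    'THUDM/chatglm3-6b', trust_remote_code=True)

# Called standalone here only to isolate the failure; in training it
# runs during model construction in xtuner/model/sft.py.
dispatch_modules(llm, use_varlen_attn=False)
# -> KeyError: 'ChatGLMForConditionalGeneration'
```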

I checked, and indeed chatglm is not in the list:

```python
LOWEST_TRANSFORMERS_VERSION = dict(
    InternLM2ForCausalLM=digit_version('4.36'),
    InternLMForCausalLM=digit_version('4.36'),
    LlamaForCausalLM=digit_version('4.36'),
    Phi3ForCausalLM=digit_version('4.39'),
    MistralForCausalLM=digit_version('4.36'),
    # Training mixtral with lower version may lead to nccl timeout
    # Refer to https://github.com/microsoft/DeepSpeed/issues/5066
    MixtralForCausalLM=digit_version('4.40'),
    CohereForCausalLM=digit_version('4.40'),
    Qwen2ForCausalLM=digit_version('4.39'),
    Qwen2MoeForCausalLM=digit_version('4.40'),
    DeepseekV2ForCausalLM=digit_version('4.40'),
)
```
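One possible local workaround, sketched below under the assumption that `dispatch_modules` is a no-op for architectures it does not recognize: have `check()` in `xtuner/model/modules/dispatch/__init__.py` skip the version assert for unregistered architectures instead of letting the dict lookup raise `KeyError`, so models like ChatGLM keep their original HuggingFace forward. This is an untested sketch, not an upstream fix; the surrounding code is inferred from the traceback.

```python
# Hedged sketch of a patched check(): fall back gracefully for
# architectures missing from LOWEST_TRANSFORMERS_VERSION.
import warnings

def check(model_name):
    if model_name not in LOWEST_TRANSFORMERS_VERSION:
        # No dispatch optimizations registered for this architecture;
        # leave the model's own modeling code untouched.
        warnings.warn(f'{model_name} has no registered dispatch in xtuner; '
                      'falling back to the original implementation.')
        return
    assert TRANSFORMERS_VERSION >= LOWEST_TRANSFORMERS_VERSION[model_name], \
        (f'transformers {TRANSFORMERS_VERSION} is too old for {model_name}; '
         f'need >= {LOWEST_TRANSFORMERS_VERSION[model_name]}.')
```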

fanfanaqiuqiu commented 4 months ago

+1, same problem here, cannot fine-tune chatglm3.