modelscope / ms-swift

Use PEFT or Full-parameter to finetune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
https://swift.readthedocs.io/zh-cn/latest/Instruction/index.html
Apache License 2.0
4.43k stars, 389 forks

AttributeError: module 'transformers_modules.InternVL2-2B-1epoch.tokenization_internlm2' has no attribute 'InternLM2Tokenizer' #1663

Open guihonghao opened 3 months ago

guihonghao commented 3 months ago

torchrun \
    --nnodes $ARNOLD_WORKER_NUM \
    --node_rank $ARNOLD_ID \
    --master_addr $METIS_WORKER_0_HOST \
    --nproc_per_node $ARNOLD_WORKER_GPU \
    --master_port $port \
    examples/pytorch/llm/llm_sft.py \
    --model_type 'internvl2-2b' \
    --model_id_or_path $BASE_PATH/playground/models/InternVL2-2B-1epoch \
    --sft_type 'lora' \
    --tuner_backend 'peft' \
    --template_type 'AUTO' \
    --dtype 'AUTO' \

Training the InternVL2-2B model with the script above fails with the error below. What causes it, and how can it be fixed?

AttributeError: module 'transformers_modules.InternVL2-2B-1epoch.tokenization_internlm2' has no attribute 'InternLM2Tokenizer'

    tokenizer_class = get_class_from_dynamic_module(class_ref, pretrained_model_name_or_path, **kwargs)
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/dynamic_module_utils.py", line 500, in get_class_from_dynamic_module
    return get_class_in_module(class_name, final_module.replace(".py", ""))
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/dynamic_module_utils.py", line 201, in get_class_in_module
    return getattr(module, class_name)
AttributeError: module 'transformers_modules.InternVL2-2B-1epoch.tokenization_internlm2' has no attribute 'InternLM2Tokenizer'
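One way to narrow this down is to check whether the cached copy of the remote-code file actually contains the class named in the error. The sketch below is a diagnostic only, not a fix, and it does not use any transformers API. The helper name `defines_class` is hypothetical, and the cache location mentioned afterwards is an assumption based on the default Hugging Face cache layout.

```python
import ast
from pathlib import Path


def defines_class(source_path, class_name):
    """Return True if the Python file at source_path defines class_name at top level."""
    path = Path(source_path)
    if not path.is_file():
        return False
    try:
        tree = ast.parse(path.read_text(encoding="utf-8"))
    except SyntaxError:
        # A truncated or half-written file will usually fail to parse.
        return False
    return any(
        isinstance(node, ast.ClassDef) and node.name == class_name
        for node in tree.body
    )
```

Assuming the default cache directory, the file to inspect would be something like `~/.cache/huggingface/modules/transformers_modules/InternVL2-2B-1epoch/tokenization_internlm2.py`. If `defines_class(path, "InternLM2Tokenizer")` returns False on a failing worker right after the crash, that would suggest the worker imported an incomplete copy of the file.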

tastelikefeet commented 3 months ago

Does this reproduce reliably? Someone has reported this error before, but we have had trouble reproducing it.

guihonghao commented 3 months ago

The error occurs in multi-node, multi-GPU runs. Downgrading to transformers==4.37.2 did not help; the error still occurs.
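An error that appears only with multiple processes and only intermittently is consistent with a race between workers that populate and import the same cached remote-code files at the same time. The thread does not confirm this cause, so the following is only a sketch of one possible workaround under that assumption: serialize the load on each node with an advisory file lock. The helper names and the lock path are hypothetical, and `fcntl` is available on Unix-like systems only.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_file_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until other holders release the lock
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def load_serialized(loader, lock_path="/tmp/hf_dynamic_module.lock"):
    """Run loader() while holding the lock, so local processes load one at a time."""
    with exclusive_file_lock(lock_path):
        return loader()
```

A call site could look like `load_serialized(lambda: AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True))`. The lock file should live on node-local disk, because advisory locks on network filesystems are not always reliable.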

guihonghao commented 3 months ago

Traceback (most recent call last):
  File "/mnt/bn/ghh-test/code/swift/examples/pytorch/llm/llm_sft.py", line 10, in <module>
    output = sft_main()
  File "/mnt/bn/ghh-test/code/swift/swift/utils/run_utils.py", line 32, in x_main
    result = llm_x(args, **kwargs)
  File "/mnt/bn/ghh-test/code/swift/swift/llm/sft.py", line 215, in llm_sft
    model, tokenizer = get_model_tokenizer(
  File "/mnt/bn/ghh-test/code/swift/swift/llm/utils/model.py", line 6341, in get_model_tokenizer
    model, tokenizer = get_function(model_dir, torch_dtype, model_kwargs, load_model, **kwargs)
  File "/mnt/bn/ghh-test/code/swift/swift/llm/utils/model.py", line 5854, in get_model_tokenizer_minicpm_v_2_x
    processor = AutoProcessor.from_pretrained(model_dir, trust_remote_code=True)
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/models/auto/processing_auto.py", line 309, in from_pretrained
    return processor_class.from_pretrained(
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/processing_utils.py", line 466, in from_pretrained
    args = cls._get_arguments_from_pretrained(pretrained_model_name_or_path, **kwargs)
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/processing_utils.py", line 512, in _get_arguments_from_pretrained
    args.append(attribute_class.from_pretrained(pretrained_model_name_or_path, **kwargs))
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 797, in from_pretrained
    tokenizer_class = get_class_from_dynamic_module(class_ref, pretrained_model_name_or_path, **kwargs)
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/dynamic_module_utils.py", line 500, in get_class_from_dynamic_module
    return get_class_in_module(class_name, final_module.replace(".py", ""))
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/dynamic_module_utils.py", line 201, in get_class_in_module
    return getattr(module, class_name)
AttributeError: module 'transformers_modules.MiniCPM-V-2_6.tokenization_minicpmv_fast' has no attribute 'MiniCPMVTokenizerFast'

MiniCPM-V-2_6 hits the same error.

guihonghao commented 3 months ago

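Is there a solution for this? Multi-node, multi-GPU runs fail almost every time (roughly 9 runs out of 10 hit this error and 1 succeeds), and every crash means going back into the queue.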
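If the failures do come from workers sharing one dynamic-module cache directory (an assumption, not something confirmed in this thread), another workaround to try is giving each rank its own cache. transformers reads the `HF_MODULES_CACHE` environment variable when it is imported, and torchrun exports `RANK` to every worker. The helper name and directory layout below are hypothetical.

```python
import os


def use_per_rank_modules_cache(base_dir, environ=os.environ):
    """Point HF_MODULES_CACHE at a directory unique to this process's rank."""
    # torchrun sets RANK for every worker; fall back to "0" outside torchrun.
    rank = environ.get("RANK", "0")
    cache_dir = os.path.join(base_dir, f"rank_{rank}")
    os.makedirs(cache_dir, exist_ok=True)
    environ["HF_MODULES_CACHE"] = cache_dir
    return cache_dir
```

The function has to run before the first `import transformers` in the training entry point, because the variable is read at import time. Each rank then copies the remote-code files into its own directory, at the cost of some duplicated disk usage.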

guihonghao commented 2 months ago

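This problem is still unresolved. I keep getting the error AttributeError: module 'transformers_modules.InternVL2-2B-1epoch.tokenization_internlm2' has no attribute 'InternLM2Tokenizer'. It still fails even after renaming the model directory to InternVL2-2B so that it matches the original model name.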

rushzy commented 1 month ago

Same error here. It shows up intermittently on a single node with multiple GPUs.

lyj798444739 commented 3 weeks ago

I get this error with multi-node, multi-GPU training. Is there a way to fix it?

lyj798444739 commented 3 weeks ago

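Quoting guihonghao: "This problem is still unresolved. I keep getting the error AttributeError: module 'transformers_modules.InternVL2-2B-1epoch.tokenization_internlm2' has no attribute 'InternLM2Tokenizer'. It still fails even after renaming the model directory to InternVL2-2B so that it matches the original model name."

Did you manage to solve this?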