InternLM / InternLM-XComposer

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Fine-tuning with ZeRO-3 raises an error while loading vision_tower. Is ZeRO-3 currently unsupported? #237

Open shaoyan1223 opened 3 months ago

shaoyan1223 commented 3 months ago

Traceback (most recent call last):
  File "/opt/tiger/internlm-xcomposer/finetune/finetune.py", line 311, in <module>
    train()
  File "/opt/tiger/internlm-xcomposer/finetune/finetune.py", line 242, in train
    model = transformers.AutoModelForCausalLM.from_pretrained(
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 479, in from_pretrained
    return model_class.from_pretrained(
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2675, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/usr/local/lib/python3.9/dist-packages/deepspeed/runtime/zero/partition_parameters.py", line 385, in wrapper
    f(module, *args, **kwargs)
  File "/home/tiger/.cache/huggingface/modules/transformers_modules/internlm-xcomposer2-vl-7b/modeling_internlm_xcomposer2.py", line 67, in __init__
    self.vit = build_vision_tower()
  File "/home/tiger/.cache/huggingface/modules/transformers_modules/internlm-xcomposer2-vl-7b/build_mlp.py", line 12, in build_vision_tower
    return CLIPVisionTower(vision_tower)
  File "/usr/local/lib/python3.9/dist-packages/deepspeed/runtime/zero/partition_parameters.py", line 385, in wrapper
    f(module, *args, **kwargs)
  File "/home/tiger/.cache/huggingface/modules/transformers_modules/internlm-xcomposer2-vl-7b/build_mlp.py", line 60, in __init__
    self.resize_pos()
  File "/home/tiger/.cache/huggingface/modules/transformers_modules/internlm-xcomposer2-vl-7b/build_mlp.py", line 91, in resize_pos
    pos_tokens = pos_tokens.reshape(-1, orig_size, orig_size,
Traceback (most recent call last):
  File "/opt/tiger/internlm-xcomposer/finetune/finetune.py", line 311, in <module>
RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 24, 24, 0] because the unspecified dimension size -1 can be any value and is ambiguous
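A note on why the reshape fails: the traceback passes through DeepSpeed's `partition_parameters.py` wrapper, which suggests the vision tower is being built under ZeRO-3 parameter partitioning. In that mode the local copy of a parameter can be an empty placeholder, so `resize_pos` sees a position embedding with 0 elements and a last dimension of 0. The sketch below is a pure-Python imitation of how a `-1` dimension is inferred, not PyTorch's actual implementation. The helper name `infer_reshape` and the embedding width 1024 are assumptions chosen for illustration.

```python
from math import prod


def infer_reshape(numel, shape):
    """Resolve a single -1 in `shape` for a tensor holding `numel` elements.

    Hypothetical helper that mimics reshape's shape inference for illustration.
    """
    if shape.count(-1) > 1:
        raise ValueError("only one dimension can be inferred")
    # Product of the explicitly given dimensions (empty product is 1).
    known = prod(d for d in shape if d != -1)
    if -1 not in shape:
        if known != numel:
            raise ValueError(f"shape {list(shape)} is invalid for {numel} elements")
        return list(shape)
    if known == 0:
        # n * 0 == 0 for every n, so no single value of -1 can be chosen.
        if numel == 0:
            raise ValueError(
                f"cannot reshape tensor of 0 elements into shape {list(shape)}: "
                "the -1 dimension is ambiguous"
            )
        raise ValueError(f"shape {list(shape)} is invalid for {numel} elements")
    if numel % known != 0:
        raise ValueError(f"shape {list(shape)} is invalid for {numel} elements")
    return [numel // known if d == -1 else d for d in shape]


# Fully materialized table: 24 x 24 patch positions, width 1024 (assumed value).
full_shape = infer_reshape(24 * 24 * 1024, [-1, 24, 24, 1024])

# Empty placeholder, as in the error above: 0 elements, last dimension 0.
try:
    infer_reshape(0, [-1, 24, 24, 0])
    placeholder_error = None
except ValueError as exc:
    placeholder_error = str(exc)
```

With the full table the `-1` resolves to a single batch dimension. With the empty placeholder every candidate value satisfies the element count, which matches the "ambiguous" wording in the RuntimeError.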

yuhangzang commented 3 months ago

We have not tested ZeRO-3 in our environment, so the code may conflict with it.
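Since ZeRO-3 is untested here, one possible fallback is ZeRO stage 2, which shards optimizer state and gradients but leaves the parameters whole on each rank. The snippet below is a minimal sketch of such a DeepSpeed config built as a Python dict. Whether it trains this model correctly is an assumption that has not been verified in this thread. The `"auto"` values assume the Hugging Face Trainer integration fills them in, and the output file name is hypothetical.

```python
import json

# Minimal ZeRO stage 2 config (sketch, untested with this repository).
ds_config = {
    "zero_optimization": {
        "stage": 2,                    # shard optimizer state and gradients only
        "overlap_comm": True,
        "contiguous_gradients": True,
    },
    "bf16": {"enabled": "auto"},       # "auto" assumes the HF Trainer integration
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}

# Serialize to JSON text; write it to a file such as ds_config_zero2_sketch.json
# (hypothetical name) and pass that path to the training script.
ds_config_json = json.dumps(ds_config, indent=2)
```

Because parameters are not partitioned at stage 2, the position embedding should be fully materialized when `resize_pos` runs, at the cost of higher per-GPU memory use than stage 3.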