Can you post the full trace?
@storuky Sure, here it is:
```
Traceback (most recent call last):
  File "InternLM-XComposer/finetune/finetune.py", line 311, in <module>
    train()
  File "InternLM-XComposer/finetune/finetune.py", line 251, in train
    model.vit.resize_pos()
    ^^^^^^^^^
  File "miniconda3/envs/py311/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1695, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'InternLMXComposerForCausalLM' object has no attribute 'vit'
[2024-02-15 16:38:19,094] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 1368756) of binary: miniconda3/envs/py311/bin/python
Traceback (most recent call last):
  File "miniconda3/envs/py311/bin/torchrun", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "miniconda3/envs/py311/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "miniconda3/envs/py311/lib/python3.11/site-packages/torch/distributed/run.py", line 806, in main
    run(args)
  File "miniconda3/envs/py311/lib/python3.11/site-packages/torch/distributed/run.py", line 797, in run
    elastic_launch(
  File "miniconda3/envs/py311/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "miniconda3/envs/py311/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
finetune.py FAILED
```
+1
Just fixed it: change `internlm/internlm-xcomposer-vl-7b` to `internlm/internlm-xcomposer2-vl-7b`.
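For context: the finetune.py in this repo targets XComposer2 and calls model.vit.resize_pos(), but the v1 checkpoint loads InternLMXComposerForCausalLM, which has no vit submodule, hence the AttributeError above. Here is a minimal sketch of the edit in finetune_lora.sh, assuming your copy of the script passes the checkpoint to finetune.py through a MODEL variable and --model_name_or_path (adjust the names if your script differs):

```bash
#!/usr/bin/env bash
# Sketch of the relevant part of finetune_lora.sh.
# The MODEL variable name is an assumption; match it to your script.

# Before: v1 checkpoint; its model class has no .vit attribute.
# MODEL="internlm/internlm-xcomposer-vl-7b"

# After: XComposer2 checkpoint, the one this finetune.py expects.
MODEL="internlm/internlm-xcomposer2-vl-7b"

torchrun --nproc_per_node=1 finetune.py \
    --model_name_or_path "$MODEL"
    # ...keep the rest of your LoRA/training arguments unchanged.
```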
Thanks @mrmuke!
I get this error when trying to execute the finetune_lora.sh script.