07/13 01:19:43 - mmengine - INFO - before_train in EvaluateChatHook.
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore gradient_checkpointing_kwargs in case you passed it).Please update to the new format on your modeling file. To use the new format, you need to completely remove the definition of the method _set_gradient_checkpointing in your model.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token.As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
rank0: Traceback (most recent call last):
rank0: File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/tools/train.py", line 360, in
rank0: File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/tools/train.py", line 356, in main
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 1200, in train
rank0: model = self.train_loop.run() # type: ignore
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/runner/loops.py", line 271, in run
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 1271, in call_hook
rank0: getattr(hook, fn_name)(self, **kwargs)
rank0: File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/engine/hooks/evaluate_chat_hook.py", line 234, in before_train
rank0: self._generate_samples(runner, max_new_tokens=50)
rank0: File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/engine/hooks/evaluate_chat_hook.py", line 220, in _generate_samples
rank0: self._eval_images(runner, model, device, max_new_tokens,
rank0: File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/engine/hooks/evaluate_chat_hook.py", line 152, in _eval_images
rank0: generation_output = model.generate(
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
rank0: return func(*args, **kwargs)
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/transformers/generation/utils.py", line 1914, in generate
rank0: result = self._sample(
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/transformers/generation/utils.py", line 2651, in _sample
rank0: outputs = self(
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
rank0: return self._call_impl(*args, **kwargs)
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
rank0: return forward_call(*args, **kwargs)
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/accelerate/hooks.py", line 169, in new_forward
rank0: output = module._old_forward(*args, **kwargs)
rank0: File "/mnt/petrelfs/chenying1/.cache/huggingface/modules/transformers_modules/internlm/internlm2-chat-7b/70e6cdc9643ce7e3d9a369fb984dc5f1a1b2cec6/modeling_internlm2.py", line 1204, in forward
rank0: outputs = self.model(
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
rank0: return self._call_impl(*args, **kwargs)
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
rank0: return forward_call(*args, **kwargs)
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/accelerate/hooks.py", line 169, in new_forward
rank0: output = module._old_forward(*args, **kwargs)
rank0: File "/mnt/petrelfs/chenying1/.cache/huggingface/modules/transformers_modules/internlm/internlm2-chat-7b/70e6cdc9643ce7e3d9a369fb984dc5f1a1b2cec6/modeling_internlm2.py", line 976, in forward
rank0: causal_mask = self._update_causal_mask(
rank0: File "/mnt/petrelfs/chenying1/.cache/huggingface/modules/transformers_modules/internlm/internlm2-chat-7b/70e6cdc9643ce7e3d9a369fb984dc5f1a1b2cec6/modeling_internlm2.py", line 1097, in _update_causal_mask
rank0: causal_mask = torch.arange(target_length, device=device) > cache_position.reshape(-1, 1)
rank0: RuntimeError: The size of tensor a (0) must match the size of tensor b (592) at non-singleton dimension 0
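For context on the RuntimeError above: the comparison in `_update_causal_mask` fails because one operand ends up with 0 elements while the other has 592 (the prompt length), and PyTorch cannot broadcast a size-0 dimension against a non-singleton one. Below is a minimal sketch of the same class of error; the shapes are assumptions chosen only to mirror the message, and it is a simplified comparison, not the exact expression or internal state from the model.

```python
import torch

# Assumed shapes, chosen to mirror the error message:
# an empty index tensor (0 elements) compared elementwise against
# a 592-element position tensor.
mask_positions = torch.arange(0)      # size 0   -> "tensor a (0)"
cache_position = torch.arange(592)    # size 592 -> "tensor b (592)"

# Neither size is 1, so broadcasting fails with:
# RuntimeError: The size of tensor a (0) must match the size of
# tensor b (592) at non-singleton dimension 0
causal_mask = mask_positions > cache_position
```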
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.3
[WARNING] using untested triton version (2.3.1), only 1.0.0 is known to be compatible
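Separately, the attention-mask warning at the top of the log means `generate()` is receiving `input_ids` without an `attention_mask`, and because the pad token equals the eos token the mask cannot be inferred. Below is a hedged sketch of what the warning asks for, assuming a plain Hugging Face tokenizer/model pair rather than the exact inputs built by xtuner's EvaluateChatHook.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: loading the same checkpoint directly with transformers;
# this only illustrates passing the mask explicitly, not the hook's code path.
tokenizer = AutoTokenizer.from_pretrained(
    "internlm/internlm2-chat-7b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "internlm/internlm2-chat-7b", trust_remote_code=True)

inputs = tokenizer("Describe the image in one sentence.", return_tensors="pt")
output_ids = model.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],  # explicit mask, as the warning requests
    max_new_tokens=50,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```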