07/13 01:19:43 - mmengine - INFO - before_train in EvaluateChatHook.
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore gradient_checkpointing_kwargs in case you passed it).Please update to the new format on your modeling file. To use the new format, you need to completely remove the definition of the method _set_gradient_checkpointing in your model.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token.As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
rank0: Traceback (most recent call last):
rank0: File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/tools/train.py", line 360, in
rank0: File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/tools/train.py", line 356, in main
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 1200, in train
rank0: model = self.train_loop.run() # type: ignore
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/runner/loops.py", line 271, in run
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 1271, in call_hook
rank0: getattr(hook, fn_name)(self, **kwargs)
rank0: File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/engine/hooks/evaluate_chat_hook.py", line 234, in before_train
rank0: self._generate_samples(runner, max_new_tokens=50)
rank0: File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/engine/hooks/evaluate_chat_hook.py", line 220, in _generate_samples
rank0: self._eval_images(runner, model, device, max_new_tokens,
rank0: File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/engine/hooks/evaluate_chat_hook.py", line 152, in _eval_images
rank0: generation_output = model.generate(
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
rank0: return func(*args, **kwargs)
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/transformers/generation/utils.py", line 1914, in generate
rank0: result = self._sample(
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/transformers/generation/utils.py", line 2651, in _sample
rank0: outputs = self(
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
rank0: return self._call_impl(*args, **kwargs)
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
rank0: return forward_call(*args, **kwargs)
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/accelerate/hooks.py", line 169, in new_forward
rank0: output = module._old_forward(*args, **kwargs)
rank0: File "/mnt/petrelfs/chenying1/.cache/huggingface/modules/transformers_modules/internlm/internlm2-chat-7b/70e6cdc9643ce7e3d9a369fb984dc5f1a1b2cec6/modeling_internlm2.py", line 1204, in forward
rank0: outputs = self.model(
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
rank0: return self._call_impl(*args, **kwargs)
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
rank0: return forward_call(*args, **kwargs)
rank0: File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/accelerate/hooks.py", line 169, in new_forward
rank0: output = module._old_forward(*args, **kwargs)
rank0: File "/mnt/petrelfs/chenying1/.cache/huggingface/modules/transformers_modules/internlm/internlm2-chat-7b/70e6cdc9643ce7e3d9a369fb984dc5f1a1b2cec6/modeling_internlm2.py", line 976, in forward
rank0: causal_mask = self._update_causal_mask(
rank0: File "/mnt/petrelfs/chenying1/.cache/huggingface/modules/transformers_modules/internlm/internlm2-chat-7b/70e6cdc9643ce7e3d9a369fb984dc5f1a1b2cec6/modeling_internlm2.py", line 1097, in _update_causal_mask
rank0: causal_mask = torch.arange(target_length, device=device) > cache_position.reshape(-1, 1)
rank0: RuntimeError: The size of tensor a (0) must match the size of tensor b (592) at non-singleton dimension 0
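For context on the RuntimeError above: the comparison in `_update_causal_mask` fails because one operand ends up with 0 elements while the other has 592 (the prompt length), and PyTorch cannot broadcast a size-0 dimension against a non-singleton one. Below is a minimal sketch of the same class of error; the shapes are assumptions chosen only to mirror the message, and it is a simplified comparison, not the exact expression or internal state from the model.

```python
import torch

# Assumed shapes, chosen to mirror the error message:
# an empty index tensor (0 elements) compared elementwise against
# a 592-element position tensor.
mask_positions = torch.arange(0)      # size 0   -> "tensor a (0)"
cache_position = torch.arange(592)    # size 592 -> "tensor b (592)"

# Neither size is 1, so broadcasting fails with:
# RuntimeError: The size of tensor a (0) must match the size of
# tensor b (592) at non-singleton dimension 0
causal_mask = mask_positions > cache_position
```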
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.3
[WARNING] using untested triton version (2.3.1), only 1.0.0 is known to be compatible
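Separately, the attention-mask warning at the top of the log means `generate()` is receiving `input_ids` without an `attention_mask`, and because the pad token equals the eos token the mask cannot be inferred. Below is a hedged sketch of what the warning asks for, assuming a plain Hugging Face tokenizer/model pair rather than the exact inputs built by xtuner's EvaluateChatHook.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: loading the same checkpoint directly with transformers;
# this only illustrates passing the mask explicitly, not the hook's code path.
tokenizer = AutoTokenizer.from_pretrained(
    "internlm/internlm2-chat-7b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "internlm/internlm2-chat-7b", trust_remote_code=True)

inputs = tokenizer("Describe the image in one sentence.", return_tensors="pt")
output_ids = model.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],  # explicit mask, as the warning requests
    max_new_tokens=50,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```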