QwenLM / Qwen

The official repo of Qwen (通义千问), the chat & pretrained large language model proposed by Alibaba Cloud.

[Error when running the fine-tuning script] You are using an old version of the checkpointing format that is deprecated #1143

Closed Maojianzeng closed 5 months ago

Maojianzeng commented 6 months ago

Is there an existing issue / discussion for this?

Is there an existing answer for this in the FAQ?

Current Behavior

```bash
python finetune.py \
  --model_name_or_path /mnt/public_data/models/modelscope/hub/Qwen/Qwen-7B-Chat \
  --data_path chat.json \
  --fp16 True \
  --output_dir output_qwen \
  --num_train_epochs 5 \
  --per_device_train_batch_size 2 \
  --per_device_eval_batch_size 1 \
  --gradient_accumulation_steps 8 \
  --evaluation_strategy "no" \
  --save_strategy "steps" \
  --save_steps 1000 \
  --save_total_limit 10 \
  --learning_rate 3e-4 \
  --weight_decay 0.1 \
  --adam_beta2 0.95 \
  --warmup_ratio 0.01 \
  --lr_scheduler_type "cosine" \
  --logging_steps 1 \
  --report_to "none" \
  --model_max_length 512 \
  --lazy_preprocess True \
  --gradient_checkpointing \
  --use_lora
```

```text
[2024-03-12 17:25:25,410] [INFO] [real_accelerator.py:161:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/home/maojianzeng/miniconda3/envs/qwen/lib/python3.10/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
  warnings.warn(
The model is automatically converting to bf16 for faster inference. If you want to disable the automatic precision, please manually add bf16/fp16/fp32=True to "AutoModelForCausalLM.from_pretrained".
Try importing flash-attention for faster inference...
Warning: import flash_attn rotary fail, please install FlashAttention rotary to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/rotary
Warning: import flash_attn rms_norm fail, please install FlashAttention layer_norm to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/layer_norm
Warning: import flash_attn fail, please install FlashAttention to get higher efficiency https://github.com/Dao-AILab/flash-attention
Loading checkpoint shards: 100%|██████████| 8/8 [00:00<00:00, 8.84it/s]
trainable params: 143,130,624 || all params: 7,864,455,168 || trainable%: 1.8199687192876373
Loading data...
Formatting inputs...Skip in lazy mode
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore gradient_checkpointing_kwargs in case you passed it). Please update to the new format on your modeling file. To use the new format, you need to completely remove the definition of the method _set_gradient_checkpointing in your model.
```
```text
  0%|          | 0/5 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/mnt/public_data/LLMfinetune/Qwen/finetune.py", line 374, in <module>
    train()
  File "/mnt/public_data/LLMfinetune/Qwen/finetune.py", line 367, in train
    trainer.train()
  File "/home/maojianzeng/miniconda3/envs/qwen/lib/python3.10/site-packages/transformers/trainer.py", line 1537, in train
    return inner_training_loop(
  File "/home/maojianzeng/miniconda3/envs/qwen/lib/python3.10/site-packages/transformers/trainer.py", line 1821, in _inner_training_loop
    for step, inputs in enumerate(epoch_iterator):
  File "/home/maojianzeng/miniconda3/envs/qwen/lib/python3.10/site-packages/accelerate/data_loader.py", line 448, in __iter__
    current_batch = next(dataloader_iter)
  File "/home/maojianzeng/miniconda3/envs/qwen/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
  File "/home/maojianzeng/miniconda3/envs/qwen/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 674, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/maojianzeng/miniconda3/envs/qwen/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/maojianzeng/miniconda3/envs/qwen/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/mnt/public_data/LLMfinetune/Qwen/finetune.py", line 224, in __getitem__
    ret = preprocess([self.raw_data[i]["conversations"]], self.tokenizer, self.max_len)
KeyError: 'conversations'
  0%|
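The last frame of the traceback indexes self.raw_data[i]["conversations"] inside finetune.py's __getitem__, so every record in the file passed via --data_path must provide a "conversations" key. A minimal, hypothetical sanity check (not part of the repo, and assuming chat.json is a JSON array of objects) along these lines can confirm whether the file matches:

```python
import json

# Hypothetical check, not part of the Qwen repo: verify that every record in the
# --data_path file carries the "conversations" key that finetune.py indexes above.
with open("chat.json", encoding="utf-8") as f:
    records = json.load(f)

bad = [i for i, record in enumerate(records) if "conversations" not in record]
if bad:
    print(f"{len(bad)} record(s) missing 'conversations', e.g. index {bad[0]}: {records[bad[0]]}")
else:
    print(f"all {len(records)} records look OK")
```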

Expected Behavior

No response

Steps To Reproduce

No response

Environment

- OS: Ubuntu 20.04
- Python: 3.10.13
- Transformers: 4.36.2
- PyTorch: 2.1.2
- CUDA: 12.2

Anything else?

No response

jklj077 commented 6 months ago

You can ignore the warning about the deprecated checkpointing format.

The error is raised because the data is not in the expected format: the KeyError: 'conversations' shows that the records in chat.json are missing the required "conversations" field.
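For reference, a minimal sketch of the layout finetune.py appears to expect for the --data_path file, based on the KeyError above and the example in the repo's fine-tuning documentation (field names other than "conversations", such as "id", "from", and "value", should be double-checked against the README):

```python
import json

# Illustrative sketch only: writes a tiny chat.json in the general shape the Qwen
# fine-tuning script expects. Verify the exact field names against the repo's README.
example = [
    {
        "id": "identity_0",
        "conversations": [
            {"from": "user", "value": "Hello, who are you?"},
            {"from": "assistant", "value": "I am Qwen, a large language model from Alibaba Cloud."},
        ],
    }
]

with open("chat.json", "w", encoding="utf-8") as f:
    json.dump(example, f, ensure_ascii=False, indent=2)
```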

github-actions[bot] commented 5 months ago

This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.