hunterhome opened this issue 4 months ago
### Reminder
- [x] I have read the README and searched the existing issues.
### System Info
[2024-06-07 10:17:14,980] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
- llamafactory version: 0.7.2.dev0
- Platform: Linux-5.10.0-198.0.0.111.oe2203sp3.aarch64-aarch64-with-glibc2.34
- Python version: 3.10.14
- PyTorch version: 2.2.0 (NPU)
- Transformers version: 4.41.2
- Datasets version: 2.19.2
- Accelerate version: 0.30.1
- PEFT version: 0.11.1
- TRL version: 0.9.3
- NPU type: Ascend910B2
- CANN version: 8.0.RC2.alpha001
- DeepSpeed version: 0.13.2
### Reproduction
llamafactory-cli train \
    --stage ppo \
    --do_train True \
    --model_name_or_path ZhipuAI/glm-4-9b-chat \
    --preprocessing_num_workers 16 \
    --finetuning_type lora \
    --template glm4 \
    --flash_attn auto \
    --dataset_dir data \
    --dataset disc-law-sft-triplet \
    --cutoff_len 8192 \
    --learning_rate 5e-05 \
    --num_train_epochs 3.0 \
    --max_samples 100000 \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --optim adamw_torch \
    --packing False \
    --report_to none \
    --output_dir saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-44-37 \
    --bf16 True \
    --plot_loss True \
    --ddp_timeout 180000000 \
    --include_num_input_tokens_seen True \
    --adapter_name_or_path saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 \
    --lora_rank 8 \
    --lora_alpha 16 \
    --lora_dropout 0 \
    --lora_target all \
    --reward_model saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06 \
    --reward_model_type lora \
    --ppo_score_norm True \
    --top_k 0 \
    --top_p 0.9

### Expected behavior

_No response_

### Others

[2024-06-07 10:10:55,970] torch.distributed.run: [WARNING]
[2024-06-07 10:10:55,970] torch.distributed.run: [WARNING] *****************************************
[2024-06-07 10:10:55,970] torch.distributed.run: [WARNING] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
[2024-06-07 10:10:55,970] torch.distributed.run: [WARNING] *****************************************
[2024-06-07 10:11:03,623] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,661] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,705] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,818] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,836] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,905] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,955] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,991] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
06/07/2024 10:11:17 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
06/07/2024 10:11:17 - INFO - llamafactory.hparams.parser - Process rank: 0, device: npu:0, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
2024-06-07 10:11:17,434 - modelscope - INFO - PyTorch version 2.2.0 Found.
2024-06-07 10:11:17,436 - modelscope - INFO - Loading ast index from /root/.cache/modelscope/ast_indexer
2024-06-07 10:11:17,490 - modelscope - INFO - Loading done! Current index file version is 1.14.0, with md5 ceb78a2ac746b5506819a47dbbf0e37c and a total number of 976 components indexed
06/07/2024 10:11:17 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
06/07/2024 10:11:17 - INFO - llamafactory.hparams.parser - Process rank: 7, device: npu:7, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16 06/07/2024 10:11:17 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training. 06/07/2024 10:11:17 - INFO - llamafactory.hparams.parser - Process rank: 4, device: npu:4, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16 06/07/2024 10:11:17 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training. 06/07/2024 10:11:17 - INFO - llamafactory.hparams.parser - Process rank: 6, device: npu:6, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16 06/07/2024 10:11:17 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training. 06/07/2024 10:11:17 - INFO - llamafactory.hparams.parser - Process rank: 2, device: npu:2, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16 06/07/2024 10:11:17 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training. 06/07/2024 10:11:17 - INFO - llamafactory.hparams.parser - Process rank: 1, device: npu:1, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16 06/07/2024 10:11:18 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training. 06/07/2024 10:11:18 - INFO - llamafactory.hparams.parser - Process rank: 5, device: npu:5, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16 [INFO|tokenization_utils_base.py:2106] 2024-06-07 10:11:18,235 >> loading file tokenizer.model [INFO|tokenization_utils_base.py:2106] 2024-06-07 10:11:18,235 >> loading file added_tokens.json [INFO|tokenization_utils_base.py:2106] 2024-06-07 10:11:18,236 >> loading file special_tokens_map.json [INFO|tokenization_utils_base.py:2106] 2024-06-07 10:11:18,236 >> loading file tokenizer_config.json [INFO|tokenization_utils_base.py:2106] 2024-06-07 10:11:18,236 >> loading file tokenizer.json 06/07/2024 10:11:18 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training. 06/07/2024 10:11:18 - INFO - llamafactory.hparams.parser - Process rank: 3, device: npu:3, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16 [WARNING|logging.py:314] 2024-06-07 10:11:19,288 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. 06/07/2024 10:11:19 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words. 06/07/2024 10:11:19 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json... Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. 06/07/2024 10:11:19 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words. Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. 06/07/2024 10:11:19 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words. Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. 06/07/2024 10:11:20 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words. 
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. 06/07/2024 10:11:20 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words. Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. 06/07/2024 10:11:20 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words. Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. 06/07/2024 10:11:20 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words. Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. 06/07/2024 10:11:22 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words. 06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json... 06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json... 06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json... 06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json... 06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json... 06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json... 06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json... Running tokenizer on dataset (num_proc=16): 100%|█████████████████████████████████████████████████████████████████| 16000/16000 [00:38<00:00, 416.91 examples/s] input_ids: [151331, 151333, 151336, 198, 100698, 103309, 101138, 3837, 113094, 110590, 105177, 99312, 8994, 98379, 106170, 117921, 3837, 98546, 20, 98334, 21, 98424, 99146, 98385, 99082, 117225, 3837, 108592, 98696, 105181, 103757, 117537, 98380, 99043, 100451, 102337, 103273, 106156, 118828, 98798, 105181, 101376, 98314, 117055, 98550, 109534, 3837, 98459, 101247, 105079, 98634, 123900, 98324, 117537, 98595, 101676, 111602, 99916, 98760, 101642, 98335, 3837, 108592, 98696, 105181, 98453, 105529, 109290, 98396, 98381, 103941, 98798, 105181, 99195, 118894, 3837, 103078, 98711, 109534, 105079, 98322, 107801, 98993, 114731, 100129, 101242, 3837, 98547, 110664, 99999, 105181, 109487, 98365, 3837, 108592, 98696, 105181, 98701, 107801, 98993, 114731, 103941, 98798, 105181, 98314, 99527, 113995, 3837, 99704, 124187, 116767, 101806, 98583, 109695, 98829, 110960, 99416, 121952, 109055, 112246, 117442, 101242, 3837, 117442, 101242, 100048, 98875, 121424, 99054, 99893, 98649, 105862, 98433, 112998, 99108, 120250, 106318, 100035, 1773, 98365, 98379, 118828, 98798, 105181, 105420, 3837, 101113, 99131, 100588, 98634, 100059, 98493, 108592, 98696, 105181, 98607, 103278, 98344, 98817, 1773, 98379, 103171, 3837, 109534, 108634, 99532, 102492, 20, 11, 124206, 13, 24, 98575, 3837, 109055, 108634, 99532, 102492, 16, 11, 19, 101474, 13, 102486, 98575, 3837, 117442, 101242, 108634, 99532, 102492, 17, 11, 24, 99951, 13, 99082, 98575, 3837, 99054, 99893, 98649, 106508, 99108, 120250, 108634, 99532, 102492, 24, 11, 102114, 21, 98575, 3837, 111086, 101832, 99532, 106234, 102492, 98729, 11, 101135, 17, 13, 21, 98575, 1773, 101409, 100867, 3837, 108592, 98696, 105181, 98319, 119626, 98322, 100297, 98479, 110416, 3837, 118828, 98798, 105181, 5373, 100547, 105181, 5373, 104464, 105181, 110065, 3837, 
110664, 99999, 105181, 98314, 98697, 98856, 3837, 100059, 111413, 99565, 98990, 3837, 116550, 99304, 3837, 103171, 102622, 98560, 3837, 108592, 98696, 105181, 98314, 127251, 98381, 102070, 98539, 98404, 102243, 105483, 3837, 106144, 102919, 1773, 151337] inputs: [gMASK] <sop> <|user|> 基于下列案件,推测可能的判决结果。 经审理查明,2015年6月21日15时许,被告人白某某在大东区小河沿公交车站乘坐被害人张某某驾驶的133路公交车,当车辆行驶至沈阳市大东区东陵西路26号附近时,被告人白某某因未能下车而与司机张某某发生争执,并在该公交车行驶中用手拉拽档杆,被证人韩某某拉开后,被告人白某某又用手拉拽司机张某某的右胳膊,导致该车失控撞向右侧马路边停放的轿车和一个路灯杆,路灯杆折断后将福锅记炖品店的牌匾砸坏。后经被害人张某某报警,公安人员赶至现场将被告人白某某传唤到案。经鉴定,公交车受损价值人民币5,189.9元,轿车受损价值人民币1,449.57元,路灯杆受损价值人民币2,927.15元,福锅记饭店牌匾受损价值人民币9,776元,本案损失价值共计人民币19,342.6元。上述事实,被告人白某某在庭审中亦无异议,被害人张某某、朱某某、詹某某陈述,证人韩某某的证言,现场勘察笔录,视听资料,鉴定结论书,被告人白某某的供述与辩解等证据证实,足以认定。 <|assistant|> [INFO|configuration_utils.py:731] 2024-06-07 10:12:08,107 >> loading configuration file /root/.cache/modelscope/hub/ZhipuAI/glm-4-9b-chat/config.json [INFO|configuration_utils.py:731] 2024-06-07 10:12:08,110 >> loading configuration file /root/.cache/modelscope/hub/ZhipuAI/glm-4-9b-chat/config.json [INFO|configuration_utils.py:796] 2024-06-07 10:12:08,111 >> Model config ChatGLMConfig { "_name_or_path": "/root/.cache/modelscope/hub/ZhipuAI/glm-4-9b-chat", "add_bias_linear": false, "add_qkv_bias": true, "apply_query_key_layer_scaling": true, "apply_residual_connection_post_layernorm": false, "architectures": [ "ChatGLMModel" ], "attention_dropout": 0.0, "attention_softmax_in_fp32": true, "auto_map": { "AutoConfig": "configuration_chatglm.ChatGLMConfig", "AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration", "AutoModelForCausalLM": "modeling_chatglm.ChatGLMForConditionalGeneration", "AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration", "AutoModelForSequenceClassification": "modeling_chatglm.ChatGLMForSequenceClassification" }, "bias_dropout_fusion": true, "classifier_dropout": null, "eos_token_id": [ 151329, 151336, 151338 ], "ffn_hidden_size": 13696, "fp32_residual_connection": false, "hidden_dropout": 0.0, "hidden_size": 4096, "kv_channels": 128, "layernorm_epsilon": 1.5625e-07, "model_type": "chatglm", "multi_query_attention": true, "multi_query_group_num": 2, "num_attention_heads": 32, "num_hidden_layers": 40, "num_layers": 40, "original_rope": true, "pad_token_id": 151329, "padded_vocab_size": 151552, "post_layer_norm": true, "rmsnorm": true, "rope_ratio": 500, "seq_length": 131072, "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": "4.41.2", "use_cache": true, "vocab_size": 151552 } [INFO|modeling_utils.py:3471] 2024-06-07 10:12:08,159 >> loading weights file /root/.cache/modelscope/hub/ZhipuAI/glm-4-9b-chat/model.safetensors.index.json [INFO|modeling_utils.py:1519] 2024-06-07 10:12:08,160 >> Instantiating ChatGLMForConditionalGeneration model under default dtype torch.bfloat16. [INFO|configuration_utils.py:962] 2024-06-07 10:12:08,162 >> Generate config GenerationConfig { "eos_token_id": [ 151329, 151336, 151338 ], "pad_token_id": 151329 } Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:06<00:00, 1.45it/s] [INFO|modeling_utils.py:4280] 2024-06-07 10:12:15,224 >> All model checkpoint weights were used when initializing ChatGLMForConditionalGeneration. [INFO|modeling_utils.py:4288] 2024-06-07 10:12:15,224 >> All the weights of ChatGLMForConditionalGeneration were initialized from the model checkpoint at /root/.cache/modelscope/hub/ZhipuAI/glm-4-9b-chat. 
If your task is similar to the task the model of the checkpoint was trained on, you can already use ChatGLMForConditionalGeneration for predictions without further training. [INFO|modeling_utils.py:3797] 2024-06-07 10:12:15,231 >> Generation config file not found, using a generation config created from the model config. 06/07/2024 10:12:15 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled. 06/07/2024 10:12:15 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation. 06/07/2024 10:12:15 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32. 06/07/2024 10:12:15 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA Loading checkpoint shards: 60%|██████████████████████████████████████████████████████████▏ | 6/10 [00:04<00:02, 1.35it/s]06/07/2024 10:12:15 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 06/07/2024 10:12:15 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files.. 06/07/2024 10:12:15 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model. 06/07/2024 10:12:15 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248 Loading checkpoint shards: 70%|███████████████████████████████████████████████████████████████████▉ | 7/10 [00:05<00:02, 1.39it/s]06/07/2024 10:12:16 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06 Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:06<00:00, 1.51it/s] 06/07/2024 10:12:17 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled. 06/07/2024 10:12:17 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation. 06/07/2024 10:12:17 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32. 06/07/2024 10:12:17 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:07<00:00, 1.42it/s] 06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled. 06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation. 06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32. 06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:07<00:00, 1.36it/s] 06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled. 06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation. 06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32. 
06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:07<00:00, 1.35it/s] Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:07<00:00, 1.34it/s] 06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled. 06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation. 06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32. 06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA 06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled. 06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation. 06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32. 06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA Loading checkpoint shards: 90%|███████████████████████████████████████████████████████████████████████████████████████▎ | 9/10 [00:07<00:00, 1.19it/s]06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files.. 06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model. 06/07/2024 10:12:18 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248 06/07/2024 10:12:19 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files.. 06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model. 06/07/2024 10:12:19 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248 Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:08<00:00, 1.19it/s] 06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled. 06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation. 06/07/2024 10:12:19 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32. 
06/07/2024 10:12:19 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA 06/07/2024 10:12:19 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files.. 06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model. 06/07/2024 10:12:19 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06 06/07/2024 10:12:19 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248 06/07/2024 10:12:19 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06 06/07/2024 10:12:19 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 06/07/2024 10:12:19 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files.. 06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model. 06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files.. 06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model. 
06/07/2024 10:12:19 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248 06/07/2024 10:12:19 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248 06/07/2024 10:12:20 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06 06/07/2024 10:12:20 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06 Loading checkpoint shards: 90%|███████████████████████████████████████████████████████████████████████████████████████▎ | 9/10 [00:09<00:01, 1.02s/it]06/07/2024 10:12:20 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 06/07/2024 10:12:20 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06 06/07/2024 10:12:20 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files.. 06/07/2024 10:12:20 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model. 06/07/2024 10:12:20 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248 06/07/2024 10:12:21 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06 Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:10<00:00, 1.05s/it] 06/07/2024 10:12:21 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled. 06/07/2024 10:12:21 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation. 06/07/2024 10:12:21 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32. 06/07/2024 10:12:21 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA 06/07/2024 10:12:22 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 06/07/2024 10:12:22 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files.. 06/07/2024 10:12:22 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model. 
06/07/2024 10:12:22 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248 06/07/2024 10:12:23 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06 06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer - ***** Running training ***** 06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer - Num examples = 16000 06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer - Num Epochs = 3.0 06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer - Instantaneous batch size per device = 1 06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer - Total train batch size (w. parallel, buffer, distributed & accumulation) = 64 06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer - Gradient Accumulation steps = 8 06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer - Num optimization epochs per batch = 4 06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer - Total training steps = 750 06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer - Number of trainable parameters = 21180417 0%| | 0/750 [00:00<?, ?it/s]/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/generation/logits_process.py:1591: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at torch_npu/csrc/aten/common/TensorFactories.cpp:74.) scores_processed = torch.where(scores != scores, 0.0, scores) /data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/generation/logits_process.py:1591: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at torch_npu/csrc/aten/common/TensorFactories.cpp:74.) scores_processed = torch.where(scores != scores, 0.0, scores) /data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/generation/logits_process.py:1591: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at torch_npu/csrc/aten/common/TensorFactories.cpp:74.) scores_processed = torch.where(scores != scores, 0.0, scores) /data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/generation/logits_process.py:1591: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. 
For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at torch_npu/csrc/aten/common/TensorFactories.cpp:74.) scores_processed = torch.where(scores != scores, 0.0, scores) /data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/generation/logits_process.py:1591: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at torch_npu/csrc/aten/common/TensorFactories.cpp:74.) scores_processed = torch.where(scores != scores, 0.0, scores) /data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/generation/logits_process.py:1591: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at torch_npu/csrc/aten/common/TensorFactories.cpp:74.) scores_processed = torch.where(scores != scores, 0.0, scores) /data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/generation/logits_process.py:1591: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at torch_npu/csrc/aten/common/TensorFactories.cpp:74.) scores_processed = torch.where(scores != scores, 0.0, scores) /data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/generation/logits_process.py:1591: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at torch_npu/csrc/aten/common/TensorFactories.cpp:74.) scores_processed = torch.where(scores != scores, 0.0, scores) [rank1]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. 
This may have performance implications. (function npu_cpu_fallback) [rank0]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback) [rank7]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback) [rank2]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback) [rank6]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback) [rank3]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback) [rank4]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback) [rank5]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. 
(function npu_cpu_fallback)
  0%|          | 0/750 [00:14<?, ?it/s]
Traceback (most recent call last):
  File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
    launch()
  File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
    run_exp()
  File "/data/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
    run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 59, in run_ppo
    ppo_trainer.ppo_train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 220, in ppo_train
    mini_batch_rewards = self.get_rewards(mini_batch_queries, mini_batch_responses)
  File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 387, in get_rewards
    rewards.append(values[i, end_index].float().detach().cpu())  # use fp32 type
IndexError: index 213 is out of bounds for dimension 1 with size 1
Traceback (most recent call last):
  File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
    launch()
  File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
    run_exp()
  File "/data/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
    run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 59, in run_ppo
    ppo_trainer.ppo_train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 220, in ppo_train
    mini_batch_rewards = self.get_rewards(mini_batch_queries, mini_batch_responses)
  File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 387, in get_rewards
    rewards.append(values[i, end_index].float().detach().cpu())  # use fp32 type
IndexError: index 67 is out of bounds for dimension 1 with size 1
Traceback (most recent call last):
  File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
    launch()
  File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
    run_exp()
  File "/data/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
    run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 59, in run_ppo
    ppo_trainer.ppo_train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 220, in ppo_train
    mini_batch_rewards = self.get_rewards(mini_batch_queries, mini_batch_responses)
  File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 387, in get_rewards
    rewards.append(values[i, end_index].float().detach().cpu())  # use fp32 type
IndexError: index 379 is out of bounds for dimension 1 with size 1
Traceback (most recent call last):
  File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
    launch()
  File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
    run_exp()
  File "/data/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
    run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 59, in run_ppo
    ppo_trainer.ppo_train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 220, in ppo_train
    mini_batch_rewards = self.get_rewards(mini_batch_queries, mini_batch_responses)
  File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 387, in get_rewards
    rewards.append(values[i, end_index].float().detach().cpu())  # use fp32 type
IndexError: index 390 is out of bounds for dimension 1 with size 1
Traceback (most recent call last):
  File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
    launch()
  File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
    run_exp()
  File "/data/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
    run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 59, in run_ppo
    ppo_trainer.ppo_train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 220, in ppo_train
    mini_batch_rewards = self.get_rewards(mini_batch_queries, mini_batch_responses)
  File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 387, in get_rewards
    rewards.append(values[i, end_index].float().detach().cpu())  # use fp32 type
IndexError: index 408 is out of bounds for dimension 1 with size 1
Traceback (most recent call last):
  File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
    launch()
  File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
    run_exp()
  File "/data/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
    run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 59, in run_ppo
    ppo_trainer.ppo_train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 220, in ppo_train
    mini_batch_rewards = self.get_rewards(mini_batch_queries, mini_batch_responses)
  File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 387, in get_rewards
    rewards.append(values[i, end_index].float().detach().cpu())  # use fp32 type
IndexError: index 499 is out of bounds for dimension 1 with size 1
Traceback (most recent call last):
  File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
    launch()
  File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
    run_exp()
  File "/data/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
    run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 59, in run_ppo
    ppo_trainer.ppo_train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 220, in ppo_train
    mini_batch_rewards = self.get_rewards(mini_batch_queries, mini_batch_responses)
  File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 387, in get_rewards
    rewards.append(values[i, end_index].float().detach().cpu())  # use fp32 type
IndexError: index 501 is out of bounds for dimension 1 with size 1
Traceback (most recent call last):
  File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
    launch()
  File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
    run_exp()
  File "/data/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
    run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 59, in run_ppo
    ppo_trainer.ppo_train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 220, in ppo_train
    mini_batch_rewards = self.get_rewards(mini_batch_queries, mini_batch_responses)
  File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 387, in get_rewards
    rewards.append(values[i, end_index].float().detach().cpu())  # use fp32 type
IndexError: index 488 is out of bounds for dimension 1 with size 1
[2024-06-07 10:12:46,085] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2227860 closing signal SIGTERM
[2024-06-07 10:12:46,085] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2227861 closing signal SIGTERM
[2024-06-07 10:12:46,086] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2227862 closing signal SIGTERM
[2024-06-07 10:12:46,086] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2227863 closing signal SIGTERM
[2024-06-07 10:12:46,086] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2227864 closing signal SIGTERM
[2024-06-07 10:12:46,086] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2227865 closing signal SIGTERM
[2024-06-07 10:12:46,451] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 2227858) of binary: /data/anaconda3/envs/llama_factory/bin/python
Traceback (most recent call last):
  File "/data/anaconda3/envs/llama_factory/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 347, in wrapper
    return f(*args, **kwargs)
  File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/distributed/run.py", line 812, in main
    run(args)
  File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/distributed/run.py", line 803, in run
    elastic_launch(
  File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 135, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 268, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
/data/LLaMA-Factory/src/llamafactory/launcher.py FAILED
------------------------------------------------------------
Failures:
[1]:
  time      : 2024-06-07_10:12:46
  host      : localhost.localdomain
  rank      : 1 (local_rank: 1)
  exitcode  : 1 (pid: 2227859)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-06-07_10:12:46
  host      : localhost.localdomain
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 2227858)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================

Hi, did you manage to fine-tune GLM-4 on Ascend successfully? Could you share your script?
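A note on the IndexError above: the failing line looks up `values[i, end_index]`, and the message says dimension 1 only has size 1, i.e. the sequence position is being applied to a dimension that only holds the batch. Below is a minimal, hypothetical sketch of that situation; the shapes and the transposed value-head output are assumptions for illustration, not something confirmed by these logs.

```python
import torch

# Hypothetical reproduction of the failing lookup in get_rewards (trainer.py:387).
# Assumption: with per_device_train_batch_size=1, the value head returns `values`
# shaped (seq_len, batch) instead of (batch, seq_len), so dimension 1 has size 1.
batch_size, seq_len = 1, 512
values = torch.randn(seq_len, batch_size)   # transposed layout (assumed)
i, end_index = 0, 213                        # last response token of sample i

try:
    reward = values[i, end_index].float().detach().cpu()  # use fp32 type
except IndexError as err:
    print(err)  # index 213 is out of bounds for dimension 1 with size 1

# If the layout really is transposed, swapping the dimensions makes the lookup valid:
reward = values.transpose(0, 1)[i, end_index].float().detach().cpu()
```

If the GLM-4 value head does emit values with the sequence dimension first, that would explain why every rank fails with a different out-of-bounds index; this is only a hypothesis to help debugging.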
Do you have an inference script you could share?
Has anyone successfully fine-tuned GLM-4 on Ascend? I keep getting: The param dtype not implemented for DT_BFLOAT16, should be in dtype support list [DT_FLOAT16,DT_FLOAT,DT_DOUBLE,DT_INT8,DT_UINT8,DT_INT16,DT_INT32,DT_INT64,DT_BOOL,DT_COMPLEX64,DT_COMPLEX128,]. Can anyone help with this?
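For the DT_BFLOAT16 error above, one quick way to check whether the local CANN/driver stack accepts bfloat16 at all is a tiny probe like the sketch below. This is an untested illustration: it assumes torch_npu is installed and registers the `npu` device, and it is not an official workaround.

```python
import torch
import torch_npu  # assumption: Ascend PyTorch adapter is installed and exposes the "npu" device


def npu_supports_bf16(device: str = "npu:0") -> bool:
    """Try a tiny bfloat16 matmul on the NPU; report False if the op is rejected."""
    try:
        a = torch.ones((2, 2), dtype=torch.bfloat16, device=device)
        _ = a @ a
        return True
    except RuntimeError:
        return False


# If this prints False, retraining with --fp16 True instead of --bf16 True may be worth trying.
print(npu_supports_bf16())
```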
I would also like a GLM-4 fine-tuning script, or at least a recommended set of configuration parameters. Thanks.