Error message
Loading checkpoint shards: 10%|████▌ | 1/10 [00:00<00:06, 1.47it/s]09/16/2024 12:38:24 - INFO - llamafactory.model.patcher - Using KV cache for faster generation.
Loading checkpoint shards: 100%|████████████████████████████████████████████| 10/10 [00:06<00:00, 1.46it/s]
[INFO|modeling_utils.py:4507] 2024-09-16 12:38:30,387 >> All model checkpoint weights were used when initializing ChatGLMForConditionalGeneration.
[INFO|modeling_utils.py:4515] 2024-09-16 12:38:30,387 >> All the weights of ChatGLMForConditionalGeneration were initialized from the model checkpoint at /home/sunhao/glm-4-9b-chat.
If your task is similar to the task the model of the checkpoint was trained on, you can already use ChatGLMForConditionalGeneration for predictions without further training.
[INFO|configuration_utils.py:991] 2024-09-16 12:38:30,390 >> loading configuration file /home/sunhao/glm-4-9b-chat/generation_config.json
[INFO|configuration_utils.py:1038] 2024-09-16 12:38:30,391 >> Generate config GenerationConfig {
"do_sample": true,
"eos_token_id": [
151329,
151336,
151338
],
"max_length": 128000,
"pad_token_id": 151329,
"temperature": 0.8,
"top_p": 0.8
}
09/16/2024 12:38:30 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
09/16/2024 12:38:30 - INFO - llamafactory.model.adapter - Merged 1 adapter(s).
09/16/2024 12:38:30 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/glm49bchat/term/lora/sft/test
09/16/2024 12:38:30 - INFO - llamafactory.model.loader - all params: 9,399,951,360
Loading checkpoint shards: 90%|████████████████████████████████████████▌ | 9/10 [00:06<00:00, 1.44it/s][INFO|trainer.py:3819] 2024-09-16 12:38:31,279 >>
***** Running Prediction *****
[INFO|trainer.py:3821] 2024-09-16 12:38:31,279 >> Num examples = 100
[INFO|trainer.py:3824] 2024-09-16 12:38:31,279 >> Batch size = 1
Loading checkpoint shards: 100%|████████████████████████████████████████████| 10/10 [00:06<00:00, 1.44it/s]
09/16/2024 12:38:31 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
09/16/2024 12:38:32 - INFO - llamafactory.model.adapter - Merged 1 adapter(s).
09/16/2024 12:38:32 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/glm49bchat/term/lora/sft/test
09/16/2024 12:38:32 - INFO - llamafactory.model.loader - all params: 9,399,951,360
rank0: Traceback (most recent call last):
rank0: File "/home/sunhao/LLaMA-Factory/src/llamafactory/launcher.py", line 23, in
rank0: File "/home/sunhao/LLaMA-Factory/src/llamafactory/launcher.py", line 19, in launch
rank0: File "/home/sunhao/LLaMA-Factory/src/llamafactory/train/tuner.py", line 50, in run_exp
rank0: run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
rank0: File "/home/sunhao/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 117, in run_sft
rank0: predict_results = trainer.predict(dataset_module["eval_dataset"], metric_key_prefix="predict", **gen_kwargs)
rank0: File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/transformers/trainer_seq2seq.py", line 244, in predict
rank0: return super().predict(test_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
rank0: File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/transformers/trainer.py", line 3744, in predict
rank0: output = eval_loop(
rank0: File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/transformers/trainer.py", line 3857, in evaluation_loop
rank0: losses, logits, labels = self.prediction_step(model, inputs, prediction_loss_only, ignore_keys=ignore_keys)
rank0: File "/home/sunhao/LLaMA-Factory/src/llamafactory/train/sft/trainer.py", line 104, in prediction_step
rank0: loss, generated_tokens, _ = super().prediction_step(  # ignore the returned labels (may be truncated)
rank0: File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/transformers/trainer_seq2seq.py", line 310, in prediction_step
rank0: generated_tokens = self.model.generate(**generation_inputs, **gen_kwargs)
rank0: File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
rank0: return func(*args, **kwargs)
rank0: File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/transformers/generation/utils.py", line 2024, in generate
rank0: result = self._sample(
rank0: File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/transformers/generation/utils.py", line 3032, in _sample
rank0: model_kwargs = self._update_model_kwargs_for_generation(
rank0: File "/home/sunhao/.cache/huggingface/modules/transformers_modules/glm-4-9b-chat/modeling_chatglm.py", line 812, in _update_model_kwargs_for_generation
rank0: model_kwargs["past_key_values"] = self._extract_past_from_model_output(
rank0: TypeError: GenerationMixin._extract_past_from_model_output() got an unexpected keyword argument 'standardize_cache_format'
W0916 12:38:33.499000 140091173263168 torch/distributed/elastic/multiprocessing/api.py:858] Sending process 1022548 closing signal SIGTERM
E0916 12:38:33.914000 140091173263168 torch/distributed/elastic/multiprocessing/api.py:833] failed (exitcode: 1) local_rank: 0 (pid: 1022547) of binary: /home/sunhao/anaconda3/envs/LLaMA-Factory/bin/python
Traceback (most recent call last):
File "/home/sunhao/anaconda3/envs/LLaMA-Factory/bin/torchrun", line 8, in
sys.exit(main())
^^^^^^
File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/init.py", line 348, in wrapper
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/torch/distributed/run.py", line 901, in main
run(args)
File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/torch/distributed/run.py", line 892, in run
elastic_launch(
File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 133, in call
return launch_agent(self._config, self._entrypoint, list(args))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
/home/sunhao/LLaMA-Factory/src/llamafactory/launcher.py FAILED
Failures:
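The TypeError above points to a version mismatch: the model's remote code (modeling_chatglm.py, line 812) still passes standardize_cache_format to GenerationMixin._extract_past_from_model_output, a keyword that newer transformers releases no longer accept. A minimal check against the installed environment (a sketch, assuming the same conda env as in the log) is shown below:

```python
# Sketch: confirm whether the installed transformers still accepts the
# keyword that GLM-4's remote code passes (not part of the original report).
import inspect

from transformers.generation.utils import GenerationMixin

fn = getattr(GenerationMixin, "_extract_past_from_model_output", None)
if fn is None:
    # very recent transformers removed the method entirely
    print("_extract_past_from_model_output no longer exists")
else:
    # if 'standardize_cache_format' is absent from this signature, the
    # installed transformers is newer than what modeling_chatglm.py expects
    print(inspect.signature(fn))
```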
System Info
llamafactory version: 0.9.1.dev0

Reproduction
Parameters

model
model_name_or_path: model/glm-4-9b-chat
adapter_name_or_path: saves/glm49bchat/term/lora/sft/test

method
stage: sft
do_predict: true
finetuning_type: lora

dataset
eval_dataset: identity,alpaca_en_demo
template: glm4
cutoff_len: 1024
max_samples: 50
overwrite_cache: false
preprocessing_num_workers: 16

output
output_dir: score/glm49bchat/term/lora/sft/test
overwrite_output_dir: true

eval
per_device_eval_batch_size: 1
predict_with_generate: true
ddp_timeout: 180000000
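A likely resolution is to align the two sides: either pin transformers to a release that still accepts the keyword, or update the glm-4-9b-chat model files so modeling_chatglm.py matches the installed transformers. As a stopgap, here is a hypothetical monkey-patch sketch (the name _extract_past_compat is my own; apply it before the model is loaded) that swallows the dropped keyword:

```python
# Hypothetical stopgap, not the project's official fix: accept and ignore
# the 'standardize_cache_format' keyword that GLM-4's remote code still
# passes, so generation can proceed on a newer transformers release.
from transformers.generation.utils import GenerationMixin

_orig = GenerationMixin._extract_past_from_model_output

def _extract_past_compat(self, outputs, standardize_cache_format=False):
    # newer transformers dropped this keyword; swallow it and delegate
    return _orig(self, outputs)

GenerationMixin._extract_past_from_model_output = _extract_past_compat
```

Pinning or updating the model files remains the cleaner fix; the patch only papers over the signature change.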