Error message
Loading checkpoint shards: 10%|████▌ | 1/10 [00:00<00:06, 1.47it/s]09/16/2024 12:38:24 - INFO - llamafactory.model.patcher - Using KV cache for faster generation.
Loading checkpoint shards: 100%|████████████████████████████████████████████| 10/10 [00:06<00:00, 1.46it/s]
[INFO|modeling_utils.py:4507] 2024-09-16 12:38:30,387 >> All model checkpoint weights were used when initializing ChatGLMForConditionalGeneration.
[INFO|modeling_utils.py:4515] 2024-09-16 12:38:30,387 >> All the weights of ChatGLMForConditionalGeneration were initialized from the model checkpoint at /home/sunhao/glm-4-9b-chat.
If your task is similar to the task the model of the checkpoint was trained on, you can already use ChatGLMForConditionalGeneration for predictions without further training.
[INFO|configuration_utils.py:991] 2024-09-16 12:38:30,390 >> loading configuration file /home/sunhao/glm-4-9b-chat/generation_config.json
[INFO|configuration_utils.py:1038] 2024-09-16 12:38:30,391 >> Generate config GenerationConfig {
"do_sample": true,
"eos_token_id": [
151329,
151336,
151338
],
"max_length": 128000,
"pad_token_id": 151329,
"temperature": 0.8,
"top_p": 0.8
}
09/16/2024 12:38:30 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
09/16/2024 12:38:30 - INFO - llamafactory.model.adapter - Merged 1 adapter(s).
09/16/2024 12:38:30 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/glm49bchat/term/lora/sft/test
09/16/2024 12:38:30 - INFO - llamafactory.model.loader - all params: 9,399,951,360
Loading checkpoint shards: 90%|████████████████████████████████████████▌ | 9/10 [00:06<00:00, 1.44it/s][INFO|trainer.py:3819] 2024-09-16 12:38:31,279 >>
***** Running Prediction *****
[INFO|trainer.py:3821] 2024-09-16 12:38:31,279 >> Num examples = 100
[INFO|trainer.py:3824] 2024-09-16 12:38:31,279 >> Batch size = 1
Loading checkpoint shards: 100%|████████████████████████████████████████████| 10/10 [00:06<00:00, 1.44it/s]
09/16/2024 12:38:31 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
09/16/2024 12:38:32 - INFO - llamafactory.model.adapter - Merged 1 adapter(s).
09/16/2024 12:38:32 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/glm49bchat/term/lora/sft/test
09/16/2024 12:38:32 - INFO - llamafactory.model.loader - all params: 9,399,951,360
rank0: Traceback (most recent call last):
rank0: File "/home/sunhao/LLaMA-Factory/src/llamafactory/launcher.py", line 23, in
rank0: File "/home/sunhao/LLaMA-Factory/src/llamafactory/launcher.py", line 19, in launch
rank0: File "/home/sunhao/LLaMA-Factory/src/llamafactory/train/tuner.py", line 50, in run_exp
rank0: run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
rank0: File "/home/sunhao/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 117, in run_sft
rank0: predict_results = trainer.predict(dataset_module["eval_dataset"], metric_key_prefix="predict", **gen_kwargs)
rank0: File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/transformers/trainer_seq2seq.py", line 244, in predict
rank0: return super().predict(test_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
rank0: File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/transformers/trainer.py", line 3744, in predict
rank0: output = eval_loop(
rank0: File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/transformers/trainer.py", line 3857, in evaluation_loop
rank0: losses, logits, labels = self.prediction_step(model, inputs, prediction_loss_only, ignore_keys=ignore_keys)
rank0: File "/home/sunhao/LLaMA-Factory/src/llamafactory/train/sft/trainer.py", line 104, in prediction_step
rank0: loss, generated_tokens, _ = super().prediction_step(  # ignore the returned labels (may be truncated)
rank0: File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/transformers/trainer_seq2seq.py", line 310, in prediction_step
rank0: generated_tokens = self.model.generate(**generation_inputs, **gen_kwargs)
rank0: File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
rank0: return func(*args, **kwargs)
rank0: File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/transformers/generation/utils.py", line 2024, in generate
rank0: result = self._sample(
rank0: File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/transformers/generation/utils.py", line 3032, in _sample
rank0: model_kwargs = self._update_model_kwargs_for_generation(
rank0: File "/home/sunhao/.cache/huggingface/modules/transformers_modules/glm-4-9b-chat/modeling_chatglm.py", line 812, in _update_model_kwargs_for_generation
rank0: model_kwargs["past_key_values"] = self._extract_past_from_model_output(
rank0: TypeError: GenerationMixin._extract_past_from_model_output() got an unexpected keyword argument 'standardize_cache_format'
W0916 12:38:33.499000 140091173263168 torch/distributed/elastic/multiprocessing/api.py:858] Sending process 1022548 closing signal SIGTERM
E0916 12:38:33.914000 140091173263168 torch/distributed/elastic/multiprocessing/api.py:833] failed (exitcode: 1) local_rank: 0 (pid: 1022547) of binary: /home/sunhao/anaconda3/envs/LLaMA-Factory/bin/python
Traceback (most recent call last):
File "/home/sunhao/anaconda3/envs/LLaMA-Factory/bin/torchrun", line 8, in
sys.exit(main())
^^^^^^
File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/init.py", line 348, in wrapper
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/torch/distributed/run.py", line 901, in main
run(args)
File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/torch/distributed/run.py", line 892, in run
elastic_launch(
File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 133, in call
return launch_agent(self._config, self._entrypoint, list(args))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sunhao/anaconda3/envs/LLaMA-Factory/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
/home/sunhao/LLaMA-Factory/src/llamafactory/launcher.py FAILED
Failures:
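The TypeError above points to a version mismatch: the model's remote code (modeling_chatglm.py, line 812) still passes standardize_cache_format to GenerationMixin._extract_past_from_model_output, a keyword that newer transformers releases no longer accept. A minimal check against the installed environment (a sketch, assuming the same conda env as in the log) is shown below:

```python
# Sketch: confirm whether the installed transformers still accepts the
# keyword that GLM-4's remote code passes (not part of the original report).
import inspect

from transformers.generation.utils import GenerationMixin

fn = getattr(GenerationMixin, "_extract_past_from_model_output", None)
if fn is None:
    # very recent transformers removed the method entirely
    print("_extract_past_from_model_output no longer exists")
else:
    # if 'standardize_cache_format' is absent from this signature, the
    # installed transformers is newer than what modeling_chatglm.py expects
    print(inspect.signature(fn))
```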
System Info
llamafactory version: 0.9.1.dev0

Reproduction
Parameters

model
model_name_or_path: model/glm-4-9b-chat
adapter_name_or_path: saves/glm49bchat/term/lora/sft/test

method
stage: sft
do_predict: true
finetuning_type: lora

dataset
eval_dataset: identity,alpaca_en_demo
template: glm4
cutoff_len: 1024
max_samples: 50
overwrite_cache: false
preprocessing_num_workers: 16

output
output_dir: score/glm49bchat/term/lora/sft/test
overwrite_output_dir: true

eval
per_device_eval_batch_size: 1
predict_with_generate: true
ddp_timeout: 180000000
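A likely resolution is to align the two sides: either pin transformers to a release that still accepts the keyword, or update the glm-4-9b-chat model files so modeling_chatglm.py matches the installed transformers. As a stopgap, here is a hypothetical monkey-patch sketch (the name _extract_past_compat is my own; apply it before the model is loaded) that swallows the dropped keyword:

```python
# Hypothetical stopgap, not the project's official fix: accept and ignore
# the 'standardize_cache_format' keyword that GLM-4's remote code still
# passes, so generation can proceed on a newer transformers release.
from transformers.generation.utils import GenerationMixin

_orig = GenerationMixin._extract_past_from_model_output

def _extract_past_compat(self, outputs, standardize_cache_format=False):
    # newer transformers dropped this keyword; swallow it and delegate
    return _orig(self, outputs)

GenerationMixin._extract_past_from_model_output = _extract_past_compat
```

Pinning or updating the model files remains the cleaner fix; the patch only papers over the signature change.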