[Question]: Fine-tuning with text classification fails with KeyError: 'eval_accuracy'

Please ask your question

The fine-tuning command I used:

python train.py \
    --do_train \
    --do_export \
    --model_name_or_path ernie-3.0-tiny-medium-v2-zh \
    --output_dir checkpoint \
    --device gpu \
    --num_train_epochs 100 \
    --early_stopping True \
    --early_stopping_patience 5 \
    --learning_rate 3e-5 \
    --max_length 128 \
    --per_device_eval_batch_size 32 \
    --per_device_train_batch_size 32 \
    --metric_for_best_model accuracy \
    --load_best_model_at_end \
    --logging_steps 40 \
    --evaluation_strategy epoch \
    --save_strategy epoch \
    --save_total_limit 1 \
    --train_path '/2T/Langchain-Ch/vanna2/modeltrain/PaddleNLP/applications/text_classification/data/train.txt' \
    --dev_path  '/2T/Langchain-Ch/vanna2/modeltrain/PaddleNLP/applications/text_classification/data/dev.txt' \
    --test_path '/2T/Langchain-Ch/vanna2/modeltrain/PaddleNLP/applications/text_classification/data/dev.txt' \
    --label_path '/2T/Langchain-Ch/vanna2/modeltrain/PaddleNLP/applications/text_classification/data/label.txt'
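
Since the command passes separate `--train_path`, `--dev_path` and `--test_path` files, one thing that may be worth checking before a run is whether each file actually yields examples. The sketch below is a minimal check, not part of PaddleNLP: `count_examples` is a hypothetical helper, and it assumes each data line has the form `text<TAB>label`, which may not match the format your copy of `train.py` expects.

```python
def count_examples(path, sep="\t"):
    """Return (non_empty_lines, well_formed_lines) for a data file.

    Assumption: a well-formed line is `text<sep>label`, with both parts non-empty.
    """
    total = 0
    well_formed = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line.strip():
                continue  # blank lines carry no example
            total += 1
            parts = line.split(sep)
            if len(parts) == 2 and all(p.strip() for p in parts):
                well_formed += 1
    return total, well_formed
```

Calling `count_examples` on the value passed to `--dev_path` and getting `(0, 0)` would mean the evaluation set is empty; a large gap between the two numbers would point at lines that are not tab-separated.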

Output of the training command:

/root/anaconda3/envs/paddlepaddle/lib/python3.9/site-packages/_distutils_hack/__init__.py:26: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")
[2024-06-07 14:18:04,964] [    INFO] - The default value for the training argument `--report_to` will change in v5 (from all installed integrations to none). In v5, you will need to use `--report_to all` to get the same behavior as now. You should start updating your code and make this info disappear :-).
[2024-06-07 14:18:04,965] [    INFO] - ============================================================
[2024-06-07 14:18:04,965] [    INFO] -      Model Configuration Arguments      
[2024-06-07 14:18:04,965] [    INFO] - paddle commit id              :41ba14f30600373df53839dbf763405cfacb5c92
[2024-06-07 14:18:04,965] [    INFO] - export_model_dir              :None
[2024-06-07 14:18:04,965] [    INFO] - model_name_or_path            :ernie-3.0-tiny-medium-v2-zh
[2024-06-07 14:18:04,965] [    INFO] - 
[2024-06-07 14:18:04,965] [    INFO] - ============================================================
[2024-06-07 14:18:04,965] [    INFO] -       Data Configuration Arguments      
[2024-06-07 14:18:04,965] [    INFO] - paddle commit id              :41ba14f30600373df53839dbf763405cfacb5c92
[2024-06-07 14:18:04,965] [    INFO] - bad_case_path                 :./data/bad_case.txt
[2024-06-07 14:18:04,965] [    INFO] - debug                         :True
[2024-06-07 14:18:04,966] [    INFO] - dev_path                      :/2T/Langchain-Ch/vanna2/modeltrain/PaddleNLP/applications/text_classification/data/dev.txt
[2024-06-07 14:18:04,966] [    INFO] - early_stopping                :True
[2024-06-07 14:18:04,966] [    INFO] - early_stopping_patience       :5
[2024-06-07 14:18:04,966] [    INFO] - label_path                    :/2T/Langchain-Ch/vanna2/modeltrain/PaddleNLP/applications/text_classification/data/label.txt
[2024-06-07 14:18:04,966] [    INFO] - max_length                    :128
[2024-06-07 14:18:04,966] [    INFO] - test_path                     :/2T/Langchain-Ch/vanna2/modeltrain/PaddleNLP/applications/text_classification/data/dev.txt
[2024-06-07 14:18:04,966] [    INFO] - train_path                    :/2T/Langchain-Ch/vanna2/modeltrain/PaddleNLP/applications/text_classification/data/train.txt
[2024-06-07 14:18:04,966] [    INFO] - 
[2024-06-07 14:18:04,966] [    INFO] - We are using <class 'paddlenlp.transformers.ernie.modeling.ErnieForSequenceClassification'> to load 'ernie-3.0-tiny-medium-v2-zh'.
[2024-06-07 14:18:04,967] [    INFO] - Loading weights file from cache at /root/.paddlenlp/models/ernie-3.0-tiny-medium-v2-zh/model_state.pdparams
[2024-06-07 14:18:05,391] [    INFO] - Loaded weights file from disk, setting weights to model.
W0607 14:18:05.395669 3517231 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.9, Driver API Version: 12.4, Runtime API Version: 11.8
W0607 14:18:05.396549 3517231 gpu_resources.cc:149] device: 0, cuDNN Version: 90.1.
[2024-06-07 14:18:05,848] [    INFO] - All model checkpoint weights were used when initializing ErnieForSequenceClassification.

[2024-06-07 14:18:05,848] [ WARNING] - Some weights of ErnieForSequenceClassification were not initialized from the model checkpoint at ernie-3.0-tiny-medium-v2-zh and are newly initialized: ['classifier.bias', 'classifier.weight', 'ernie.pooler.dense.bias', 'ernie.pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[2024-06-07 14:18:05,865] [    INFO] - We are using (<class 'paddlenlp.transformers.ernie.tokenizer.ErnieTokenizer'>, False) to load 'ernie-3.0-tiny-medium-v2-zh'.
[2024-06-07 14:18:05,890] [    INFO] - tokenizer config file saved in /root/.paddlenlp/models/ernie-3.0-tiny-medium-v2-zh/tokenizer_config.json
[2024-06-07 14:18:05,890] [    INFO] - Special tokens file saved in /root/.paddlenlp/models/ernie-3.0-tiny-medium-v2-zh/special_tokens_map.json
[2024-06-07 14:18:05,892] [    INFO] - The global seed is set to 42, local seed is set to 43 and random seed is set to 42.
[2024-06-07 14:18:06,094] [   DEBUG] - ============================================================
[2024-06-07 14:18:06,094] [   DEBUG] -     Training Configuration Arguments    
[2024-06-07 14:18:06,094] [   DEBUG] - paddle commit id              : 41ba14f30600373df53839dbf763405cfacb5c92
[2024-06-07 14:18:06,094] [   DEBUG] - paddlenlp commit id           : 3105c18b013e1cdcbf860af1c6c54f4e33c88ee7
[2024-06-07 14:18:06,094] [   DEBUG] - _no_sync_in_gradient_accumulation: True
[2024-06-07 14:18:06,094] [   DEBUG] - activation_quantize_type      : None
[2024-06-07 14:18:06,094] [   DEBUG] - adam_beta1                    : 0.9
[2024-06-07 14:18:06,094] [   DEBUG] - adam_beta2                    : 0.999
[2024-06-07 14:18:06,094] [   DEBUG] - adam_epsilon                  : 1e-08
[2024-06-07 14:18:06,095] [   DEBUG] - algo_list                     : None
[2024-06-07 14:18:06,095] [   DEBUG] - amp_custom_black_list         : None
[2024-06-07 14:18:06,095] [   DEBUG] - amp_custom_white_list         : None
[2024-06-07 14:18:06,095] [   DEBUG] - amp_master_grad               : False
[2024-06-07 14:18:06,095] [   DEBUG] - batch_num_list                : None
[2024-06-07 14:18:06,095] [   DEBUG] - batch_size_list               : None
[2024-06-07 14:18:06,095] [   DEBUG] - bf16                          : False
[2024-06-07 14:18:06,095] [   DEBUG] - bf16_full_eval                : False
[2024-06-07 14:18:06,095] [   DEBUG] - bias_correction               : False
[2024-06-07 14:18:06,095] [   DEBUG] - current_device                : gpu:0
[2024-06-07 14:18:06,095] [   DEBUG] - data_parallel_config          : 
[2024-06-07 14:18:06,095] [   DEBUG] - data_parallel_rank            : 0
[2024-06-07 14:18:06,096] [   DEBUG] - dataloader_drop_last          : False
[2024-06-07 14:18:06,096] [   DEBUG] - dataloader_num_workers        : 0
[2024-06-07 14:18:06,096] [   DEBUG] - dataset_rank                  : 0
[2024-06-07 14:18:06,096] [   DEBUG] - dataset_world_size            : 1
[2024-06-07 14:18:06,096] [   DEBUG] - device                        : gpu
[2024-06-07 14:18:06,096] [   DEBUG] - disable_tqdm                  : False
[2024-06-07 14:18:06,096] [   DEBUG] - distributed_dataloader        : False
[2024-06-07 14:18:06,096] [   DEBUG] - do_compress                   : False
[2024-06-07 14:18:06,096] [   DEBUG] - do_eval                       : True
[2024-06-07 14:18:06,096] [   DEBUG] - do_export                     : True
[2024-06-07 14:18:06,096] [   DEBUG] - do_predict                    : False
[2024-06-07 14:18:06,096] [   DEBUG] - do_train                      : True
[2024-06-07 14:18:06,096] [   DEBUG] - enable_auto_parallel          : False
[2024-06-07 14:18:06,096] [   DEBUG] - eval_accumulation_steps       : None
[2024-06-07 14:18:06,097] [   DEBUG] - eval_batch_size               : 32
[2024-06-07 14:18:06,097] [   DEBUG] - eval_steps                    : None
[2024-06-07 14:18:06,097] [   DEBUG] - evaluation_strategy           : IntervalStrategy.EPOCH
[2024-06-07 14:18:06,097] [   DEBUG] - flatten_param_grads           : False
[2024-06-07 14:18:06,097] [   DEBUG] - force_reshard_pp              : False
[2024-06-07 14:18:06,097] [   DEBUG] - fp16                          : False
[2024-06-07 14:18:06,097] [   DEBUG] - fp16_full_eval                : False
[2024-06-07 14:18:06,097] [   DEBUG] - fp16_opt_level                : O1
[2024-06-07 14:18:06,097] [   DEBUG] - gradient_accumulation_steps   : 1
[2024-06-07 14:18:06,097] [   DEBUG] - greater_is_better             : True
[2024-06-07 14:18:06,097] [   DEBUG] - hybrid_parallel_topo_order    : pp_first
[2024-06-07 14:18:06,097] [   DEBUG] - ignore_data_skip              : False
[2024-06-07 14:18:06,097] [   DEBUG] - ignore_load_lr_and_optim      : False
[2024-06-07 14:18:06,098] [   DEBUG] - ignore_save_lr_and_optim      : False
[2024-06-07 14:18:06,098] [   DEBUG] - input_dtype                   : int64
[2024-06-07 14:18:06,098] [   DEBUG] - input_infer_model_path        : None
[2024-06-07 14:18:06,098] [   DEBUG] - label_names                   : None
[2024-06-07 14:18:06,098] [   DEBUG] - lazy_data_processing          : True
[2024-06-07 14:18:06,098] [   DEBUG] - learning_rate                 : 3e-05
[2024-06-07 14:18:06,098] [   DEBUG] - load_best_model_at_end        : True
[2024-06-07 14:18:06,098] [   DEBUG] - load_sharded_model            : False
[2024-06-07 14:18:06,098] [   DEBUG] - local_process_index           : 0
[2024-06-07 14:18:06,098] [   DEBUG] - local_rank                    : -1
[2024-06-07 14:18:06,098] [   DEBUG] - log_level                     : -1
[2024-06-07 14:18:06,098] [   DEBUG] - log_level_replica             : -1
[2024-06-07 14:18:06,098] [   DEBUG] - log_on_each_node              : True
[2024-06-07 14:18:06,099] [   DEBUG] - logging_dir                   : checkpoint/runs/Jun07_14-18-04_ktgpu
[2024-06-07 14:18:06,099] [   DEBUG] - logging_first_step            : False
[2024-06-07 14:18:06,099] [   DEBUG] - logging_steps                 : 40
[2024-06-07 14:18:06,099] [   DEBUG] - logging_strategy              : IntervalStrategy.STEPS
[2024-06-07 14:18:06,099] [   DEBUG] - logical_process_index         : 0
[2024-06-07 14:18:06,099] [   DEBUG] - lr_end                        : 1e-07
[2024-06-07 14:18:06,099] [   DEBUG] - lr_scheduler_type             : SchedulerType.LINEAR
[2024-06-07 14:18:06,099] [   DEBUG] - max_evaluate_steps            : -1
[2024-06-07 14:18:06,099] [   DEBUG] - max_grad_norm                 : 1.0
[2024-06-07 14:18:06,099] [   DEBUG] - max_steps                     : -1
[2024-06-07 14:18:06,100] [   DEBUG] - metric_for_best_model         : accuracy
[2024-06-07 14:18:06,100] [   DEBUG] - minimum_eval_times            : None
[2024-06-07 14:18:06,100] [   DEBUG] - moving_rate                   : 0.9
[2024-06-07 14:18:06,100] [   DEBUG] - no_cuda                       : False
[2024-06-07 14:18:06,100] [   DEBUG] - num_cycles                    : 0.5
[2024-06-07 14:18:06,100] [   DEBUG] - num_train_epochs              : 100.0
[2024-06-07 14:18:06,100] [   DEBUG] - onnx_format                   : True
[2024-06-07 14:18:06,100] [   DEBUG] - optim                         : OptimizerNames.ADAMW
[2024-06-07 14:18:06,100] [   DEBUG] - optimizer_name_suffix         : None
[2024-06-07 14:18:06,100] [   DEBUG] - output_dir                    : checkpoint
[2024-06-07 14:18:06,100] [   DEBUG] - overwrite_output_dir          : False
[2024-06-07 14:18:06,101] [   DEBUG] - past_index                    : -1
[2024-06-07 14:18:06,101] [   DEBUG] - per_device_eval_batch_size    : 32
[2024-06-07 14:18:06,101] [   DEBUG] - per_device_train_batch_size   : 32
[2024-06-07 14:18:06,101] [   DEBUG] - pipeline_parallel_config      : 
[2024-06-07 14:18:06,101] [   DEBUG] - pipeline_parallel_degree      : -1
[2024-06-07 14:18:06,101] [   DEBUG] - pipeline_parallel_rank        : 0
[2024-06-07 14:18:06,101] [   DEBUG] - power                         : 1.0
[2024-06-07 14:18:06,101] [   DEBUG] - prediction_loss_only          : False
[2024-06-07 14:18:06,101] [   DEBUG] - process_index                 : 0
[2024-06-07 14:18:06,101] [   DEBUG] - prune_embeddings              : False
[2024-06-07 14:18:06,102] [   DEBUG] - recompute                     : False
[2024-06-07 14:18:06,102] [   DEBUG] - remove_unused_columns         : True
[2024-06-07 14:18:06,102] [   DEBUG] - report_to                     : ['visualdl']
[2024-06-07 14:18:06,102] [   DEBUG] - resume_from_checkpoint        : None
[2024-06-07 14:18:06,102] [   DEBUG] - round_type                    : round
[2024-06-07 14:18:06,102] [   DEBUG] - run_name                      : checkpoint
[2024-06-07 14:18:06,102] [   DEBUG] - save_on_each_node             : False
[2024-06-07 14:18:06,102] [   DEBUG] - save_sharded_model            : False
[2024-06-07 14:18:06,102] [   DEBUG] - save_steps                    : 100
[2024-06-07 14:18:06,103] [   DEBUG] - save_strategy                 : IntervalStrategy.EPOCH
[2024-06-07 14:18:06,103] [   DEBUG] - save_total_limit              : 1
[2024-06-07 14:18:06,103] [   DEBUG] - scale_loss                    : 32768
[2024-06-07 14:18:06,103] [   DEBUG] - seed                          : 42
[2024-06-07 14:18:06,103] [   DEBUG] - sep_parallel_degree           : -1
[2024-06-07 14:18:06,103] [   DEBUG] - sharding                      : []
[2024-06-07 14:18:06,103] [   DEBUG] - sharding_degree               : -1
[2024-06-07 14:18:06,103] [   DEBUG] - sharding_parallel_config      : 
[2024-06-07 14:18:06,103] [   DEBUG] - sharding_parallel_degree      : -1
[2024-06-07 14:18:06,103] [   DEBUG] - sharding_parallel_rank        : 0
[2024-06-07 14:18:06,104] [   DEBUG] - should_load_dataset           : True
[2024-06-07 14:18:06,104] [   DEBUG] - should_load_sharding_stage1_model: False
[2024-06-07 14:18:06,104] [   DEBUG] - should_log                    : True
[2024-06-07 14:18:06,104] [   DEBUG] - should_save                   : True
[2024-06-07 14:18:06,104] [   DEBUG] - should_save_model_state       : True
[2024-06-07 14:18:06,104] [   DEBUG] - should_save_sharding_stage1_model: False
[2024-06-07 14:18:06,104] [   DEBUG] - skip_memory_metrics           : True
[2024-06-07 14:18:06,104] [   DEBUG] - skip_profile_timer            : True
[2024-06-07 14:18:06,104] [   DEBUG] - strategy                      : dynabert+ptq
[2024-06-07 14:18:06,105] [   DEBUG] - tensor_parallel_config        : 
[2024-06-07 14:18:06,105] [   DEBUG] - tensor_parallel_degree        : -1
[2024-06-07 14:18:06,105] [   DEBUG] - tensor_parallel_rank          : 0
[2024-06-07 14:18:06,105] [   DEBUG] - to_static                     : False
[2024-06-07 14:18:06,105] [   DEBUG] - train_batch_size              : 32
[2024-06-07 14:18:06,105] [   DEBUG] - unified_checkpoint            : False
[2024-06-07 14:18:06,105] [   DEBUG] - unified_checkpoint_config     : 
[2024-06-07 14:18:06,105] [   DEBUG] - use_hybrid_parallel           : False
[2024-06-07 14:18:06,105] [   DEBUG] - use_pact                      : True
[2024-06-07 14:18:06,105] [   DEBUG] - wandb_api_key                 : None
[2024-06-07 14:18:06,106] [   DEBUG] - warmup_ratio                  : 0.1
[2024-06-07 14:18:06,106] [   DEBUG] - warmup_steps                  : 0
[2024-06-07 14:18:06,106] [   DEBUG] - weight_decay                  : 0.0
[2024-06-07 14:18:06,106] [   DEBUG] - weight_name_suffix            : None
[2024-06-07 14:18:06,106] [   DEBUG] - weight_quantize_type          : channel_wise_abs_max
[2024-06-07 14:18:06,106] [   DEBUG] - width_mult_list               : None
[2024-06-07 14:18:06,106] [   DEBUG] - world_size                    : 1
[2024-06-07 14:18:06,106] [   DEBUG] - 
[2024-06-07 14:18:06,106] [    INFO] - Starting training from resume_from_checkpoint : None
/root/anaconda3/envs/paddlepaddle/lib/python3.9/site-packages/paddle/distributed/parallel.py:411: UserWarning: The program will return to single-card operation. Please check 1, whether you use spawn or fleetrun to start the program. 2, Whether it is a multi-card program. 3, Is the current environment multi-card.
  warnings.warn(
[2024-06-07 14:18:06,108] [    INFO] - [timelog] checkpoint loading time: 0.00s (2024-06-07 14:18:06) 
[2024-06-07 14:18:06,108] [    INFO] - ***** Running training *****
[2024-06-07 14:18:06,108] [    INFO] -   Num examples = 108
[2024-06-07 14:18:06,109] [    INFO] -   Num Epochs = 100
[2024-06-07 14:18:06,109] [    INFO] -   Instantaneous batch size per device = 32
[2024-06-07 14:18:06,109] [    INFO] -   Total train batch size (w. parallel, distributed & accumulation) = 32
[2024-06-07 14:18:06,109] [    INFO] -   Gradient Accumulation steps = 1
[2024-06-07 14:18:06,109] [    INFO] -   Total optimization steps = 400
[2024-06-07 14:18:06,109] [    INFO] -   Total num train samples = 10,800
[2024-06-07 14:18:06,111] [   DEBUG] -   Number of trainable parameters = 75,416,834 (per device)
TrainProcess:   0%|▌ | 1/400 [00:00<03:33,  1.87it/s]
[2024-06-07 14:18:06,713] [    INFO] - ***** Running Evaluation *****
[2024-06-07 14:18:06,713] [    INFO] -   Num examples = 0
[2024-06-07 14:18:06,713] [    INFO] -   Total prediction steps = 0
[2024-06-07 14:18:06,714] [    INFO] -   Pre device batch size = 32
[2024-06-07 14:18:06,714] [    INFO] -   Total Batch size = 32
[2024-06-07 14:18:06,716] [    INFO] - eval_runtime: 0.0029, eval_samples_per_second: 0.0, eval_steps_per_second: 0.0, progress_or_epoch: 1.0
[2024-06-07 14:18:06,717] [ WARNING] - early stopping required metric_for_best_model, but did not find eval_accuracy so early stopping is disabled
[2024-06-07 14:18:06,717] [    INFO] - Saving model checkpoint to checkpoint/checkpoint-4
[2024-06-07 14:18:06,718] [    INFO] - tokenizer config file saved in checkpoint/checkpoint-4/tokenizer_config.json
[2024-06-07 14:18:06,718] [    INFO] - Special tokens file saved in checkpoint/checkpoint-4/special_tokens_map.json
[2024-06-07 14:18:06,723] [    INFO] - Configuration saved in checkpoint/checkpoint-4/config.json
[2024-06-07 14:18:08,455] [    INFO] - Model weights saved in checkpoint/checkpoint-4/model_state.pdparams
[2024-06-07 14:18:08,455] [    INFO] - Saving optimizer files.
Traceback (most recent call last):
  File "/2T/Langchain-Ch/vanna2/modeltrain/PaddleNLP/applications/text_classification/multi_class/train.py", line 230, in <module>
    main()
  File "/2T/Langchain-Ch/vanna2/modeltrain/PaddleNLP/applications/text_classification/multi_class/train.py", line 180, in main
    train_result = trainer.train()
  File "/root/anaconda3/envs/paddlepaddle/lib/python3.9/site-packages/paddlenlp/trainer/trainer.py", line 768, in train
    return self._inner_training_loop(
  File "/root/anaconda3/envs/paddlepaddle/lib/python3.9/site-packages/paddlenlp/trainer/trainer.py", line 1089, in _inner_training_loop
    self._maybe_log_save_evaluate(tr_loss, model, epoch, ignore_keys_for_eval, inputs=inputs)
  File "/root/anaconda3/envs/paddlepaddle/lib/python3.9/site-packages/paddlenlp/trainer/trainer.py", line 1318, in _maybe_log_save_evaluate
    self._save_checkpoint(model, metrics=metrics)
  File "/root/anaconda3/envs/paddlepaddle/lib/python3.9/site-packages/paddlenlp/trainer/trainer.py", line 2170, in _save_checkpoint
    metric_value = metrics[metric_to_check]
KeyError: 'eval_accuracy'
TrainProcess:   1%|██▎ | 4/400 [00:06<11:01,  1.67s/it]
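
The log above shows the evaluation running with `Num examples = 0`, an eval line that lists only runtime and throughput values, and then a failure at `metrics[metric_to_check]`. The sketch below reproduces that lookup in plain Python to show why it fails. It is an illustration, not PaddleNLP's code: `resolve_metric_key` is a hypothetical helper that mirrors how the warning and the traceback refer to `eval_accuracy` even though the flag was `--metric_for_best_model accuracy`.

```python
def resolve_metric_key(metric_for_best_model):
    """Hypothetical helper: prefix the metric name with 'eval_' if needed."""
    if metric_for_best_model.startswith("eval_"):
        return metric_for_best_model
    return f"eval_{metric_for_best_model}"


# Keys copied from the eval line in the log; no accuracy entry was reported.
metrics = {
    "eval_runtime": 0.0029,
    "eval_samples_per_second": 0.0,
    "eval_steps_per_second": 0.0,
}

key = resolve_metric_key("accuracy")

try:
    metric_value = metrics[key]
except KeyError as err:
    missing = err.args[0]
    print(f"KeyError: {missing!r}")  # prints: KeyError: 'eval_accuracy'
```

If this reading is right, the error is a symptom of the evaluation producing no accuracy value, so the place to look is why the evaluation set came out empty rather than the checkpoint-saving code.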

PaddlePaddle / PaddleNLP

[Question]: Fine-tuning with text classification fails with KeyError: 'eval_accuracy' #8566