PaddlePaddle / RocketQA

🚀 RocketQA, dense retrieval for information retrieval and question answering, including both Chinese and English state-of-the-art models.
Apache License 2.0

Training terminates after "Load pretraining parameters from dureader_cross_encoder" #80

Closed yslu-TW closed 1 year ago

yslu-TW commented 1 year ago

After the log line "Load pretraining parameters from dureader_cross_encoder", the program exits. The dureader_cross_encoder model has been downloaded.


Code

import paddle
import rocketqa
cross_encoder = rocketqa.load_model(model="zh_dureader_ce", use_cuda=True, device_id=1, batch_size=16)
cross_encoder.train('C:/ChaseAI/Chatbot/RocketQA-main/examples/data/cross.train.tsv', 2, 'ce_models', save_steps=1000, learning_rate=1e-5, log_folder='log_ce')

log

W1227 08:33:52.983285 19932 gpu_resources.cc:61] Please NOTE: device: 1, GPU Compute Capability: 7.0, Driver API Version: 11.3, Runtime API Version: 11.2

W1227 08:33:57.905314 19932 gpu_resources.cc:91] device: 1, cuDNN Version: 8.2.
[INFO] 2022-12-26 11:26:31,443 [     args.py:   69]:    -----------  Configuration Arguments -----------
[INFO] 2022-12-26 11:26:31,443 [     args.py:   71]:    batch_size: 16
[INFO] 2022-12-26 11:26:31,443 [     args.py:   71]:    checkpoints: checkpoints
[INFO] 2022-12-26 11:26:31,443 [     args.py:   71]:    chunk_scheme: IOB
[INFO] 2022-12-26 11:26:31,443 [     args.py:   71]:    decr_every_n_nan_or_inf: 2
[INFO] 2022-12-26 11:26:31,443 [     args.py:   71]:    decr_ratio: 0.8
[INFO] 2022-12-26 11:26:31,443 [     args.py:   71]:    dev_set: None
[INFO] 2022-12-26 11:26:31,443 [     args.py:   71]:    diagnostic: None
[INFO] 2022-12-26 11:26:31,459 [     args.py:   71]:    diagnostic_save: None
[INFO] 2022-12-26 11:26:31,459 [     args.py:   71]:    do_lower_case: True
[INFO] 2022-12-26 11:26:31,459 [     args.py:   71]:    do_test: True
[INFO] 2022-12-26 11:26:31,459 [     args.py:   71]:    do_train: False
[INFO] 2022-12-26 11:26:31,459 [     args.py:   71]:    do_val: False
[INFO] 2022-12-26 11:26:31,459 [     args.py:   71]:    doc_stride: 128
[INFO] 2022-12-26 11:26:31,459 [     args.py:   71]:    enable_ce: False
[INFO] 2022-12-26 11:26:31,459 [     args.py:   71]:    epoch: 2
[INFO] 2022-12-26 11:26:31,459 [     args.py:   71]:    ernie_config_path: C:\Users\Administrator/.rocketqa/zh_dureader_ce/zh_config.json
[INFO] 2022-12-26 11:26:31,459 [     args.py:   71]:    for_cn: True
[INFO] 2022-12-26 11:26:31,459 [     args.py:   71]:    in_tokens: False
[INFO] 2022-12-26 11:26:31,459 [     args.py:   71]:    incr_every_n_steps: 100
[INFO] 2022-12-26 11:26:31,459 [     args.py:   71]:    incr_ratio: 2.0
[INFO] 2022-12-26 11:26:31,474 [     args.py:   71]:    init_checkpoint: C:\Users\Administrator/.rocketqa/zh_dureader_ce/dureader_cross_encoder
[INFO] 2022-12-26 11:26:31,474 [     args.py:   71]:    init_loss_scaling: 102400
[INFO] 2022-12-26 11:26:31,474 [     args.py:   71]:    init_pretraining_params: None
[INFO] 2022-12-26 11:26:31,474 [     args.py:   71]:    is_classify: True
[INFO] 2022-12-26 11:26:31,474 [     args.py:   71]:    is_distributed: False
[INFO] 2022-12-26 11:26:31,474 [     args.py:   71]:    is_regression: False
[INFO] 2022-12-26 11:26:31,474 [     args.py:   71]:    label_map_config: None
[INFO] 2022-12-26 11:26:31,474 [     args.py:   71]:    learning_rate: 1e-05
[INFO] 2022-12-26 11:26:31,474 [     args.py:   71]:    log_folder: log_ce
[INFO] 2022-12-26 11:26:31,474 [     args.py:   71]:    lr_scheduler: linear_warmup_decay
[INFO] 2022-12-26 11:26:31,474 [     args.py:   71]:    max_answer_length: 100
[INFO] 2022-12-26 11:26:31,490 [     args.py:   71]:    max_query_length: 64
[INFO] 2022-12-26 11:26:31,490 [     args.py:   71]:    max_seq_len: 384
[INFO] 2022-12-26 11:26:31,490 [     args.py:   71]:    metric: simple_accuracy
[INFO] 2022-12-26 11:26:31,490 [     args.py:   71]:    metrics: True
[INFO] 2022-12-26 11:26:31,490 [     args.py:   71]:    model_name: zh_dureader_ce
[INFO] 2022-12-26 11:26:31,490 [     args.py:   71]:    n_best_size: 20
[INFO] 2022-12-26 11:26:31,490 [     args.py:   71]:    num_iteration_per_drop_scope: 10
[INFO] 2022-12-26 11:26:31,490 [     args.py:   71]:    num_labels: 2
[INFO] 2022-12-26 11:26:31,490 [     args.py:   71]:    output_file_name: None
[INFO] 2022-12-26 11:26:31,490 [     args.py:   71]:    output_item: 3
[INFO] 2022-12-26 11:26:31,490 [     args.py:   71]:    p_max_seq_len: 256
[INFO] 2022-12-26 11:26:31,490 [     args.py:   71]:    predict_batch_size: None
[INFO] 2022-12-26 11:26:31,490 [     args.py:   71]:    q_max_seq_len: 32
[INFO] 2022-12-26 11:26:31,490 [     args.py:   71]:    random_seed: None
[INFO] 2022-12-26 11:26:31,490 [     args.py:   71]:    save_model_path: ce_models
[INFO] 2022-12-26 11:26:31,505 [     args.py:   71]:    save_steps: 1000
[INFO] 2022-12-26 11:26:31,505 [     args.py:   71]:    shuffle: True
[INFO] 2022-12-26 11:26:31,505 [     args.py:   71]:    skip_steps: 100
[INFO] 2022-12-26 11:26:31,505 [     args.py:   71]:    task_id: 0
[INFO] 2022-12-26 11:26:31,505 [     args.py:   71]:    test_data_cnt: 1110000
[INFO] 2022-12-26 11:26:31,505 [     args.py:   71]:    test_save: ./checkpoints/test_result
[INFO] 2022-12-26 11:26:31,505 [     args.py:   71]:    test_set: None
[INFO] 2022-12-26 11:26:31,505 [     args.py:   71]:    tokenizer: FullTokenizer
[INFO] 2022-12-26 11:26:31,505 [     args.py:   71]:    train_data_size: 0
[INFO] 2022-12-26 11:26:31,505 [     args.py:   71]:    train_set: C:/ChaseAI/Chatbot/RocketQA-main/examples/data/cross_train_test.tsv
[INFO] 2022-12-26 11:26:31,521 [     args.py:   71]:    use_cross_batch: False
[INFO] 2022-12-26 11:26:31,521 [     args.py:   71]:    use_cuda: True
[INFO] 2022-12-26 11:26:31,521 [     args.py:   71]:    use_dynamic_loss_scaling: True
[INFO] 2022-12-26 11:26:31,521 [     args.py:   71]:    use_fast_executor: True
[INFO] 2022-12-26 11:26:31,521 [     args.py:   71]:    use_lamb: False
[INFO] 2022-12-26 11:26:31,521 [     args.py:   71]:    use_mix_precision: False
[INFO] 2022-12-26 11:26:31,521 [     args.py:   71]:    use_multi_gpu_test: False
[INFO] 2022-12-26 11:26:31,521 [     args.py:   71]:    use_recompute: False
[INFO] 2022-12-26 11:26:31,521 [     args.py:   71]:    validation_steps: 1000
[INFO] 2022-12-26 11:26:31,521 [     args.py:   71]:    verbose: False
[INFO] 2022-12-26 11:26:31,521 [     args.py:   71]:    vocab_path: C:\Users\Administrator/.rocketqa/zh_dureader_ce/zh_vocab.txt
[INFO] 2022-12-26 11:26:31,521 [     args.py:   71]:    warmup_proportion: 0.1
[INFO] 2022-12-26 11:26:31,537 [     args.py:   71]:    weight_decay: 0.01
[INFO] 2022-12-26 11:26:31,537 [     args.py:   72]:    ------------------------------------------------
[INFO] 2022-12-26 11:26:31,568 [reader_ce_train.py:  244]:  apply sharding 0/1
[INFO] 2022-12-26 11:26:31,584 [cross_encoder.py:  238]:    Device count: 1
[INFO] 2022-12-26 11:26:31,584 [cross_encoder.py:  239]:    Num train examples: 30
[INFO] 2022-12-26 11:26:31,584 [cross_encoder.py:  240]:    Max train steps: 3
[INFO] 2022-12-26 11:26:31,584 [cross_encoder.py:  241]:    Num warmup steps: 0
[INFO] 2022-12-26 11:26:31,584 [cross_encoder.py:  242]:    Learning rate: 0.000010
[WARNING] 2022-12-26 11:26:31,584 [       io.py:  721]: paddle.fluid.layers.py_reader() may be deprecated in the near future. Please use paddle.fluid.io.DataLoader.from_generator() instead.
[INFO] 2022-12-26 11:26:36,270 [     init.py:   74]:    Load pretraining parameters from C:\Users\Administrator/.rocketqa/zh_dureader_ce/dureader_cross_encoder.
yslu-TW commented 1 year ago

In the version installed with pip install rocketqa==1.1.0, cross_encoder.py is missing the lines that save config.json. The version on GitHub has them:

save_path = os.path.join(args.save_model_path, "step_" + str(steps))
fluid.io.save_persistables(self.exe, save_path, train_program)
config_save_path = os.path.join(args.save_model_path, "config.json")
json.dump(self.config_dict, open(config_save_path, "w"))
shutil.copy(args.ernie_config_path, args.save_model_path)
shutil.copy(args.vocab_path, args.save_model_path)
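For anyone patching a local install by hand, here is a minimal standalone sketch of what those lines do. The function name `save_model_metadata`, its parameters, and the demo file contents are hypothetical and are not part of RocketQA's API. It only mirrors the config dump and the two file copies quoted above, and it leaves out the `fluid.io.save_persistables` call.

```python
import json
import os
import shutil
import tempfile


def save_model_metadata(config_dict, ernie_config_path, vocab_path, save_model_path):
    """Write config.json and copy the ERNIE config and vocab into save_model_path.

    Hypothetical helper: mirrors the quoted lines, minus the parameter save.
    """
    os.makedirs(save_model_path, exist_ok=True)
    config_save_path = os.path.join(save_model_path, "config.json")
    # A context manager makes sure the file is flushed and closed.
    with open(config_save_path, "w", encoding="utf-8") as f:
        json.dump(config_dict, f)
    shutil.copy(ernie_config_path, save_model_path)
    shutil.copy(vocab_path, save_model_path)
    return config_save_path


if __name__ == "__main__":
    # Demo with throwaway files; names and contents are made up for illustration.
    with tempfile.TemporaryDirectory() as tmp:
        ernie_cfg = os.path.join(tmp, "zh_config.json")
        vocab = os.path.join(tmp, "zh_vocab.txt")
        with open(ernie_cfg, "w", encoding="utf-8") as f:
            f.write("{}")
        with open(vocab, "w", encoding="utf-8") as f:
            f.write("[PAD]\n")
        out_dir = os.path.join(tmp, "ce_models")
        save_model_metadata({"example_key": "example_value"}, ernie_cfg, vocab, out_dir)
        print(sorted(os.listdir(out_dir)))
```

The demo should leave three files in the output folder: config.json plus the copied config and vocab files.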

Also, no message is shown when training finishes. You can log the exception to see what happened:

except fluid.core.EOFException as eof:
    log.info("error: %s" % eof)

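In addition, you can make skip_steps configurable so that progress is logged and you can tell that training is running normally. In cross_encoder.py, add the following to _parse_train_args: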

if "skip_steps" in config_dict:
    self.args.skip_steps = config_dict['skip_steps']
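The same pattern, copying an optional key from a user-supplied dict onto an args object and otherwise keeping the default, can be sketched on its own. The helper name `apply_train_overrides` and the use of `SimpleNamespace` are assumptions for illustration. The starting values 100 and 1000 are the skip_steps and save_steps values shown in the log above.

```python
from types import SimpleNamespace


def apply_train_overrides(args, config_dict, keys=("skip_steps", "save_steps")):
    """Copy selected optional keys from config_dict onto args; others keep their values."""
    for key in keys:
        if key in config_dict:
            setattr(args, key, config_dict[key])
    return args


# Values taken from the log above; the args object itself is a stand-in.
args = SimpleNamespace(skip_steps=100, save_steps=1000)
apply_train_overrides(args, {"skip_steps": 10})
print(args.skip_steps, args.save_steps)  # prints: 10 1000
```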

In example.py:

cross_encoder.train(train_set, 2, 'ce_qa_models', save_steps=10, skip_steps=10, learning_rate=1e-5, log_folder='ce_qa_log')
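The numbers in the original log suggest why the run ended silently. With 30 training examples, batch size 16, and 2 epochs, the log reports "Max train steps: 3", while skip_steps is 100 and save_steps is 1000. The sketch below works through that arithmetic. The step formula and the "act when the step is a multiple of the interval" rule are assumptions. The formula reproduces the logged value of 3, but neither has been checked against the RocketQA source.

```python
def max_train_steps(num_examples, batch_size, epochs, device_count=1):
    # Assumed formula; it reproduces "Max train steps: 3" from the log.
    return epochs * num_examples // batch_size // device_count


def steps_hitting(interval, total_steps):
    """Return the steps in 1..total_steps that are multiples of interval."""
    return [s for s in range(1, total_steps + 1) if s % interval == 0]


total = max_train_steps(num_examples=30, batch_size=16, epochs=2)
print(total)                        # 3
print(steps_hitting(100, total))    # [] -> no progress line at skip_steps=100
print(steps_hitting(1000, total))   # [] -> no checkpoint at save_steps=1000
print(steps_hitting(1, total))      # [1, 2, 3]
```

Under these assumptions, a run of this size would also finish before reaching an interval of 10, so intervals need to be chosen relative to the total step count of the dataset actually being trained on.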