Closed — gaoyuan211 closed this issue 2 years ago
The full shell output from running the script is as follows: sudo bash scripts/run_dureader.sh
Output of the jog.0 log:
args.is_distributed: True
worker_endpoints:['127.0.1.1:6170'] trainers_num:1 current_endpoint:127.0.1.1:6170 trainer_id:0
Device count 1, trainer_id:0
args.vocab_path ./configs/base/zh/vocab.txt
Traceback (most recent call last):
File "run_mrc.py", line 322, in
The pretraining data specified in the log has a .json suffix (test.json), but the files generated per the README are .txt. These two seem to conflict, and I don't know how to resolve it.
http://ai.stanford.edu/~amaas/data/sentiment/index.html
python multi_files_to_one.py # this will generate train/test txt
This generates train.txt and test.txt in that folder.
stream_job: None test_set: ./data/imdb//test.json tokenizer: FullTokenizer train_all: False train_set: ./data/imdb//train.json use_amp: False use_cuda: True — the above are the file paths from the log.
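Since the README pipeline produces train.txt/test.txt while the logged config points at train.json/test.json, one possible workaround is to wrap each text line into a JSON-lines file at the expected path. This is only a sketch: the field name "text" and the one-example-per-line layout are assumptions, so check what run_mrc.py's data reader actually parses before relying on it.

```python
# Hypothetical workaround: convert the README-generated *.txt files into the
# *.json paths the script expects. The record schema ({"text": ...}) is an
# assumption, not taken from the project's reader.
import json
from pathlib import Path

def txt_to_jsonl(txt_path: str, json_path: str, field: str = "text") -> int:
    """Convert a plain-text file (one example per line) to JSON lines.

    Returns the number of non-empty lines written.
    """
    count = 0
    with open(txt_path, encoding="utf-8") as fin, \
         open(json_path, "w", encoding="utf-8") as fout:
        for line in fin:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            fout.write(json.dumps({field: line}, ensure_ascii=False) + "\n")
            count += 1
    return count

if __name__ == "__main__":
    # Paths mirror the ones in the log above.
    for split in ("train", "test"):
        src = Path(f"./data/imdb/{split}.txt")
        dst = Path(f"./data/imdb/{split}.json")
        if src.exists():
            n = txt_to_jsonl(str(src), str(dst))
            print(f"{src} -> {dst}: {n} examples")
```

If the reader turns out to expect a different schema (e.g. separate label and text fields), only the `json.dumps` line needs to change.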
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reopen it. Thank you for your contributions.
I have set up the environment and run up to the point shown in the two screenshots below, but cannot get past this step. Please help — many thanks! Script executed: sudo bash scripts/run_dureader.sh
The output log is: ----------- Configuration Arguments ----------- current_node_ip: 127.0.1.1 node_id: 0 node_ips: 127.0.1.1 nproc_per_node: 4 print_config: True selected_gpus: 0,1,2,3 split_log_path: log training_script: run_mrc.py training_script_args: ['--use_cuda', 'true', '--is_distributed', 'true', '--batch_size', '16', '--in_tokens', 'false', '--use_fast_executor', 'true', '--checkpoints', './output', '--vocab_path', './configs/base/zh/vocab.txt', '--do_train', 'true', '--do_val', 'true', '--do_test', 'false', '--save_steps', '10000', '--validation_steps', '100', '--warmup_proportion', '0.1', '--weight_decay', '0.01', '--epoch', '5', '--max_seq_len', '512', '--ernie_config_path', './configs/base/zh/ernie_config.json', '--do_lower_case', 'true', '--doc_stride', '128', '--train_set', './data/finetune/task_data/dureader//train.json', '--dev_set', './data/finetune/task_data/dureader//dev.json', '--test_set', './data/finetune/task_data/dureader//test.json', '--learning_rate', '2.75e-4', '--num_iteration_per_drop_scope', '1', '--lr_scheduler', 'linear_warmup_decay', '--layer_decay_ratio', '0.8', '--is_zh', 'True', '--repeat_input', 'False', '--train_all', 'Fasle', '--eval_all', 'False', '--use_vars', 'False', '--init_checkpoint', '', '--init_pretraining_params', '', '--init_loss_scaling', '32768', '--use_recompute', 'False', '--skip_steps', '10']
all_trainer_endpoints: 127.0.1.1:6170,127.0.1.1:6171,127.0.1.1:6172,127.0.1.1:6173 , node_id: 0 , current_ip: 127.0.1.1 , num_nodes: 1 , node_ips: ['127.0.1.1'] , gpus_per_proc: 1 , selected_gpus_per_proc: [['0'], ['1'], ['2'], ['3']] , nranks: 4
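For reference, the endpoint list in the line above appears to be derived from node_ips and nproc_per_node by assigning one port per process starting at a base port (6170 here). This is a sketch that reproduces the logged values; the actual launcher logic in the project may differ.

```python
# Assumed derivation of all_trainer_endpoints: one "ip:port" pair per local
# process, ports counting up from a base port on each node. This mirrors the
# log output (6170..6173 on 127.0.1.1) but is not taken from the real launcher.
def build_endpoints(node_ips, nproc_per_node, base_port=6170):
    endpoints = []
    for ip in node_ips:
        for i in range(nproc_per_node):
            endpoints.append(f"{ip}:{base_port + i}")
    return endpoints

print(",".join(build_endpoints(["127.0.1.1"], 4)))
# -> 127.0.1.1:6170,127.0.1.1:6171,127.0.1.1:6172,127.0.1.1:6173
```

Note that the jog.0 log above reports trainers_num:1 with a single endpoint, while the launcher config requests nranks: 4, which is worth comparing when debugging.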