Open zyl535544721 opened 3 years ago
I trained on two GPUs, doubled the batch size per GPU, and tuned the lr; with that I was able to reproduce the ACE2005 results.
Could you share the specific parameters? Thanks.
Everything else uses the original parameters from the ace05.sh script provided in the repo.
Can you share the specific parameters? Thanks!
accumulate_grad_batches: 1
adam_epsilon: 1.0e-08
amp_backend: native
amp_level: O2
auto_lr_find: false
auto_scale_batch_size: false
auto_select_gpus: false
batch_size: 16
benchmark: false
bert_config_dir: /path/to/mrc-for-flat-nested-ner-master/bert-large-uncased
bert_dropout: 0.1
check_val_every_n_epoch: 1
checkpoint_callback: true
chinese: false
data_dir: /path/to/mrc-for-flat-nested-ner-master/data/ace2005
default_root_dir: /path/to/mrc-for-flat-nested-ner-master/train_logs/ace2005/ace2005_reproduce
deterministic: false
dice_smooth: 1.0e-08
distributed_backend: ddp
early_stop_callback: false
fast_dev_run: false
final_div_factor: 10000.0
flat: false
gpus: 0,1
gradient_clip_val: 1.0
limit_test_batches: 1.0
limit_train_batches: 1.0
limit_val_batches: 1.0
log_gpu_memory: null
log_save_interval: 100
logger: true
loss_type: bce
lr: 1.0e-05
max_epochs: 20
max_length: 128
max_steps: null
min_epochs: 1
min_steps: null
mrc_dropout: 0.4
num_nodes: 1
num_processes: 1
num_sanity_val_steps: 2
optimizer: adamw
overfit_batches: 0.0
overfit_pct: null
precision: 16
prepare_data_per_node: true
pretrained_checkpoint: ''
process_position: 0
profiler: null
progress_bar_refresh_rate: 1
reload_dataloaders_every_epoch: false
replace_sampler_ddp: true
resume_from_checkpoint: null
row_log_interval: 50
span_loss_candidates: pred_and_gold
sync_batchnorm: false
terminate_on_nan: false
test_percent_check: null
track_grad_norm: -1
train_percent_check: null
truncated_bptt_steps: null
val_check_interval: 0.25
val_percent_check: null
warmup_steps: 0
weight_decay: 0.01
weight_end: 1.0
weight_span: 0.1
weight_start: 1.0
weights_save_path: null
weights_summary: top
workers: 0
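For anyone comparing runs, it may help to work out the effective batch size implied by the config above. The sketch below assumes that `batch_size` is the per-GPU value (as the earlier comment about "batch size per gpu" suggests) and that under `ddp` each GPU process consumes its own batch per step. The helper `effective_batch_size` is a hypothetical name used only for illustration; it is not part of the repo.

```python
def effective_batch_size(per_gpu_batch_size: int,
                         num_gpus: int,
                         accumulate_grad_batches: int = 1) -> int:
    """Number of samples contributing to one optimizer step.

    Assumes data-parallel training where every GPU process
    draws its own batch of `per_gpu_batch_size` samples.
    """
    return per_gpu_batch_size * num_gpus * accumulate_grad_batches


# Values copied from the posted config:
# batch_size: 16, gpus: 0,1, accumulate_grad_batches: 1
gpus = "0,1"
num_gpus = len([g for g in gpus.split(",") if g.strip()])

print(effective_batch_size(16, num_gpus, 1))  # 32
```

Under these assumptions the posted run takes optimizer steps over 32 samples, so a single-GPU run with the same `batch_size` and `lr` would not be directly comparable.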
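Using your method, I only reached 0.7983 on ACE2005, about 7 points below the result reported in your paper. I am not sure whether something went wrong on my side or whether it is a hyperparameter issue. I look forward to your reply.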