Some weights of the model checkpoint at /home/ma-user/work/bert-base-chinese were not used when initializing BertForQueryNER: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
This IS expected if you are initializing BertForQueryNER from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
This IS NOT expected if you are initializing BertForQueryNER from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForQueryNER were not initialized from the model checkpoint at /home/ma-user/work/bert-base-chinese and are newly initialized: ['start_outputs.dense_layer.weight', 'start_outputs.dense_layer.bias', 'start_outputs.dense_to_labels_layer.weight', 'start_outputs.dense_to_labels_layer.bias', 'end_outputs.dense_layer.weight', 'end_outputs.dense_layer.bias', 'end_outputs.dense_to_labels_layer.weight', 'end_outputs.dense_to_labels_layer.bias', 'answerable_cls_output.dense_layer.weight', 'answerable_cls_output.dense_layer.bias', 'answerable_cls_output.dense_to_labels_layer.weight', 'answerable_cls_output.dense_to_labels_layer.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
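These two messages are expected here: `from_pretrained` drops the checkpoint's pre-training heads (`cls.*`) and randomly initializes the new span/answerable heads. As a hedged illustration, loading the same base checkpoint into any stock task class of `transformers` prints the same pair of warnings (`num_labels` below is just an illustrative value):

```python
from transformers import BertForTokenClassification

# A sketch reproducing the same kind of warnings: the MLM/NSP heads
# ('cls.*') stored in the bert-base-chinese checkpoint are discarded,
# and the new token-classification head is freshly initialized.
model = BertForTokenClassification.from_pretrained(
    "/home/ma-user/work/bert-base-chinese",  # path from the log
    num_labels=2,  # illustrative value
)
```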
/opt/conda/lib/python3.7/site-packages/pytorch_lightning/utilities/distributed.py:37: UserWarning: Checkpoint directory /home/ma-user/work/dice_loss_for_NLP-master/output/dice_loss/mrc_ner/reproduce_zhonto_dice_base_8_300_2e-5_polydecay_0.1_2_5_1.0_0.002_0.1_1_1_0.3_dice_1_0.3_0.01 exists and is not empty with save_top_k != 0. All files in this directory will be deleted when a checkpoint is saved!
warnings.warn(*args, **kwargs)
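This warning is worth acting on: with `save_top_k != 0`, Lightning deletes pre-existing files in a non-empty checkpoint directory. A minimal sketch of pointing the checkpoint callback at a fresh directory (in this Lightning generation the target is passed as `filepath`; newer releases use `dirpath`; the path below is hypothetical):

```python
from pytorch_lightning.callbacks import ModelCheckpoint

# Track the top 3 checkpoints by validation F1, writing them to an
# empty directory so no pre-existing files get deleted.
checkpoint_callback = ModelCheckpoint(
    filepath="output/dice_loss/mrc_ner/fresh_run/{epoch}",  # hypothetical path
    monitor="val_f1",
    mode="max",
    save_top_k=3,
)
```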
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
CUDA_VISIBLE_DEVICES: [0]
Using native 16bit precision.
/opt/conda/lib/python3.7/site-packages/pytorch_lightning/utilities/distributed.py:37: UserWarning: Could not log computational graph since the model.example_input_array attribute is not set or input_array was not given
warnings.warn(*args, **kwargs)
  | Name              | Type            | Params
0 | model             | BertForQueryNER | 104 M
1 | evaluation_metric | MRCNERSpanF1    | 0
/opt/conda/lib/python3.7/site-packages/pytorch_lightning/utilities/distributed.py:37: UserWarning: The dataloader, val dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the num_workers argument (try 72 which is the number of cpus on this machine) in the DataLoader init to improve performance.
warnings.warn(*args, **kwargs)
Validation sanity check: 0it [00:00, ?it/s]
Truncation was not explicitly activated but max_length is provided a specific value, please use truncation=True to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to truncation.
/opt/conda/lib/python3.7/site-packages/pytorch_lightning/utilities/distributed.py:37: UserWarning: The dataloader, train dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the num_workers argument (try 72 which is the number of cpus on this machine) in the DataLoader init to improve performance.
warnings.warn(*args, **kwargs)
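Both warnings above are benign and easy to silence. A hedged sketch (the dataset and text are placeholders; 8 and 300 mirror the batch size and max length encoded in the output directory name):

```python
from torch.utils.data import DataLoader
from transformers import BertTokenizer

# 1) Give the DataLoader worker processes so data loading stops being
#    a potential bottleneck (the warning suggests up to 72 on this machine).
train_loader = DataLoader(list(range(100)), batch_size=8, num_workers=8)

# 2) Pass truncation=True explicitly whenever max_length is set, so the
#    tokenizer does not fall back to the implicit 'longest_first' default.
tokenizer = BertTokenizer.from_pretrained("/home/ma-user/work/bert-base-chinese")
encoded = tokenizer("placeholder text", max_length=300, truncation=True)
```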
Epoch 0: 25%|██████████████████████████████▍ | 19168/76678 [09:30<28:32, 33.58it/s, loss=0.318, v_num=4]
Epoch 00000: val_f1 reached 0.69758 (best 0.69758), saving model to /home/ma-user/work/dice_loss_for_NLP-master/output/dice_loss/mrc_ner/reproduce_zhonto_dice_base_8_300_2e-5_polydecay_0.1_2_5_1.0_0.002_0.1_1_1_0.3_dice_1_0.3_0.01/epoch=0.ckpt as top 3
/opt/conda/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:216: UserWarning: Please also save or load the state of the optimizer when saving or loading the scheduler.
warnings.warn(SAVE_STATE_WARNING, UserWarning)
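The scheduler warning is a reminder rather than an error: a resumed run should restore optimizer and scheduler state together. A generic sketch of what it asks for (not this repo's code; the tiny model is a stand-in):

```python
import torch

# Stand-in model/optimizer/scheduler just to make the sketch runnable.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.1)

# Save optimizer and scheduler state side by side...
torch.save(
    {"optimizer": optimizer.state_dict(), "scheduler": scheduler.state_dict()},
    "optim_state.pt",
)

# ...and restore both when resuming, which is what the warning asks for.
state = torch.load("optim_state.pt")
optimizer.load_state_dict(state["optimizer"])
scheduler.load_state_dict(state["scheduler"])
```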
Epoch 0: 50%|████████████████████████████████████████████████████████████▉ | 38336/76678 [19:07<19:07, 33.41it/s, loss=0.297, v_num=4]
Epoch 00000: val_f1 reached 0.72641 (best 0.72641), saving model to /home/ma-user/work/dice_loss_for_NLP-master/output/dice_loss/mrc_ner/reproduce_zhonto_dice_base_8_300_2e-5_polydecay_0.1_2_5_1.0_0.002_0.1_1_1_0.3_dice_1_0.3_0.01/epoch=0_v0.ckpt as top 3
Epoch 0: 75%|███████████████████████████████████████████████████████████████████████████████████████████▍                              | 57504/76678 [28:56<09:38, 33.12it/s, loss=0.210, v_num=4]
Epoch 00000: val_f1 reached 0.76375 (best 0.76375), saving model to /home/ma-user/work/dice_loss_for_NLP-master/output/dice_loss/mrc_ner/reproduce_zhonto_dice_base_8_300_2e-5_polydecay_0.1_2_5_1.0_0.002_0.1_1_1_0.3_dice_1_0.3_0.01/epoch=0_v1.ckpt as top 3
Epoch 0: 92%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎ | 70608/76678 [36:58<03:10, 31.83it/s, loss=0.296, v_num=4]
Epoch 0: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉| 76672/76678 [38:33<00:00, 33.15it/s, loss=0.296, v_num=4]
Epoch 00000: val_f1 reached 0.73197 (best 0.76375), saving model to /home/ma-user/work/dice_loss_for_NLP-master/output/dice_loss/mrc_ner/reproduce_zhonto_dice_base_8_300_2e-5_polydecay_0.1_2_5_1.0_0.002_0.1_1_1_0.3_dice_1_0.3_0.01/epoch=0_v2.ckpt as top 3
Epoch 1: 25%|██████████████████████████████▍                                                                                           | 19168/76678 [09:16<27:50, 34.43it/s, loss=0.125, v_num=4]
Epoch 00001: val_f1 reached 0.75373 (best 0.76375), saving model to /home/ma-user/work/dice_loss_for_NLP-master/output/dice_loss/mrc_ner/reproduce_zhonto_dice_base_8_300_2e-5_polydecay_0.1_2_5_1.0_0.002_0.1_1_1_0.3_dice_1_0.3_0.01/epoch=1.ckpt as top 3
Epoch 1: 46%|███████████████████████████████████████████████████████▉ | 35128/76678 [17:49<21:04, 32.85it/s, loss=0.175, v_num=4]
Epoch 1: 50%|████████████████████████████████████████████████████████████▉                                                             | 38336/76678 [18:40<18:40, 34.21it/s, loss=0.175, v_num=4]
Epoch 00001: val_f1 reached 0.75231 (best 0.76375), saving model to /home/ma-user/work/dice_loss_for_NLP-master/output/dice_loss/mrc_ner/reproduce_zhonto_dice_base_8_300_2e-5_polydecay_0.1_2_5_1.0_0.002_0.1_1_1_0.3_dice_1_0.3_0.01/epoch=1_v0.ckpt as top 3
Epoch 1: 75%|███████████████████████████████████████████████████████████████████████████████████████████▍ | 57504/76678 [28:08<09:22, 34.06it/s, loss=0.241, v_num=4]
Epoch 00001: val_f1 was not in top 3
Epoch 1: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉| 76672/76678 [37:23<00:00, 34.17it/s, loss=0.149, v_num=4]
Epoch 00001: val_f1 reached 0.76958 (best 0.76958), saving model to /home/ma-user/work/dice_loss_for_NLP-master/output/dice_loss/mrc_ner/reproduce_zhonto_dice_base_8_300_2e-5_polydecay_0.1_2_5_1.0_0.002_0.1_1_1_0.3_dice_1_0.3_0.01/epoch=1_v1.ckpt as top 3
Epoch 2: 25%|██████████████████████████████▍ | 19168/76678 [09:19<28:00, 34.23it/s, loss=0.159, v_num=4]
Epoch 00002: val_f1 reached 0.76353 (best 0.76958), saving model to /home/ma-user/work/dice_loss_for_NLP-master/output/dice_loss/mrc_ner/reproduce_zhonto_dice_base_8_300_2e-5_polydecay_0.1_2_5_1.0_0.002_0.1_1_1_0.3_dice_1_0.3_0.01/epoch=2.ckpt as top 3
Epoch 2: 50%|████████████████████████████████████████████████████████████▉ | 38336/76678 [18:47<18:47, 34.01it/s, loss=0.157, v_num=4]
Epoch 00002: val_f1 was not in top 3
Epoch 2: 75%|███████████████████████████████████████████████████████████████████████████████████████████▍ | 57504/76678 [28:04<09:21, 34.14it/s, loss=0.220, v_num=4]
Epoch 00002: val_f1 was not in top 3
Epoch 2: 83%|█████████████████████████████████████████████████████████████████████████████████████████████████████▎                    | 63672/76678 [34:02<06:57, 31.17it/s, loss=0.129, v_num=4]
Epoch 2: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉| 76672/76678 [37:26<00:00, 34.12it/s, loss=0.129, v_num=4]
Epoch 00002: val_f1 reached 0.78048 (best 0.78048), saving model to /home/ma-user/work/dice_loss_for_NLP-master/output/dice_loss/mrc_ner/reproduce_zhonto_dice_base_8_300_2e-5_polydecay_0.1_2_5_1.0_0.002_0.1_1_1_0.3_dice_1_0.3_0.01/epoch=2_v0.ckpt as top 3
Epoch 3: 25%|██████████████████████████████▍                                                                                           | 19168/76678 [09:21<28:04, 34.15it/s, loss=0.119, v_num=4]
Epoch 00003: val_f1 reached 0.76468 (best 0.78048), saving model to /home/ma-user/work/dice_loss_for_NLP-master/output/dice_loss/mrc_ner/reproduce_zhonto_dice_base_8_300_2e-5_polydecay_0.1_2_5_1.0_0.002_0.1_1_1_0.3_dice_1_0.3_0.01/epoch=3.ckpt as top 3
Epoch 3: 50%|████████████████████████████████████████████████████████████▉                                                             | 38336/76678 [18:51<18:51, 33.88it/s, loss=0.084, v_num=4]
Epoch 00003: val_f1 reached 0.76513 (best 0.78048), saving model to /home/ma-user/work/dice_loss_for_NLP-master/output/dice_loss/mrc_ner/reproduce_zhonto_dice_base_8_300_2e-5_polydecay_0.1_2_5_1.0_0.002_0.1_1_1_0.3_dice_1_0.3_0.01/epoch=3_v0.ckpt as top 3
Epoch 3: 75%|███████████████████████████████████████████████████████████████████████████████████████████▍                              | 57504/76678 [28:17<09:26, 33.87it/s, loss=0.065, v_num=4]
Epoch 00003: val_f1 reached 0.77975 (best 0.78048), saving model to /home/ma-user/work/dice_loss_for_NLP-master/output/dice_loss/mrc_ner/reproduce_zhonto_dice_base_8_300_2e-5_polydecay_0.1_2_5_1.0_0.002_0.1_1_1_0.3_dice_1_0.3_0.01/epoch=3.ckpt as top 3
Epoch 3: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉| 76672/76678 [37:48<00:00, 33.79it/s, loss=0.115, v_num=4]
Epoch 00003: val_f1 reached 0.77402 (best 0.78048), saving model to /home/ma-user/work/dice_loss_for_NLP-master/output/dice_loss/mrc_ner/reproduce_zhonto_dice_base_8_300_2e-5_polydecay_0.1_2_5_1.0_0.002_0.1_1_1_0.3_dice_1_0.3_0.01/epoch=3_v0.ckpt as top 3
Epoch 4: 25%|██████████████████████████████▍ | 19168/76678 [09:50<29:32, 32.45it/s, loss=0.141, v_num=4]
Epoch 00004: val_f1 was not in top 3
Epoch 4: 50%|████████████████████████████████████████████████████████████▉ | 38336/76678 [19:56<19:56, 32.05it/s, loss=0.162, v_num=4]
Epoch 00004: val_f1 reached 0.78181 (best 0.78181), saving model to /home/ma-user/work/dice_loss_for_NLP-master/output/dice_loss/mrc_ner/reproduce_zhonto_dice_base_8_300_2e-5_polydecay_0.1_2_5_1.0_0.002_0.1_1_1_0.3_dice_1_0.3_0.01/epoch=4.ckpt as top 3
Epoch 4: 75%|███████████████████████████████████████████████████████████████████████████████████████████▍ | 57504/76678 [30:48<10:16, 31.11it/s, loss=0.092, v_num=4]
Epoch 00004: val_f1 reached 0.78524 (best 0.78524), saving model to /home/ma-user/work/dice_loss_for_NLP-master/output/dice_loss/mrc_ner/reproduce_zhonto_dice_base_8_300_2e-5_polydecay_0.1_2_5_1.0_0.002_0.1_1_1_0.3_dice_1_0.3_0.01/epoch=4_v0.ckpt as top 3
Epoch 4: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉| 76672/76678 [41:29<00:00, 30.80it/s, loss=0.141, v_num=4]
Epoch 00004: val_f1 reached 0.78198 (best 0.78524), saving model to /home/ma-user/work/dice_loss_for_NLP-master/output/dice_loss/mrc_ner/reproduce_zhonto_dice_base_8_300_2e-5_polydecay_0.1_2_5_1.0_0.002_0.1_1_1_0.3_dice_1_0.3_0.01/epoch=4_v1.ckpt as top 3
Epoch 4: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 76678/76678 [41:32<00:00, 30.76it/s, loss=0.141, v_num=4]
Saving latest checkpoint..
/opt/conda/lib/python3.7/site-packages/pytorch_lightning/utilities/distributed.py:37: UserWarning: The dataloader, test dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the num_workers argument (try 72 which is the number of cpus on this machine) in the DataLoader init to improve performance.
warnings.warn(*args, **kwargs)
Testing: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉| 17380/17384 [04:51<00:00, 59.82it/s]
--------------------------------------------------------------------------------
DATALOADER:0 TEST RESULTS
{'test_span_f1': tensor(0.8080, device='cuda:0'),
'test_span_precision': tensor(0.8282, device='cuda:0'),
'test_span_recall': tensor(0.7888, device='cuda:0')}
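As a quick sanity check, the reported span F1 is indeed the harmonic mean of the span precision and recall above:

```python
# F1 = 2PR / (P + R) with the precision/recall from the test results.
p, r = 0.8282, 0.7888
print(round(2 * p * r / (p + r), 4))  # 0.808
```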
Hi, we are trying to reproduce the named entity recognition results on the zh_onto4 dataset. Following the README, we ran the scripts/ner_zhonto4/bert_dice.sh script without modifying any of its hyperparameters, but the test-set span F1 is only 80.80, which is well below the 84.47 reported in the paper. The full training log and test results are shown above; the launch command and model configuration follow below. Could you take a look at what might be going wrong? Thanks!
sh-4.3$ sh scripts/ner_zhonto4/bert_dice.sh
DEBUG INFO -> loss sign is dice_1_0.3_0.01
DEBUG INFO -> save hyperparameters
DEBUG INFO -> pred_answerable train_infer
DEBUG INFO -> check bert_config
BertForQueryNERConfig {
  "activate_func": "relu",
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "construct_entity_span": "start_and_end",
  "directionality": "bidi",
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "LABEL_0": 0
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "pred_answerable": true,
  "truncated_normal": false,
  "type_vocab_size": 2,
  "vocab_size": 21128
}
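For reference, fields such as `construct_entity_span`, `pred_answerable`, `activate_func`, and `truncated_normal` in the config above are not part of stock `BertConfig`; `BertForQueryNERConfig` is defined in this repo. Such configs are typically built by subclassing, roughly like this hedged sketch (`MyQueryNERConfig` is a hypothetical stand-in, not the repo's actual class):

```python
from transformers import BertConfig

class MyQueryNERConfig(BertConfig):
    """Hypothetical stand-in for the repo's BertForQueryNERConfig."""

    def __init__(self, construct_entity_span="start_and_end",
                 pred_answerable=True, activate_func="relu",
                 truncated_normal=False, **kwargs):
        super().__init__(**kwargs)
        self.construct_entity_span = construct_entity_span
        self.pred_answerable = pred_answerable
        self.activate_func = activate_func
        self.truncated_normal = truncated_normal

# Inherits the stock bert-base-chinese values (vocab_size=21128, ...)
# and attaches the extra task-specific fields.
config = MyQueryNERConfig.from_pretrained("/home/ma-user/work/bert-base-chinese")
```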