zjunlp / DocuNet

[IJCAI 2021] Document-level Relation Extraction as Semantic Segmentation

I am unable to obtain the results you presented. #26

Closed WingRainSpring closed 1 year ago

WingRainSpring commented 1 year ago

Running this code on the DocRED dataset, I got the following result:

| epoch 0 | step 100 | min/b 0.11 | lr [5.45950864422202e-07, 1.8198362147406733e-06, 5.459508644222019e-06] | train loss 5290.696
| epoch 0 | step 200 | min/b 0.05 | lr [1.091901728844404e-06, 3.6396724294813467e-06, 1.0919017288444038e-05] | train loss 5288.723
| epoch 0 | step 300 | min/b 0.05 | lr [1.637852593266606e-06, 5.4595086442220205e-06, 1.6378525932666058e-05] | train loss 5289.938
| epoch 0 | step 400 | min/b 0.05 | lr [2.183803457688808e-06, 7.279344858962693e-06, 2.1838034576888075e-05] | train loss 5287.533
| epoch 0 | step 500 | min/b 0.05 | lr [2.72975432211101e-06, 9.099181073703366e-06, 2.7297543221110096e-05] | train loss 5288.655
| epoch 0 | step 600 | min/b 0.05 | lr [3.275705186533212e-06, 1.0919017288444041e-05, 3.2757051865332116e-05] | train loss 5290.393
| epoch 0 | step 700 | min/b 0.05 | lr [3.821656050955414e-06, 1.2738853503184714e-05, 3.8216560509554137e-05] | train loss 5288.530
| epoch 0 | step 800 | min/b 0.05 | lr [4.367606915377616e-06, 1.4558689717925387e-05, 4.367606915377615e-05] | train loss 5288.781
| epoch 0 | step 900 | min/b 0.05 | lr [4.913557779799819e-06, 1.637852593266606e-05, 4.913557779799818e-05] | train loss 5288.198
| epoch 0 | step 1000 | min/b 0.05 | lr [5.45950864422202e-06, 1.8198362147406733e-05, 5.459508644222019e-05] | train loss 5289.798
| epoch 0 | step 1100 | min/b 0.05 | lr [6.005459508644222e-06, 2.0018198362147407e-05, 6.005459508644222e-05] | train loss 5289.863
| epoch 0 | step 1200 | min/b 0.05 | lr [6.551410373066424e-06, 2.1838034576888082e-05, 6.551410373066423e-05] | train loss 5288.075
| epoch 0 | step 1300 | min/b 0.06 | lr [7.097361237488627e-06, 2.3657870791628757e-05, 7.097361237488626e-05] | train loss 5288.787
| epoch 0 | step 1400 | min/b 0.06 | lr [7.643312101910828e-06, 2.5477707006369428e-05, 7.643312101910827e-05] | train loss 5289.309
| epoch 0 | step 1500 | min/b 0.06 | lr [8.189262966333029e-06, 2.72975432211101e-05, 8.189262966333029e-05] | train loss 5289.554
| epoch 0 | step 1600 | min/b 0.06 | lr [8.735213830755232e-06, 2.9117379435850774e-05, 8.73521383075523e-05] | train loss 5289.977
| epoch 0 | step 1700 | min/b 0.06 | lr [9.281164695177434e-06, 3.093721565059145e-05, 9.281164695177434e-05] | train loss 5288.903
| epoch 0 | step 1800 | min/b 0.06 | lr [9.827115559599637e-06, 3.275705186533212e-05, 9.827115559599635e-05] | train loss 5288.367
| epoch 0 | step 1900 | min/b 0.07 | lr [1.0373066424021838e-05, 3.4576888080072794e-05, 0.00010373066424021837] | train loss 5287.382
| epoch 0 | step 2000 | min/b 0.07 | lr [1.091901728844404e-05, 3.6396724294813465e-05, 0.00010919017288444038] | train loss 5290.210
| epoch 0 | step 2100 | min/b 0.06 | lr [1.1464968152866244e-05, 3.821656050955414e-05, 0.00011464968152866242] | train loss 5287.919
| epoch 0 | step 2200 | min/b 0.06 | lr [1.2010919017288445e-05, 4.0036396724294815e-05, 0.00012010919017288444] | train loss 5289.426
| epoch 0 | step 2300 | min/b 0.06 | lr [1.2556869881710646e-05, 4.1856232939035486e-05, 0.00012556869881710645] | train loss 5289.294
| epoch 0 | step 2400 | min/b 0.06 | lr [1.3102820746132848e-05, 4.3676069153776164e-05, 0.00013102820746132846] | train loss 5291.173
| epoch 0 | step 2500 | min/b 0.06 | lr [1.364877161055505e-05, 4.5495905368516835e-05, 0.00013648771610555048] | train loss 5289.310
| epoch 0 | step 2600 | min/b 0.06 | lr [1.4194722474977254e-05, 4.731574158325751e-05, 0.00014194722474977252] | train loss 5289.745
| epoch 0 | step 2700 | min/b 0.06 | lr [1.4740673339399455e-05, 4.9135577797998184e-05, 0.00014740673339399453] | train loss 5289.868
| epoch 0 | step 2800 | min/b 0.06 | lr [1.5286624203821656e-05, 5.0955414012738855e-05, 0.00015286624203821655] | train loss 5289.504
| epoch 0 | step 2900 | min/b 0.06 | lr [1.583257506824386e-05, 5.2775250227479533e-05, 0.0001583257506824386] | train loss 5290.610
| epoch 0 | step 3000 | min/b 0.06 | lr [1.6378525932666058e-05, 5.45950864422202e-05, 0.00016378525932666057] | train loss 5289.182

| epoch 0 | time: 70.27s | dev_result:{'dev_F1': 0.07251287103460864, 'dev_F1_ign': 0.060782931255661345, 'dev_re_p': 0.03663268949627104, 'dev_re_r': 3.5299845816765396, 'dev_average_loss': 5.324710902690888}

| epoch 0 | best_f1:0.0007251287103460864
............
| epoch 29 | step 44300 | min/b 0.06 | lr [1.0317423432634662e-06, 3.4391411442115537e-06, 1.031742343263466e-05] | train loss 5287.676
| epoch 29 | step 44400 | min/b 0.13 | lr [9.620300227726913e-07, 3.206766742575638e-06, 9.620300227726912e-06] | train loss 5289.456
| epoch 29 | step 44500 | min/b 0.14 | lr [8.923177022819166e-07, 2.974392340939722e-06, 8.923177022819166e-06] | train loss 5288.328
| epoch 29 | step 44600 | min/b 0.15 | lr [8.226053817911419e-07, 2.7420179393038063e-06, 8.226053817911418e-06] | train loss 5288.538
| epoch 29 | step 44700 | min/b 0.15 | lr [7.528930613003671e-07, 2.5096435376678905e-06, 7.52893061300367e-06] | train loss 5288.066
| epoch 29 | step 44800 | min/b 0.15 | lr [6.831807408095924e-07, 2.2772691360319747e-06, 6.831807408095923e-06] | train loss 5287.657
| epoch 29 | step 44900 | min/b 0.15 | lr [6.134684203188176e-07, 2.044894734396059e-06, 6.1346842031881754e-06] | train loss 5288.435
| epoch 29 | step 45000 | min/b 0.15 | lr [5.43756099828043e-07, 1.8125203327601433e-06, 5.437560998280429e-06] | train loss 5288.132
| epoch 29 | step 45100 | min/b 0.16 | lr [4.7404377933726816e-07, 1.5801459311242273e-06, 4.7404377933726815e-06] | train loss 5288.229
| epoch 29 | step 45200 | min/b 0.15 | lr [4.0433145884649346e-07, 1.3477715294883117e-06, 4.0433145884649346e-06] | train loss 5287.576
| epoch 29 | step 45300 | min/b 0.15 | lr [3.3461913835571875e-07, 1.1153971278523957e-06, 3.346191383557187e-06] | train loss 5288.117
| epoch 29 | step 45400 | min/b 0.16 | lr [2.64906817864944e-07, 8.8302272621648e-07, 2.64906817864944e-06] | train loss 5288.126
| epoch 29 | step 45500 | min/b 0.15 | lr [1.9519449737416926e-07, 6.506483245805642e-07, 1.9519449737416924e-06] | train loss 5288.336
| epoch 29 | step 45600 | min/b 0.16 | lr [1.2548217688339452e-07, 4.182739229446484e-07, 1.254821768833945e-06] | train loss 5288.346
| epoch 29 | step 45700 | min/b 0.15 | lr [5.5769856392619785e-08, 1.8589952130873263e-07, 5.576985639261979e-07] | train loss 5286.913

| epoch 29 | time: 51.71s | dev_result:{'dev_F1': 0.06700784829423147, 'dev_F1_ign': 0.05671531404503352, 'dev_re_p': 0.033853348984865014, 'dev_re_r': 3.245962833725554, 'dev_average_loss': 5.296513883590698}

How can I solve this problem?

njcx-ai commented 1 year ago

First of all, your training results deviate too much from the results reported in the paper, which suggests that something went wrong during training. We recommend re-downloading the code and checking the training process.

Moreover, here are two more suggestions for this issue:

1. To fine-tune the RoBERTa-large model, a larger batch size may help you get a higher F1-score. We trained with 4 NVIDIA GeForce RTX 3090 GPUs, so hyper-parameter tuning may be necessary to reproduce the results in your configuration (see the sketch below for one way to get a larger effective batch size when GPU memory is limited).
2. Our model and the default hyper-parameter settings are friendly to BERT-base and RoBERTa-base fine-tuning, so it is more efficient to reproduce the results with the X-base models first. We suggest you try that. Thanks!
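For illustration only, a minimal gradient-accumulation sketch that simulates a larger effective batch size. It assumes a generic PyTorch loop with placeholder model and data, not the actual training script in this repository:

```python
import torch
from torch import nn

# Placeholder model and data; only the accumulation pattern matters here.
model = nn.Linear(10, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
batches = [(torch.randn(4, 10), torch.randn(4, 1)) for _ in range(8)]

accumulation_steps = 4  # effective batch size = micro-batch size * 4

optimizer.zero_grad()
for step, (x, y) in enumerate(batches):
    loss = nn.functional.mse_loss(model(x), y) / accumulation_steps
    loss.backward()  # gradients accumulate across micro-batches
    if (step + 1) % accumulation_steps == 0:
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()
        optimizer.zero_grad()
```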

WingRainSpring commented 1 year ago

Thank you for your reply. After further testing, I found that a problem in how I handled the OOM (out-of-memory) issue was causing the large difference in results. It has now been fixed.
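For anyone hitting the same thing, the sketch below only illustrates how skipping OOM batches can silently corrupt training if stale gradients are not cleared; the names are placeholders, not the actual repository code:

```python
import torch

def oom_tolerant_step(model, optimizer, batch):
    """Illustrative OOM-tolerant training step; model, optimizer and batch are placeholders."""
    try:
        loss = model(**batch)[0]
        loss.backward()
        optimizer.step()
        optimizer.zero_grad(set_to_none=True)
    except RuntimeError as err:
        if "out of memory" not in str(err):
            raise
        # Skipping the batch is not enough on its own: clear the partial
        # gradients and cached memory, otherwise they leak into later updates.
        optimizer.zero_grad(set_to_none=True)
        torch.cuda.empty_cache()
```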