allanj / ner_with_dependency


Is the F1 value in your paper the best result? #5

Closed by Youarerare 4 years ago

Youarerare commented 4 years ago

Hi! Sorry to bother you again. I recently read your paper and tried to reproduce your results. The results for SemEval-2010 Task 1 in your paper are as follows:

[figure: SemEval-2010 results table from the paper]

I get the highest F1 scores of 82.83 and 84.14 respectively, which differ from your reported results of 82.19 (DGLSTM-CRF L=1) and 83.47 (DGLSTM-CRF L=2).

I wonder whether my config is set incorrectly, or whether you reported conservative values.

I just ran:

python main.py --device cuda:0 --dep_model dglstm --momentum 0.9 --lr_decay 0.02 --dataset spanish --embedding_file data/cc.es.300.vec

allanj commented 4 years ago

You might get different values in different physical setups, though overall the values should be quite similar. Also, I think my lr_decay value was 0; it's good to see you getting better performance. The difference probably comes from the optimizer settings.
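
For reference, the "learning rate is set to" values in the logs in this thread match the standard inverse-time decay schedule lr_t = init_lr / (1 + lr_decay * t). A minimal sketch of that schedule (the function name is illustrative, not necessarily what main.py uses):

def decayed_lr(init_lr, lr_decay, epoch):
    # inverse-time decay; with init_lr=0.01 and lr_decay=0.02, epochs 1-3
    # give 0.00980392..., 0.00961538..., 0.00943396..., matching the logs
    return init_lr / (1.0 + lr_decay * epoch)

for epoch in range(1, 4):
    print("learning rate is set to:", decayed_lr(0.01, 0.02, epoch))

With lr_decay = 0, the rate simply stays at the initial 0.01, which is what my logs below show.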

Youarerare commented 4 years ago

Thank you, and I have another problem. When the number of DGLSTM-CRF layers is > 1, the start of training often looks like this:

Epoch 1: 212762.29077, Time is 158.19s
[dev set] Precision: 0.00, Recall: 0.00, F1: 0.00
[test set] Precision: 0.00, Recall: 0.00, F1: 0.00
saving the best model...
learning rate is set to: 0.00980392156862745
Epoch 2: 175617.80145, Time is 161.74s
[dev set] Precision: 0.00, Recall: 0.00, F1: 0.00
[test set] Precision: 0.00, Recall: 0.00, F1: 0.00
learning rate is set to: 0.009615384615384616
Epoch 3: 150908.33441, Time is 162.54s
[dev set] Precision: 0.00, Recall: 0.00, F1: 0.00
[test set] Precision: 0.00, Recall: 0.00, F1: 0.00
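
(For context: the Precision/Recall/F1 here are entity-level exact-match metrics, so all zeros means no predicted span exactly matches a gold span. A minimal sketch, assuming the standard CoNLL-style computation; the names are illustrative, not the repo's actual implementation:)

def span_f1(pred_spans, gold_spans):
    # entity-level exact match: spans are sets of (start, end, label);
    # a prediction counts only if start, end, and label all match gold
    tp = len(pred_spans & gold_spans)
    p = tp / len(pred_spans) if pred_spans else 0.0
    r = tp / len(gold_spans) if gold_spans else 0.0
    return p, r, (2 * p * r / (p + r) if p + r else 0.0)

print(span_f1({(0, 2, "PER")}, {(0, 2, "PER"), (5, 6, "LOC")}))  # (1.0, 0.5, 0.666...)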

Is there any way to solve it?

allanj commented 4 years ago

That's weird. For Spanish, using L = 2, the first few epochs look like this:

learning rate is set to:  0.01
Epoch 1: 121991.84515, Time is 85.04s
[dev set] Precision: 0.00, Recall: 0.00, F1: 0.00
[test set] Precision: 66.67, Recall: 0.07, F1: 0.13
saving the best model...
learning rate is set to:  0.01
Epoch 2: 63909.13715, Time is 80.64s
[dev set] Precision: 59.72, Recall: 52.50, F1: 55.88
[test set] Precision: 59.09, Recall: 51.35, F1: 54.94
saving the best model...
learning rate is set to:  0.01
Epoch 3: 39851.13495, Time is 82.03s
[dev set] Precision: 65.68, Recall: 68.72, F1: 67.17
[test set] Precision: 67.46, Recall: 70.22, F1: 68.81
saving the best model...
learning rate is set to:  0.01
Epoch 4: 30115.89642, Time is 80.87s
[dev set] Precision: 72.99, Recall: 73.58, F1: 73.28
[test set] Precision: 75.25, Recall: 75.15, F1: 75.20
saving the best model...

Can you let me know which arguments you changed to get that?

Youarerare commented 4 years ago

In fact, when L = 2, my result looks like this:

learning rate is set to: 0.01
Epoch 1: 136990.46906, Time is 152.11s
[dev set] Precision: 0.00, Recall: 0.00, F1: 0.00
[test set] Precision: 0.00, Recall: 0.00, F1: 0.00
saving the best model...
learning rate is set to: 0.00980392156862745
Epoch 2: 71881.46161, Time is 155.85s
[dev set] Precision: 40.56, Recall: 40.23, F1: 40.39
[test set] Precision: 38.45, Recall: 38.18, F1: 38.31
saving the best model...
learning rate is set to: 0.009615384615384616
Epoch 3: 47582.08258, Time is 152.94s
[dev set] Precision: 60.29, Recall: 60.15, F1: 60.22
[test set] Precision: 60.48, Recall: 60.14, F1: 60.31
saving the best model...
learning rate is set to: 0.009433962264150943
Epoch 4: 36670.16547, Time is 152.66s
[dev set] Precision: 66.00, Recall: 63.98, F1: 64.97
[test set] Precision: 68.02, Recall: 66.05, F1: 67.02

The earlier log where everything is 0 was for the case L = 3. It seems this situation becomes more severe as L increases.

Youarerare commented 4 years ago

One more point: on my machine, when L = 3, the F1 scores for the first 8 epochs are all 0. I even wanted to kill the run, because I thought it might never increase. But then I observed that it was rising, which is an interesting phenomenon.

allanj commented 4 years ago

In my case, the first epochs look like this. I think making the layers deeper somehow requires increasing the learning rate for faster training, though I didn't do so (a sketch of that idea follows the log):

[Info] The model will be saved to: model_files/lstm_3_200_crf_semes_sd_-1_dep_feat_emb_elmo_none_sgd_gate_0_base_-1_epoch_300_lr_0.01_doubledep_0_comb_3.m, please ensure models folder exist
learning rate is set to:  0.01
Epoch 1: 154613.84131, Time is 100.93s
[dev set] Precision: 0.00, Recall: 0.00, F1: 0.00
[test set] Precision: 0.00, Recall: 0.00, F1: 0.00
saving the best model...
learning rate is set to:  0.01
Epoch 2: 126278.49561, Time is 96.01s
[dev set] Precision: 0.00, Recall: 0.00, F1: 0.00
[test set] Precision: 0.00, Recall: 0.00, F1: 0.00
learning rate is set to:  0.01
Epoch 3: 126282.97369, Time is 96.08s
[dev set] Precision: 0.00, Recall: 0.00, F1: 0.00
[test set] Precision: 0.00, Recall: 0.00, F1: 0.00
learning rate is set to:  0.01
Epoch 4: 125613.44739, Time is 95.81s
[dev set] Precision: 24.15, Recall: 3.79, F1: 6.55
[test set] Precision: 22.72, Recall: 3.02, F1: 5.33
saving the best model...
learning rate is set to:  0.01
Epoch 5: 91701.53497, Time is 95.17s
[dev set] Precision: 47.37, Recall: 10.33, F1: 16.95
[test set] Precision: 45.53, Recall: 9.19, F1: 15.30
saving the best model...
learning rate is set to:  0.01
Epoch 6: 62165.22260, Time is 98.44s
[dev set] Precision: 48.12, Recall: 44.59, F1: 46.29
[test set] Precision: 49.47, Recall: 45.57, F1: 47.44
saving the best model...
learning rate is set to:  0.01
Epoch 7: 42350.93250, Time is 95.81s
[dev set] Precision: 56.20, Recall: 51.47, F1: 53.73
[test set] Precision: 57.22, Recall: 52.07, F1: 54.52
saving the best model...
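
To make the learning-rate remark above concrete, one hypothetical tweak (not something the repo implements, and untested) would be to scale the initial learning rate with the number of stacked layers:

def depth_scaled_lr(base_lr, num_layers, factor=0.5):
    # hypothetical: grow the initial lr linearly with depth so a deeper
    # DGLSTM-CRF might escape the all-zero-F1 phase sooner;
    # e.g. base_lr=0.01, num_layers=3 -> 0.02
    return base_lr * (1.0 + factor * (num_layers - 1))

Whether this actually shortens the flat start would need to be checked empirically.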