attardi closed this issue 4 years ago
Does it behave similarly on PTB?
I only use UD. On UD_English_EWT, the release branch achieves:
2020-06-14 20:48:00 INFO Epoch 240 / 1000:
2020-06-14 20:48:54 INFO dev: - loss: 0.7806 - UCM: 56.04% LCM: 45.90% UAS: 87.41% LAS: 82.86%
2020-06-14 20:49:06 INFO test: - loss: 0.8076 - UCM: 50.16% LCM: 38.12% UAS: 89.39% LAS: 85.30%
2020-06-14 20:49:07 INFO 0:01:06.209307s elapsed (saved)
with this configuration:
bert_model | bert-base-cased
n_embed | 100
n_char_embed | 50
n_feat_embed | 100
n_bert_layers | 4
while the dev branch achieves:
Epoch 177 / 1000:
train: Loss: 0.2244 UAS: 95.55% LAS: 92.70%
dev: Loss: 0.6326 UAS: 93.16% LAS: 90.24%
test: Loss: 1.1072 UAS: 91.84% LAS: 89.23%
0:01:00.618501s elapsed (saved)
although with a different model and configuration:
bert_model | TurkuNLP/wikibert-base-en-cased
n_embed | 100
n_char_embed | 50
n_feat_embed | 0
n_bert_layers | 0
embed_dropout | 0.33
n_lstm_hidden | 400
n_lstm_layers | 2
I am running the same configuration as the dev branch, but it doesn't look promising.
bert | TurkuNLP/wikibert-base-en-cased
n_embed | 100
n_char_embed | 50
n_feat_embed | 0
n_bert_layers | 0
embed_dropout | 0.33
n_lstm_hidden | 400
n_lstm_layers | 2
At epoch 26, release gets:
2020-06-15 13:58:19 INFO Epoch 26 / 1000:
2020-06-15 13:58:54 INFO dev: - loss: 1.0003 - UCM: 46.15% LCM: 36.16% UAS: 82.75% LAS: 76.09%
2020-06-15 13:59:01 INFO test: - loss: 1.0627 - UCM: 36.27% LCM: 25.12% UAS: 84.05% LAS: 77.21%
2020-06-15 13:59:03 INFO 0:00:42.405768s elapsed (saved)
while the dev branch was already at:
Epoch 26 / 1000:
train: Loss: 0.5479 UAS: 89.65% LAS: 84.42%
dev: Loss: 0.5415 UAS: 91.82% LAS: 88.41%
test: Loss: 0.8552 UAS: 91.26% LAS: 88.26%
Sorry, the release branch is still in development and some bugs may lurk in the code. I will do some checks on PTB later.
I did an experiment with the dev branch using the new model electra-base-discriminator and achieved an improvement on UD_English_EWT. From bert-base-cased:
Epoch 177 / 1000:
train: Loss: 0.2244 UAS: 95.55% LAS: 92.70%
dev: Loss: 0.6326 UAS: 93.16% LAS: 90.24%
test: Loss: 1.1072 UAS: 91.84% LAS: 89.23%
to electra-base-discriminator:
Epoch 128 / 1000:
train: Loss: 0.2552 UAS: 95.10% LAS: 91.90%
dev: Loss: 0.4805 UAS: 94.49% LAS: 91.90%
test: Loss: 1.1904 UAS: 91.73% LAS: 89.20%
I also notice a drop in performance on PTB. Something may have been erroneously modified.
Hi @attardi, the bug has been fixed. It's because I didn't understand the usage of from_config. To load the model weights, we should use from_pretrained instead of from_config.
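The distinction can be sketched with the Hugging Face transformers API. This is a minimal, self-contained illustration (a tiny throwaway BertConfig saved to a temporary directory, not an actual supar checkpoint): from_config only rebuilds the architecture with freshly initialized weights, while from_pretrained restores the saved ones.

```python
import torch
from transformers import AutoModel, BertConfig, BertModel

# Tiny config so the example runs quickly and fully offline
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=64)

# Stand-in for a trained checkpoint: build a model and save it to disk
trained = BertModel(config)
trained.save_pretrained("/tmp/tiny-bert")

# from_config only rebuilds the architecture: weights are randomly
# re-initialized, NOT loaded from the checkpoint
fresh = AutoModel.from_config(config)

# from_pretrained restores the weights saved alongside the config
loaded = AutoModel.from_pretrained("/tmp/tiny-bert")

w = lambda m: m.embeddings.word_embeddings.weight
print(torch.equal(w(loaded), w(trained)))  # True: weights restored
print(torch.equal(w(fresh), w(trained)))   # False: random re-init
```

So a parser built via from_config silently trains on top of a randomly initialized encoder, which matches the gradual degradation seen in the release-branch runs above.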
I also tested the XLNet model on the dev branch, as you suggested. It is less accurate:
Epoch 147 / 1000:
train: Loss: 0.2343 UAS: 95.27% LAS: 92.30%
dev: Loss: 0.5309 UAS: 93.50% LAS: 90.83%
test: Loss: 1.0762 UAS: 91.94% LAS: 89.35%
What does "less accurate" mean @attardi? Is there any exception on the dev branch?
I notice a significant drop in performance in the release branch with respect to the dev branch using the same configuration with 2 n_lstm_layers and all bert data (n_feat_embed=0). Here is an example on the UD Italian corpus.
Dev version:
0:01:20.747564s elapsed (saved)
Release version:
With 4 n_bert_layers and 100 n_feat_embed, the release branch works better and performs similarly to the dev branch:
n_bert_layers | 4
embed_dropout | 0.33
n_lstm_hidden | 400
n_lstm_layers | 3
train: Loss: 0.2213 UAS: 96.16% LAS: 93.16%
dev: Loss: 0.3626 UAS: 95.90% LAS: 93.51%
test: Loss: 0.4056 UAS: 95.13% LAS: 92.63%
0:01:07.875049s elapsed (saved)
What could be the reason? You suggested that using all layers and all features from BERT would have been beneficial, and indeed it was in the dev branch.
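For reference, the usual way a parser consumes the last n_bert_layers hidden states is a learned scalar mix (ELMo-style): a softmax-normalized weighted average of the selected layers plus a global scale. The sketch below is illustrative of that technique in general, with hypothetical class and parameter names, not supar's actual implementation.

```python
import torch
import torch.nn as nn

class ScalarMix(nn.Module):
    """Learned weighted average of the last n encoder layers (illustrative)."""

    def __init__(self, n_layers: int):
        super().__init__()
        # One learnable weight per layer, softmax-normalized in forward()
        self.weights = nn.Parameter(torch.zeros(n_layers))
        # Global scaling factor applied to the mixed representation
        self.gamma = nn.Parameter(torch.ones(1))

    def forward(self, layers):
        # layers: list of n tensors, each of shape [batch, seq_len, hidden]
        w = torch.softmax(self.weights, dim=0)
        return self.gamma * sum(wi * h for wi, h in zip(w, layers))

# Example: mix the last 4 of 12 hidden layers (n_bert_layers = 4)
hidden = [torch.randn(2, 10, 768) for _ in range(12)]
mix = ScalarMix(n_layers=4)
out = mix(hidden[-4:])
print(out.shape)  # torch.Size([2, 10, 768])
```

Under this scheme, n_bert_layers controls how many of the top layers feed the mix; if one branch selects or combines the layers differently from the other, the same configuration can yield different accuracy.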