Closed: evanmiltenburg closed this pull request 6 years ago.
Thanks! Do these new options introduce any major performance regressions?
On Mon, 20 Nov 2017, 19:01 Emiel van Miltenburg, notifications@github.com wrote:
Here's the next update to the code. The embeddings are now initialisable for both input and output, using separate keywords.
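For context, here is a minimal sketch of what such separate keywords might look like on the train.py side. The embedding-path flag names below are illustrative, not the ones from the diff; only fix_weights is named in the commit summary further down:

```python
import argparse

parser = argparse.ArgumentParser()
# Illustrative flag names; the actual keywords are defined in the PR diff.
parser.add_argument("--input_embeddings", default=None,
                    help="Path to pre-trained vectors for the input embedding layer.")
parser.add_argument("--output_embeddings", default=None,
                    help="Path to pre-trained vectors for the output layer.")
# fix_weights is the parameter named in the commit summary.
parser.add_argument("--fix_weights", action="store_true",
                    help="Freeze the initialised embedding weights during training.")
args = parser.parse_args()
```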
You can view, comment on, or merge this pull request online at:
https://github.com/elliottd/GroundedTranslation/pull/32

Commit Summary
- Add arguments to initialize model with embeddings.
- Add embeddings keyword argument.
- Move embeddings argument to buildKerasModel.
- Add code to load the embeddings (see the sketch after this list).
- Add comments and delete word embeddings to save memory.
- Make input word embeddings initialisable.
- Fix error on line 59: `if array` does not work.
- Update train.py.
- Make it possible to fix initialized weights.
- Pass fix_weights parameter.
- Make output layer suitable for embeddings.
- Make output initialisable.
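Taken together, these commits amount to loading pre-trained vectors and using them to initialise an (optionally frozen) input embedding layer. A minimal sketch of that pattern, assuming gensim's word2vec loader, a Keras Embedding layer, and the GoogleNews binary mentioned later in this thread; the vocabulary and variable names are illustrative, not the actual code in models.py:

```python
import numpy as np
from gensim.models import KeyedVectors
from keras.layers import Embedding

# Load the pre-trained vectors (path is illustrative).
vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

# Build a weight matrix aligned with the model's own word index.
word2index = {"the": 0, "a": 1, "dog": 2}  # illustrative vocabulary
embedding_matrix = np.zeros((len(word2index), vectors.vector_size))
for word, idx in word2index.items():
    if word in vectors:
        embedding_matrix[idx] = vectors[word]

# Delete the full word2vec model to save memory (cf. the commit above);
# only the matrix is needed from here on.
del vectors

# Initialise the input embedding layer; trainable=False corresponds to
# fixing the initialised weights.
embeddings = Embedding(input_dim=len(word2index),
                       output_dim=embedding_matrix.shape[1],
                       weights=[embedding_matrix],
                       trainable=False)
```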
File Changes
- M models.py https://github.com/elliottd/GroundedTranslation/pull/32/files#diff-0 (53 changes)
- M train.py https://github.com/elliottd/GroundedTranslation/pull/32/files#diff-1 (35 changes)
Patch Links:
- https://github.com/elliottd/GroundedTranslation/pull/32.patch
- https://github.com/elliottd/GroundedTranslation/pull/32.diff
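On the output side, the commits suggest seeding the final projection from the same vectors. One common way to do this in Keras is to transpose the embedding matrix into the softmax layer's kernel; this is a sketch continuing from the embedding_matrix above, under the assumption that the decoder's hidden size equals the embedding dimensionality, and it may differ from the PR's actual implementation:

```python
import numpy as np
from keras.layers import Dense

vocab_size, embed_dim = embedding_matrix.shape

# Seed the output projection with the transposed embedding matrix, so that
# output scores are dot products between the decoder state (of size
# embed_dim) and the pre-trained word vectors.
output_layer = Dense(vocab_size,
                     activation="softmax",
                     weights=[embedding_matrix.T, np.zeros(vocab_size)])
```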
I'm training a plain model at the moment. Will update with the performance later.
Val:
BLEU = 16.57, 58.1/25.7/10.6/4.8 (BP=1.000, ratio=1.011, hyp_len=10515, ref_len=10404)
INFO:__main__:Meteor = 15.85 | BLEU = 16.57 | TER = 65.50
Test:
BLEU = 15.65, 57.5/24.5/10.2/4.2 (BP=0.998, ratio=0.998, hyp_len=10125, ref_len=10148)
INFO:__main__:Meteor = 15.57 | BLEU = 15.65 | TER = 65.95
It does look like some regression is taking place (about 1 BLEU point and less than 0.5 Meteor).
Fun fact: Meteor and BLEU diverge more if you initialize both the input and output embeddings. Here are the results for Test with both initialized from the GoogleNews vectors:
BLEU = 14.78, 51.9/22.5/9.7/4.2 (BP=1.000, ratio=1.071, hyp_len=13920, ref_len=12995)
INFO:__main__:Meteor = 16.83 | BLEU = 14.78 | TER = 86.38
Competitive Meteor, but BLEU suffers!
Accepting the pull request: there is no guarantee that initialising the model from pre-trained embeddings will improve performance, and the pull request still makes it possible to use the default option.