elliottd / GroundedTranslation

Multilingual image description
https://staff.fnwi.uva.nl/d.elliott/GroundedTranslation/
BSD 3-Clause "New" or "Revised" License

Customisable #30

Closed evanmiltenburg closed 7 years ago

evanmiltenburg commented 7 years ago

Hi Des,

If you'd like, I can push all my changes back to the GT repo. These are the changes that make caption generation work with an Embedding layer. The output layer is still the same as it was before, because that seems to be the only way to make it work. This does mean I had to add some functions so that the target can differ from the input.
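To illustrate why the target has to differ from the input once an Embedding layer is used: the Embedding layer consumes integer word indices, while an unchanged softmax output layer is still trained against one-hot vectors. The sketch below is not the code from this pull request. The function name `make_input_and_target` is hypothetical, and shifting the targets by one position (next-word prediction) is an assumption.

```python
def make_input_and_target(word_ids, vocab_size):
    """Split one caption into Embedding-layer inputs and one-hot targets.

    word_ids: list of integer word indices for a single caption.
    Returns (input_ids, targets): input_ids stays a list of integers,
    targets is a list of one-hot rows for the following word (assumed).
    """
    input_ids = word_ids[:-1]          # what the Embedding layer sees
    targets = []
    for next_id in word_ids[1:]:       # what the softmax should predict
        row = [0] * vocab_size
        row[next_id] = 1
        targets.append(row)
    return input_ids, targets


# Hypothetical caption with a 6-word vocabulary.
input_ids, targets = make_input_and_target([1, 4, 2, 5], vocab_size=6)
```

The input keeps one integer per timestep, whereas the target carries `vocab_size` values per timestep, so the two arrays can no longer be built by the same function.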

elliottd commented 7 years ago

This pull request requires the embedding dimension to be tied to the RNN dimension. Can you add a Dense layer between the Embedding and the RNN to allow arbitrary learned transformations between these layers?

Also, have you confirmed that this change does not substantially decrease the BLEU score compared to the original implementation?
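A rough sketch of what the requested Dense layer does to the shapes, written in plain Python rather than Keras so it runs without dependencies. The sizes 300 and 256 match the `--embed_size 300` and `--hidden_size=256` flags used later in this thread. The vocabulary size, the random weights, and the helper names `embed` and `dense` are assumptions made only for illustration.

```python
import random


def make_matrix(rows, cols, rng):
    """Small random matrix standing in for learned weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def embed(word_ids, embedding):
    """Embedding lookup: one row of the table per word index."""
    return [embedding[i] for i in word_ids]


def dense(vectors, weights, bias):
    """Affine map applied to every timestep; weights is in_dim x out_dim."""
    columns = list(zip(*weights))  # one tuple of in_dim weights per output unit
    return [
        [sum(x * w for x, w in zip(vec, col)) + b for col, b in zip(columns, bias)]
        for vec in vectors
    ]


rng = random.Random(0)
vocab_size, embed_size, hidden_size = 50, 300, 256  # vocab_size is hypothetical

embedding = make_matrix(vocab_size, embed_size, rng)
W = make_matrix(embed_size, hidden_size, rng)
b = [0.0] * hidden_size

embedded = embed([3, 17, 42], embedding)   # 3 timesteps x 300
projected = dense(embedded, W, b)          # 3 timesteps x 256, fed to the RNN
```

Because the projection maps `embed_size` to `hidden_size`, the two sizes can be chosen independently instead of being tied together.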

evanmiltenburg commented 7 years ago

I've added the Dense layer, but performance isn't great. This is the best epoch on Val during training: INFO:Callbacks:Meteor = 14.80 | BLEU = 16.63 | TER = 59.24.

/Edit: Hmm, it gives me an error when I try to run generate.py:

  File "generate.py", line 707, in <module>
    w.generate()
  File "generate.py", line 97, in generate
    self.generate_sentences(self.args.init_from_checkpoint, val=not self.args.test)
  File "generate.py", line 298, in generate_sentences
    verbose=0)
  File "/data/anaconda2/lib/python2.7/site-packages/keras/engine/training.py", line 1162, in predict
    check_batch_dim=False)
  File "/data/anaconda2/lib/python2.7/site-packages/keras/engine/training.py", line 108, in standardize_input_data
    str(array.shape))
Exception: Error when checking : expected text to have shape (None, 10) but got array with shape (100, 80)

Need to fix that first.
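For reference, the exception says the model's text input was built for 10 timesteps, while the array handed to predict has 80. The snippet below is only a paraphrase of that kind of shape comparison, not the Keras implementation; the function name `shape_matches` is hypothetical.

```python
def shape_matches(expected, actual):
    """Compare shapes dimension by dimension; None accepts any size."""
    if len(expected) != len(actual):
        return False
    return all(e is None or e == a for e, a in zip(expected, actual))


# The batch dimension (None vs 100) is fine; the time dimension (10 vs 80) is not.
ok_for_error_case = shape_matches((None, 10), (100, 80))
ok_for_matching_case = shape_matches((None, 10), (100, 10))
```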

evanmiltenburg commented 7 years ago

Base functionality is there. Here are the results:

With this command:

  THEANO_FLAGS=floatX=float32,device=gpu0,lib.cnmem=1.0 python generate.py --dataset flickr30k --hidden_size=256 --fixed_seed --run_string=fixed_seed-eng300mlm --meteor_lang=en --embed_size 300 --debug --model_checkpoints=checkpoints/fixed_seed-eng300mlm --beam_width=5 --no_pplx --multeval --generation_timesteps=30 --verbose

Val:

BLEU = 17.49, 59.6/26.4/11.5/5.2 (BP=1.000, ratio=1.009, hyp_len=10609, ref_len=10512)
INFO:__main__:Meteor = 16.17 | BLEU = 17.49 | TER = 65.35

Test:

BLEU = 16.93, 58.9/25.6/11.2/4.9 (BP=1.000, ratio=1.006, hyp_len=10364, ref_len=10299)
INFO:__main__:Meteor = 15.97 | BLEU = 16.94 | TER = 66.40
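As a sanity check on how to read the BLEU lines above: the four slash-separated numbers are the 1- to 4-gram precisions, BP is the brevity penalty, and ratio is hyp_len divided by ref_len. The sketch below recombines them with the standard BLEU formula (brevity penalty times the geometric mean of the precisions). It assumes Python 3, and `bleu_from_stats` is a hypothetical helper, not part of generate.py or the scoring script.

```python
import math


def bleu_from_stats(precisions_pct, hyp_len, ref_len):
    """Recombine reported n-gram precisions (in percent) into a BLEU score.

    Returns (bleu, brevity_penalty, length_ratio).
    """
    # The brevity penalty only applies when the hypotheses are shorter
    # than the references.
    bp = 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / hyp_len)
    log_mean = sum(math.log(p / 100.0) for p in precisions_pct) / len(precisions_pct)
    return 100.0 * bp * math.exp(log_mean), bp, hyp_len / ref_len


val_bleu, val_bp, val_ratio = bleu_from_stats([59.6, 26.4, 11.5, 5.2], 10609, 10512)
test_bleu, test_bp, test_ratio = bleu_from_stats([58.9, 25.6, 11.2, 4.9], 10364, 10299)
```

Both results should land close to the reported scores. Small differences are expected because the printed precisions are rounded to one decimal place.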

Only issue: it doesn't work yet when you leave out --no_pplx.