elliottd / GroundedTranslation

Multilingual image description
https://staff.fnwi.uva.nl/d.elliott/GroundedTranslation/
BSD 3-Clause "New" or "Revised" License

Option to include an arbitrary number of training datasets ("super"training) #3

Closed elliottd closed 9 years ago

elliottd commented 9 years ago

It could be useful to estimate the parameters inside the VRNN using multiple datasets. The more image--description pairs the model sees during training, the more reliable its parameter estimates should be.

Specifically, we have only 16,000 image--description pairs in the IAPR-TC12 dataset. That is fewer image--description pairs than in the Flickr8K training data (6,000 training images, each with five descriptions => 30,000 image--description training instances).
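
To make the arithmetic concrete, here is a minimal sketch of what pooling an arbitrary number of datasets could look like. It is not taken from the GroundedTranslation code: `pool_datasets` and the toy lists below are hypothetical stand-ins that only reproduce the sizes quoted above, with each training instance modelled as an `(image_id, description)` pair.

```python
from itertools import chain


def pool_datasets(*datasets):
    """Concatenate any number of datasets into one list of training pairs.

    Hypothetical helper: each dataset is assumed to be an iterable of
    (image_id, description) tuples.
    """
    return list(chain.from_iterable(datasets))


# Placeholder data with the sizes mentioned in this issue (not real captions).
# Flickr8K: 6,000 training images x 5 descriptions each.
flickr8k = [
    (f"flickr8k_img{i}", f"description {j}")
    for i in range(6000)
    for j in range(5)
]
# IAPR-TC12: 16,000 image--description pairs.
iaprtc12 = [(f"iaprtc12_img{i}", "description") for i in range(16000)]

pooled = pool_datasets(flickr8k, iaprtc12)
print(len(flickr8k), len(iaprtc12), len(pooled))
```

Prefixing the image identifiers with the dataset name is one simple way to keep instances from different sources distinguishable after pooling. How the real data generator would interleave or weight the datasets is left open here.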