lvapeab / nmt-keras

Neural Machine Translation with Keras
http://nmt-keras.readthedocs.io
MIT License
532 stars, 130 forks

repetition in translation #63

Closed: lchunleo closed this issue 6 years ago

lchunleo commented 6 years ago

Hi,

I trained with a 36K training dataset and a 10K validation dataset, using the default settings, for 25 epochs. However, when I translate from the foreign language to English, the result is weird: it carries none of the meaning of the source sentence.

My datasets consist of words and sentences typical of general writing. Is it possible that my datasets do not contain enough variation in sentence structure? If so, how much variation do I need?

Ground truth (English equivalent of the foreign-language sentence): Besides having to apply for a licence from the Land Transport Authority, all dockless bike-sharing operators will have to regulate their fleet sizes, pay a registration fee for each bicycle and ensure that their bikes are parked in designated areas.

Translation (English output of the model):

notified a having of a driver's from a driver's license dictionary of the joint adults dictionary of the joint adults dictionary of the brought up to the battery power for friends for some bicycle and made up with their bicycle and made up with their bicycle and made up with their bicycle and made their bicycle and made up with their bicycle and made up with their bicycle and made their bicycle and made up with their bicycle and made up with their bicycle and made their bicycle and made up with their bicycle and made up with their

epoch_24

lvapeab commented 6 years ago

This is a typical example of a poorly trained model. The standard configuration in config.py is meant to be a minimal system. You'll definitely need to change it to one that better suits your problem. As I told you in https://github.com/lvapeab/nmt-keras/issues/56, you'll need to change the hyperparameters.
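To check whether a change of hyperparameters actually reduces this kind of looping, it can help to put a number on it. The following is only a rough sketch, not part of nmt-keras: the function name `repeated_ngram_fraction` and the choice of 4-grams are assumptions made for illustration. It reports the fraction of word n-grams in a hypothesis that occur more than once; a value close to 1 suggests the decoder is stuck in a loop.

```python
from collections import Counter


def repeated_ngram_fraction(text, n=4):
    """Fraction of word n-grams in `text` that occur more than once.

    Hypothetical helper for illustration; not part of nmt-keras.
    """
    tokens = text.split()
    # Slide a window of n tokens over the sentence.
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        # Sentence shorter than n tokens: nothing can repeat.
        return 0.0
    counts = Counter(ngrams)
    # Count every occurrence of an n-gram that appears at least twice.
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(ngrams)


if __name__ == "__main__":
    # Short excerpt of the looping output quoted in this issue.
    hypothesis = (
        "their bicycle and made up with their bicycle and made up with "
        "their bicycle and made up with their bicycle and made their bicycle"
    )
    print(repeated_ngram_fraction(hypothesis, n=4))
```

Averaging this score over the validation translations after each epoch gives a crude signal to track alongside the usual metrics; it does not replace them.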

lchunleo commented 6 years ago

Thank you for your reply. I did make some minor changes to the hyperparameters, but I guess I need to look at them again to understand them better. Thanks for your advice.