nyu-dl / dl4mt-tutorial

BSD 3-Clause "New" or "Revised" License

Session 1 #32

Closed vince62s closed 8 years ago

vince62s commented 8 years ago

Hi, is it "correct" that the session 1 test after training gives meaningless sentences?

thanks

federicohyo commented 8 years ago

What exactly do you mean by "meaningless"? Can you quote a couple of examples here? On which data did you train your model?

vince62s commented 8 years ago

I trained on Europarl as configured in the session 1 script files. Meaningless in the sense that almost no sentence is translated correctly, plus a bunch of UNK tokens. Below are the first 10 lines each of newstest2011.en.tok and newstest2011.trans.fr.tok:

What will they do ? CSSD lacks knowledge of both Voldemort and candy bars in Prague New Councilors of CSSD will most probably have to overcome certain language barriers to understand their old-new colleagues from ODS in Prague Council and municipal council . Aktuálně.cz " tested " the Social Democrat members of the new Council in terms of the well-established slang that originated in the town hall during the few last years , when Prague was ruled by the current coalition partners . Coded vocabulary that was established by Prague political elite during the previous era of the mayor Pavel Bem , describes some of the most famous persons , situations and affairs in the city . Surprisingly , it turned out that the new council members do not understand the well-known concepts . At least they say so . " Who is Voldemort ? " " I really do not know . " " I 'm rather a novice in Prague politics , " responded Lukas Kaucky , the Councilor for culture , to the test of " Godfather " vocabulary . And even though he is a political veteran , the Councilor Karel Brezina responded similarly .

Qu' est-ce qu' ils UNK ? Peut-être membres . UNK " . La plupart des affaires étrangères . comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que les membres ont pas comprendre que moins . UNK " ? " Selon moi , je cite , je considère que la culture " UNK . Et pourtant une réponse .

vince62s commented 8 years ago

Re-ran it on a GTX 980 Ti GPU... amazingly fast.

But it still stops after epoch 2 (140,000 updates), and the translations are still bad.

What is wrong?

orhanf commented 8 years ago

@vince62s, in my experience it is not very beneficial to early-stop according to validation-set log probabilities. I would suggest using BLEU for that, but in any case you could increase the patience to let it train a bit longer.

Also, session1 is just a warm-up for session2; is there a particular reason you're not using session2 for NMT?

vince62s commented 8 years ago

OK, I will run session2. Now you have piqued my curiosity: how do I stop training based on BLEU? Is this already built into the code? If not, and we rely on "patience", what is a good value to use? I see 10 in one place and 1000 in another. What does this mean?

orhanf commented 8 years ago

Early stopping based on BLEU is straightforward if you do it in the training loop: you effectively pause training, call translate.py on the validation set, then call a script (or a function) to compute BLEU from the translation and reference files. Keep track of the BLEU scores just as is already done for the validation log-probabilities, and finally early-stop according to patience.
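A minimal sketch of that loop (everything here is illustrative: `train_step`, `translate_valid`, `compute_bleu`, and `save_params` stand in for pieces of the actual training script and are not functions from this codebase):

```python
def train_with_bleu_early_stopping(train_step, translate_valid, compute_bleu,
                                   save_params, ref_file,
                                   max_updates=1000000, valid_freq=10000,
                                   patience=10):
    """Train, early-stopping when validation BLEU stops improving.

    train_step()           -- one minibatch update
    translate_valid()      -- translate the validation set, return hyp file path
    compute_bleu(hyp, ref) -- BLEU of hypothesis file against reference file
    save_params(filename)  -- dump the current parameters to an .npz file
    """
    best_bleu = 0.0
    bad_counter = 0  # validations in a row without improvement
    for uidx in range(1, max_updates + 1):
        train_step()
        if uidx % valid_freq == 0:
            hyp_file = translate_valid()        # pause training, decode
            bleu = compute_bleu(hyp_file, ref_file)
            if bleu > best_bleu:
                best_bleu, bad_counter = bleu, 0
                save_params('model_best.npz')   # keep the best model so far
            else:
                bad_counter += 1
                if bad_counter >= patience:     # same rule as log-prob stopping
                    print('Early stop after %d validations '
                          'without BLEU improvement' % patience)
                    break
    return best_bleu
```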

If you don't want to pause training, since translating a validation set takes time, you can save your model parameters without overwriting previous checkpoints (giving each params.npz a different name). A separate process can then compute BLEU scores from the saved parameters.
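The saving side of that non-blocking variant might look like this (a sketch: the `save_checkpoint` helper and filename pattern are assumptions, though `numpy.savez` is how the training scripts write params.npz):

```python
import numpy

def save_checkpoint(params, uidx, prefix='model'):
    """Write a uniquely named checkpoint, e.g. model.iter140000.npz.

    params is a dict of name -> numpy array; uidx is the update counter.
    A separate evaluator process can watch for new files, run translate.py
    on each, and score the output with a BLEU script such as
    multi-bleu.perl, without ever pausing the trainer.
    """
    numpy.savez('%s.iter%d.npz' % (prefix, uidx), **params)
```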

Admittedly, adding this feature complicates the code, and this codebase is meant to be a starter/reference implementation of NMT. If you want it in your fork, you can check the two reference implementations, in Blocks and GroundHog.

This is also what @amirj asked in #33.

vince62s commented 8 years ago

Thanks. Otherwise, regarding the "patience" setting, what do the values 10 and 1000 relate to?

kyunghyuncho commented 8 years ago

@orhanf can we make the code optionally save all the intermediate models, so they can be evaluated manually later for early stopping based on BLEU?

orhanf commented 8 years ago

@kyunghyuncho I've added an overwrite option that saves the model parameters under the iteration number so they don't overwrite each other, and I've also synced sessions 2 and 3, fixing #29.
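A sketch of how such an option could gate the two saving behaviours (the `overwrite` flag name comes from the comment above; the function and filename pattern are illustrative, not the repo's exact code):

```python
import numpy

def save_model(params, uidx, saveto='model.npz', overwrite=True):
    """With overwrite=True keep a single model.npz; otherwise keep one
    checkpoint per iteration so they can all be scored with BLEU later."""
    if overwrite:
        numpy.savez(saveto, **params)
    else:
        numpy.savez(saveto.replace('.npz', '.iter%d.npz' % uidx), **params)
```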