nyu-mll / jiant-v1-legacy

The jiant toolkit for general-purpose text understanding models
MIT License

[CLOSED] MT: fix target endtoken; masking encoder context; add beam search #184

Closed by jeswan 4 years ago

jeswan commented 4 years ago

Issue by yukatherin Tuesday Jul 17, 2018 at 23:40 GMT Originally opened as https://github.com/nyu-mll/jiant/pull/184


I believe all comments on https://github.com/jsalt18-sentence-repl/jiant/pull/175 are addressed except validation/eval, which I will do now. I will do BLEU scoring tomorrow.

Tested beam search by overfitting to 3 sentences and generating them back (a rough sketch of this kind of sanity check follows the results below).

Without attention: [image]

With attention: [image]
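For reference, a minimal, self-contained sketch of a beam-search sanity check of the kind described above; the beam width, end-of-sequence token, and toy scoring function here are illustrative assumptions rather than the code added in this PR. The decoder keeps the `beam_size` highest-scoring prefixes at each step and moves any prefix ending in the end token to a finished list.

```python
# Minimal beam-search decoder sketch (illustrative only; not the PR's implementation).
# `log_prob_fn(prefix)` returns {next_token: log_prob}; the toy table below has
# memorized a single sentence, mirroring the overfit-and-regenerate sanity check.
import math
from typing import Callable, Dict, List, Tuple

EOS = "<eos>"  # assumed end-of-sequence marker

def beam_search(log_prob_fn: Callable[[Tuple[str, ...]], Dict[str, float]],
                beam_size: int = 3,
                max_len: int = 20) -> List[Tuple[List[str], float]]:
    """Return hypotheses as (tokens, total_log_prob), best first."""
    beams: List[Tuple[List[str], float]] = [([], 0.0)]
    finished: List[Tuple[List[str], float]] = []
    for _ in range(max_len):
        candidates: List[Tuple[List[str], float]] = []
        for prefix, score in beams:
            for tok, lp in log_prob_fn(tuple(prefix)).items():
                candidates.append((prefix + [tok], score + lp))
        # Keep the best `beam_size` expansions; completed ones move to `finished`.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            (finished if tokens[-1] == EOS else beams).append((tokens, score))
        if not beams:
            break
    return sorted(finished or beams, key=lambda c: c[1], reverse=True)

# Toy "model" that has memorized one sentence, so beam search should reproduce
# it exactly (the same idea as overfitting to 3 sentences and generating them back).
_target = ["the", "cat", "sat", EOS]

def toy_log_probs(prefix: Tuple[str, ...]) -> Dict[str, float]:
    next_tok = _target[len(prefix)] if len(prefix) < len(_target) else EOS
    # Put most of the probability mass on the memorized next token.
    return {next_tok: math.log(0.9), "noise": math.log(0.1)}

if __name__ == "__main__":
    tokens, logp = beam_search(toy_log_probs)[0]
    print(tokens, logp)  # ['the', 'cat', 'sat', '<eos>'] with total log-prob of about -0.42
```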


yukatherin included the following code: https://github.com/nyu-mll/jiant/pull/184/commits

jeswan commented 4 years ago

Comment by iftenney Wednesday Jul 18, 2018 at 02:24 GMT


Closing for now; please re-open when ready for review.

jeswan commented 4 years ago

Comment by yukatherin Wednesday Jul 18, 2018 at 02:33 GMT


@EdouardGrave It would be really helpful if you could review this whenever you have time.

jeswan commented 4 years ago

Comment by sleepinyourhat Wednesday Jul 18, 2018 at 14:56 GMT


Does this fix the no-attention bug you'd mentioned on Slack yesterday, or is that a separate issue?

Also, reviewing now.

jeswan commented 4 years ago

Comment by sleepinyourhat Wednesday Jul 18, 2018 at 15:01 GMT


Also, is BLEU ready? I see code for it, but the comments on this thread suggest it isn't...

jeswan commented 4 years ago

Comment by yukatherin Wednesday Jul 18, 2018 at 18:31 GMT


The BLEU scores in the Python version look reasonably OK, but I still need to write out predictions and match fairseq's BLEU scoring.

[image]
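For context, a small sketch of what a Python-side corpus BLEU check might look like; it uses NLTK's corpus_bleu as a stand-in scorer, which is an assumption here, since the contents of the PR's bleu_scoring.py are not shown in this thread.

```python
# Illustrative corpus-BLEU check (assumption: NLTK as the Python-side scorer;
# the PR's actual bleu_scoring.py may differ).
from nltk.translate.bleu_score import SmoothingFunction, corpus_bleu

# Each hypothesis is a token list; each entry of `references` is a list of
# one or more reference token lists for the corresponding hypothesis.
hypotheses = [["the", "cat", "sat", "on", "the", "mat"]]
references = [[["the", "cat", "sat", "on", "the", "mat"]]]

# Smoothing avoids zero scores when some higher-order n-gram has no match.
score = corpus_bleu(references, hypotheses,
                    smoothing_function=SmoothingFunction().method1)
print(f"corpus BLEU: {score:.4f}")

# Matching fairseq's BLEU scoring would mean writing detokenized predictions
# and references to text files and scoring them with fairseq's BLEU tool
# (e.g. fairseq-score --sys preds.txt --ref refs.txt in recent fairseq versions).
```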

jeswan commented 4 years ago

Comment by yukatherin Wednesday Jul 18, 2018 at 19:03 GMT


@sleepinyourhat At what point could we consider merging this? Should I get one run's learning curve with the Python version of BLEU scoring? I think @W4ngatang was waiting on it for the masking fix.

jeswan commented 4 years ago

Comment by sleepinyourhat Wednesday Jul 18, 2018 at 19:34 GMT


I'm waiting for confirmation of what this is supposed to do before I give it a final review. See the questions above about BLEU and the bug.

jeswan commented 4 years ago

Comment by yukatherin Wednesday Jul 18, 2018 at 19:48 GMT


Yes, the main points are:

- I started a real run: python main.py --config config/defaults.conf --overrides "exp_name = test-wmt, train_tasks = wmt14_en_de, eval_tasks = wmt14_en_de, run_name = test-wmt1-noelmo, elmo = 1, elmo_chars_only = 1, dropout = 0.2, lr = 0.003, val_interval = 10000, max_vals = 50, batch_size = 24, max_word_v_size = 30000, max_seq_len=20, mt_attention=none"

- The BLEU scores are a bit suspiciously high after 10k batches; I have not compared the different versions carefully. I will do that and evaluate using the other version as well.

[image]

jeswan commented 4 years ago

Comment by sleepinyourhat Wednesday Jul 18, 2018 at 20:00 GMT


Cool. Either mark BLEU as experimental (with lots of warnings) or remove it. Then test on demo.conf. If everything works, let me know and I can merge.

jeswan commented 4 years ago

Comment by yukatherin Wednesday Jul 18, 2018 at 20:45 GMT


@sleepinyourhat Yes, I printed one warning in bleu_scoring.py and tested on demo.conf. I think it's ready.
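The warning itself is not shown in the thread; as a hedged sketch, an experimental BLEU path might surface it along these lines, with the logger name, message, and stand-in metric all assumed for illustration rather than taken from the PR.

```python
# Hypothetical sketch of the kind of warning a bleu_scoring module might emit;
# the PR's actual bleu_scoring.py and its message are not shown in this thread.
import logging

log = logging.getLogger("bleu_scoring")  # assumed logger name

def score_corpus(hypotheses, references):
    """Compute an (experimental) corpus-level score for hypothesis/reference pairs."""
    log.warning(
        "BLEU scoring is experimental: it has not been matched against "
        "fairseq's scorer, so treat the reported numbers with caution."
    )
    # Stand-in metric so the sketch runs end to end: exact-match rate over
    # sentences. A real implementation would compute n-gram BLEU instead.
    matches = sum(hyp == ref for hyp, ref in zip(hypotheses, references))
    return matches / max(len(hypotheses), 1)

if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    print(score_corpus([["a", "b"]], [["a", "b"]]))  # logs the warning, prints 1.0
```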