danhorvath / levenshtein_transformer

Transformer implementation based on http://nlp.seas.harvard.edu/2018/04/03/attention.html

Result #1

Closed: lizhenping closed this issue 4 years ago

lizhenping commented 4 years ago

I ran your code overnight on two Tesla V100 cards. The loss does not seem to converge easily, and I got a very bad BLEU score. Did you build this just as a toy project for learning the Levenshtein Transformer, or did I do something wrong?

lizhenping commented 4 years ago

Of course, you have done a great job on levenshtein_transformer in helping beginners understand this model. I am just a little curious: did you get a good result?

danhorvath commented 4 years ago

I used this repo to understand the LevT better by implementing it. However I stopped developing it once I managed to get fairseq running on my cluster.

That said, I did test this implementation on the multi30k eng-de dataset and got 22.3 BLEU, against 22.2 for the fairseq LevT implementation on the same task. Admittedly, the fairseq implementation was more than 7 times faster. It is also regularly maintained to a very high standard, so I advise you to use that.

lizhenping commented 4 years ago

Great job. Perhaps I got something wrong in my understanding or in the parameter settings. Thank you very much. fairseq's source code is really a little hard for a beginner, LOL.

danhorvath commented 4 years ago

I'm glad you find my code useful. And yes, fairseq takes some time to understand, but it pays off afterwards, especially considering the gains in speed.