Closed (lizhenping closed this 4 years ago)
Of course, you have done a great job on levenshtein_transformer; it helps beginners understand this model. I just wonder a little: have you gotten good results with it?
I used this repo to understand the LevT better by implementing it. However, I stopped developing it once I managed to get fairseq running on my cluster.
However, I did test this implementation on the multi30k eng-de dataset, and there I got 22.3 BLEU, against 22.2 for the fairseq LevT implementation on the same task. But admittedly the fairseq implementation was more than 7 times faster. It is also maintained regularly to a very high standard, so I advise you to use that.
Great job. Perhaps I got something wrong in my understanding or in the parameter settings. Thank you very much. fairseq's source code is really a little hard for a beginner. LOL
I'm glad you find my code useful. And yes, fairseq takes some time to understand, but afterwards it pays off, especially considering the gains in speed.
I ran your code on two Tesla V100 cards overnight. It seems that the loss does not converge easily, and I get a very bad BLEU score. Did you make this just as a toy for learning the Levenshtein Transformer, or did I get something wrong somewhere?