hlt-mt / FBK-fairseq

Repository containing the open source code of works published at the FBK MT unit.

[Question for AlignAtt] #5

Closed makotos-nlproc closed 1 year ago

makotos-nlproc commented 1 year ago

Hi, thank you so much for your great work and open access.

I'd like to reproduce AlignAtt, in particular the data preparation and training.

Could you provide a guide with a script or command for data preparation and training?

Thanks,

mgaido91 commented 1 year ago

Hi, thanks for your question. You can follow this README to train the Conformer-based models (without the bug related to padding management): https://github.com/hlt-mt/FBK-fairseq/blob/master/fbk_works/BUGFREE_CONFORMER.md.

Please note that the models used in the paper were trained with knowledge distillation (KD) from NLLB. This means you also need to create a similar training TSV in which the tgt_text column contains the output of NLLB when fed the manual transcripts.
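The TSV step could be sketched roughly as follows. This is only an illustration under assumptions not stated in this thread: the manual transcripts are assumed to live in a column named `src_text`, the helper `build_kd_tsv` is a hypothetical name, and `translate` is a placeholder for whatever code runs NLLB on a list of sentences (it is not part of this repository).

```python
import csv
from typing import Callable, List


def build_kd_tsv(
    in_tsv: str,
    out_tsv: str,
    translate: Callable[[List[str]], List[str]],
    src_col: str = "src_text",  # assumption: column holding the manual transcripts
    tgt_col: str = "tgt_text",  # column named in the comment above
) -> int:
    """Copy in_tsv to out_tsv, replacing tgt_col with translate(transcripts)."""
    # Read the whole manifest; QUOTE_NONE keeps quote characters as plain text.
    with open(in_tsv, encoding="utf-8", newline="") as f:
        reader = csv.DictReader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        fieldnames = list(reader.fieldnames or [])
        rows = list(reader)

    if src_col not in fieldnames:
        raise ValueError(f"column {src_col!r} not found in {in_tsv}")
    if tgt_col not in fieldnames:
        fieldnames.append(tgt_col)

    # `translate` is a placeholder: plug in your own NLLB inference here.
    hypotheses = translate([row[src_col] for row in rows])
    if len(hypotheses) != len(rows):
        raise ValueError("translate() must return one output per input")

    with open(out_tsv, "w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(
            f,
            fieldnames=fieldnames,
            delimiter="\t",
            quoting=csv.QUOTE_NONE,
            escapechar="\\",
            lineterminator="\n",
        )
        writer.writeheader()
        for row, hyp in zip(rows, hypotheses):
            row[tgt_col] = hyp  # overwrite the reference with the NLLB output
            writer.writerow(row)
    return len(rows)
```

With a real NLLB wrapper in place of `translate`, a call such as `build_kd_tsv("train.tsv", "train_kd.tsv", translate)` would produce the KD manifest; the file names here are only examples.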

Hope this helps, thanks!

mgaido91 commented 1 year ago

I am closing this as it has been stale for a while. Feel free to reopen if anything else is needed.