Hello, I want to use the MASS model for a supervised machine translation task (EN-DE). How should I prepare the data before binarization? For example, what should the monolingual data be? How do I learn and apply BPE? How do I tokenize the text? The repository only provides a directory for EN-ZH translation. Could you provide a processing script for EN-DE?
Looking forward to your reply, thank you very much!
@StillKeepTry