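We could use the monolingual NLLB data in the target language. I looked at it and it's lower quality, so I'm hesitant to back-translate it to augment teacher training, where the noisy text would end up as the target side of the training pairs. It's less of a problem if we back-translate it only to get synthetic source text and then forward-translate that with the teacher as part of knowledge distillation, since the original low-quality target text is not used as a training target there.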
This can help reduce the teacher-student quality gap in cases where we have little monolingual data in the source language.
See: From Research to Production and Back: Ludicrously Fast Neural Machine Translation
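
To make the proposed data flow concrete, here is a minimal sketch. The function name `build_distillation_corpus` and the `Translator` callables are hypothetical placeholders for illustration, not part of any existing pipeline code; in practice they would wrap whatever backward model and teacher we already have.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: a translator maps a batch of sentences to translations.
Translator = Callable[[List[str]], List[str]]


def build_distillation_corpus(
    target_mono: List[str],
    backward_model: Translator,  # assumed: target -> source
    teacher: Translator,         # assumed: source -> target
) -> List[Tuple[str, str]]:
    """Turn target-language monolingual text into (source, target) pairs for the student."""
    # Step 1: back-translate the (possibly noisy) target text into synthetic source text.
    synthetic_source = backward_model(target_mono)
    # Step 2: forward-translate the synthetic source with the teacher.
    # The original target sentences are dropped here and never used as labels.
    distilled_target = teacher(synthetic_source)
    return list(zip(synthetic_source, distilled_target))
```

The point of the sketch is that the low-quality NLLB sentences only seed the synthetic source side; the student's training targets come from the teacher.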