facebookresearch / seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

LM Rescoring for Seamless text decoder #366

Open Sameep-c opened 7 months ago

Sameep-c commented 7 months ago

Can we use an external language model such as KenLM to rescore the output of the SeamlessM4T text decoder for tasks such as ASR or speech-to-text translation?

avidale commented 7 months ago

Of course we can! The challenging part would be properly aligning the tokens of the language model with those of Seamless, since the two generally do not share a tokenization. I am not sure there is code that you can apply out of the box for this, but it is certainly a solvable task.

But I think that LM rescoring with Seamless doesn't make as much sense as with CTC-based ASR models, because the Seamless text decoder is already an autoregressive transformer language model on its own.
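
One way to sidestep the token alignment problem is to rescore at the sentence level: take the n-best hypotheses from beam search, detokenize them to plain text, and combine each hypothesis's Seamless score with the external LM's score of that text. Below is a minimal sketch of that idea, not code from this repository. The names `rescore_nbest` and `make_toy_unigram_lm`, the weights, and the example hypotheses are all made up for illustration, and the toy unigram model only stands in for a real LM. With the KenLM Python bindings you could pass something like `lambda s: model.score(s, bos=True, eos=True)` instead, keeping in mind that KenLM reports log10 probabilities, so the weight has to be tuned for that scale. How you extract the n-best list and its scores from Seamless is left out here.

```python
import math
from typing import Callable, Dict, List, Tuple


def rescore_nbest(
    hypotheses: List[Tuple[str, float]],
    lm_log_prob: Callable[[str], float],
    lm_weight: float = 0.3,
    length_bonus: float = 0.0,
) -> List[Tuple[str, float]]:
    """Re-rank (text, model_log_prob) pairs with an external LM score.

    combined = model_log_prob + lm_weight * lm_log_prob(text)
               + length_bonus * number_of_words
    Returns the hypotheses sorted from best to worst combined score.
    """
    rescored = []
    for text, model_score in hypotheses:
        n_words = len(text.split())
        combined = (
            model_score
            + lm_weight * lm_log_prob(text)
            + length_bonus * n_words
        )
        rescored.append((text, combined))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


def make_toy_unigram_lm(corpus: List[str]) -> Callable[[str], float]:
    """Hypothetical stand-in for a real LM: add-one smoothed unigram model."""
    counts: Dict[str, int] = {}
    total = 0
    for sentence in corpus:
        for word in sentence.lower().split():
            counts[word] = counts.get(word, 0) + 1
            total += 1
    vocab_size = len(counts) + 1  # one extra slot for unknown words

    def log_prob(text: str) -> float:
        # Natural-log probability of the sentence under the unigram model.
        return sum(
            math.log((counts.get(word, 0) + 1) / (total + vocab_size))
            for word in text.lower().split()
        )

    return log_prob


# Made-up n-best list: (detokenized hypothesis, log-probability from the decoder).
toy_lm = make_toy_unigram_lm(["the cat sat on the mat", "the dog sat on the rug"])
nbest = [
    ("the cat sat on the mad", -1.0),
    ("the cat sat on the mat", -1.2),
]
best_text, best_score = rescore_nbest(nbest, toy_lm, lm_weight=0.5)[0]
print(best_text)
```

The LM weight and length bonus would normally be tuned on a held-out set. Because the Seamless decoder already acts as a language model, the gain from this kind of rescoring may be small unless the external LM covers a domain that Seamless saw little of.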