Infinitay closed this issue 1 year ago
Unfortunately it's currently Linux-only, so we have to see if and when they fix it for Windows.
Still interesting, and maybe I'll find a solution myself. They say fairseq2n "can be ignored", though it's a requirement for fairseq2 and only builds on Linux...
Thank you. This is now implemented with the new release (https://github.com/Sharrnah/whispering/releases/tag/v1.3.11.1)
Only for S2TT (speech-to-text translation) and T2TT (text-to-text translation) for now, using only the transformers library, with no quantization yet.
Facebook just released a new multimodal model for multiple languages. I would assume it's the successor to NLLB: one model to rule them all. At first glance, the SeamlessM4T Large model appears to match the size of NLLB Large alone. CTranslate2 support would also be great. For Whispering users who take advantage of all the available x-to-x features, this model would be good to support.
Website: https://ai.meta.com/resources/models-and-libraries/seamless-communication/
Code: https://github.com/facebookresearch/seamless_communication
Paper: https://ai.meta.com/research/publications/seamless-m4t/
Blog Post: https://ai.meta.com/blog/seamless-m4t/
Some Metrics
![image](https://github.com/Sharrnah/whispering/assets/6964154/99ba2dc2-af3b-4375-85cb-a39baa660753) ![image](https://github.com/Sharrnah/whispering/assets/6964154/18a8df3a-d848-42e8-b696-63bf42cfa9b4) ![image](https://github.com/Sharrnah/whispering/assets/6964154/26630236-ff4f-4c0f-b1b8-76a3582b2602)