SubtitleEdit / subtitleedit

the subtitle editor :)
http://www.nikse.dk/SubtitleEdit/Help
GNU General Public License v3.0

Translate using NLLB API #7457

Closed · sharadagg closed this issue 9 months ago

sharadagg commented 9 months ago

Hi

I wonder if an option to translate using the NLLB API could be added - that would make it really useful. I find NLLB translations quite accurate.

People could point it at either a self-hosted NLLB API server or something online like https://winstxnhdw-nllb-api.hf.space/api/v2/translate (from https://github.com/winstxnhdw/nllb-api).
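For what it's worth, an endpoint like the one above can be called with a plain HTTP GET. A hedged sketch - the parameter names (`text`/`source`/`target`) and the FLORES-200 language codes are my reading of the nllb-api README, so verify them against the current version:

```python
import json
import urllib.parse
import urllib.request


def build_translate_url(base_url, text, source, target):
    # Assumed query contract: the text plus FLORES-200 source/target codes.
    query = urllib.parse.urlencode({"text": text, "source": source, "target": target})
    return f"{base_url}?{query}"


def translate(text, source="eng_Latn", target="spa_Latn",
              base_url="https://winstxnhdw-nllb-api.hf.space/api/v2/translate"):
    # Performs the actual request; requires the server to be reachable.
    with urllib.request.urlopen(build_translate_url(base_url, text, source, target)) as resp:
        return json.loads(resp.read())
```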

Thank you for considering.

niksedk commented 9 months ago

Very interesting :)

niksedk commented 9 months ago

Just a quick test... how is this? https://github.com/SubtitleEdit/subtitleedit/releases/download/4.0.1/SubtitleEditBeta.zip

[image]

darnn commented 9 months ago

Seems to work here, albeit slowly. Now that I've downloaded this version, though, is there really no way to go back to this layout? [image: SE with no waveform] When I try to hide the waveform, it just takes me to the layout selection, and that's not one of the options there.

niksedk commented 9 months ago

The public API https://winstxnhdw-nllb-api.hf.space/api/v2/translate is gone... so you will have to run the API locally. Read more on https://github.com/winstxnhdw/nllb-api And also here: https://www.nikse.dk/subtitleedit/help#translation

Beta updated: https://github.com/SubtitleEdit/subtitleedit/releases/download/4.0.1/SubtitleEditBeta.zip

@darnn: I'll try to add the old layout...

sharadagg commented 9 months ago

> Just a quick test... how is this? https://github.com/SubtitleEdit/subtitleedit/releases/download/4.0.1/SubtitleEditBeta.zip
>
> [image]

Thank you so much! Works quite well.

https://github.com/SubtitleEdit/subtitleedit/assets/81471321/d0d5a680-b6c2-433b-abb9-babf22c3068b

niksedk commented 9 months ago

@sharadagg: thx for testing - LibreTranslate is now also included: https://github.com/SubtitleEdit/subtitleedit/releases/download/4.0.1/SubtitleEditBeta.zip

It's not 100% done, but it's close.

I don't think Python / Docker is for everyone, but it's not too hard.

darnn commented 9 months ago

@sharadagg How did you get nllb-serve working on Windows? I could only run it through WSL, since the version of jaxlib that can be installed on Windows isn't recent enough: `RuntimeError: jaxlib version 0.4.16 is newer than and incompatible with jax version 0.4.6. Please update your jax and/or jaxlib packages.`

For the record, it works for me with WSL, but the default NLLB model that it downloads is worse than the one used by https://winstxnhdw-nllb-api.hf.space/api/v2/translate, and I couldn't be bothered to try to download the larger one.

It would be interesting to see if this could be made to work: https://forum.opennmt.net/t/nllb-200-with-ctranslate2/5090

There is an actual Windows CTranslate2 client for translation: https://github.com/ymoslem/DesktopTranslator/releases/tag/v0.2.1

But I could never actually get it to translate anything with the previous models I've tried, and right now I'm stuck trying to download NLLB 3.3B, which is only 3 GB - much smaller than the API version (which is like 15 GB) - but is taking hours to download.

sharadagg commented 9 months ago

> @sharadagg How did you get nllb-serve working on Windows? I could only run it through WSL, since the version of jaxlib that can be installed on Windows isn't recent enough: `RuntimeError: jaxlib version 0.4.16 is newer than and incompatible with jax version 0.4.6. Please update your jax and/or jaxlib packages.`
>
> For the record, it works for me with WSL, but the default NLLB model that it downloads is worse than the one used by https://winstxnhdw-nllb-api.hf.space/api/v2/translate, and I couldn't be bothered to try to download the larger one.
>
> It would be interesting to see if this could be made to work: https://forum.opennmt.net/t/nllb-200-with-ctranslate2/5090
>
> There is an actual Windows CTranslate2 client for translation: https://github.com/ymoslem/DesktopTranslator/releases/tag/v0.2.1
>
> But I could never actually get it to translate anything with the previous models I've tried, and right now I'm stuck trying to download NLLB 3.3B, which is only 3 GB - much smaller than the API version (which is like 15 GB) - but is taking hours to download.

I just followed the instructions at https://github.com/thammegowda/nllb-serve. It's also linked in the beta UI done by @niksedk.
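The nllb-serve setup is essentially `pip install nllb-serve` followed by running `nllb-serve` (it listens on port 6060 by default). Once it's up, calling it looks roughly like the sketch below - the `/translate` path and the `source`/`src_lang`/`tgt_lang` field names are my reading of the nllb-serve README, so check them against the version you install:

```python
import json
import urllib.parse
import urllib.request


def build_serve_request(base_url, text, src_lang, tgt_lang):
    # Assumed form-encoded POST contract for a local nllb-serve instance.
    data = urllib.parse.urlencode(
        {"source": text, "src_lang": src_lang, "tgt_lang": tgt_lang}
    ).encode()
    return urllib.request.Request(f"{base_url}/translate", data=data)


def translate_local(text, src_lang="eng_Latn", tgt_lang="hin_Deva",
                    base_url="http://localhost:6060"):
    # Sends the request; requires nllb-serve to be running locally.
    req = build_serve_request(base_url, text, src_lang, tgt_lang)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```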

sharadagg commented 9 months ago

> @sharadagg: thx for testing - LibreTranslate is now also included: https://github.com/SubtitleEdit/subtitleedit/releases/download/4.0.1/SubtitleEditBeta.zip
>
> It's not 100% done, but it's close.
>
> I don't think Python / Docker is for everyone, but it's not too hard.

Nice! Will check that too. I agree Python / Docker may be slightly more technical, but I'm really happy to see this functionality included. It will certainly help a lot of people - especially an NGO like ours, which needs to translate hundreds of hours of content into 16+ languages every month.

darnn commented 9 months ago

Yeah, I mean, I did too, but when I tried running it (`python -m nllb_serve -h`), I got the error message I mentioned, since the Windows releases of jaxlib are way behind the Linux ones, so I was just wondering how you got it to actually work.

Incidentally, https://winstxnhdw-nllb-api.hf.space/api/v2/translate still works for me in the previous beta.

niksedk commented 9 months ago

Both NLLB versions work for me... I had to uninstall the "Faster Whisper" Python version (but that's okay, as I use Purfview's Faster Whisper anyway).

> Incidentally, https://winstxnhdw-nllb-api.hf.space/api/v2/translate still works for me in the previous beta.

OK, so it's up again. I've added it back, but it's probably better to use a local version.

The latest beta now has an "Auto start web server" option to improve the user experience:

[image]

Link: https://github.com/SubtitleEdit/subtitleedit/releases/download/4.0.1/SubtitleEditBeta.zip

niksedk commented 9 months ago

Starting the web server is now a context-menu option in the "Auto-translate" window:

[image]

sharadagg commented 9 months ago

> Both NLLB versions work for me... I had to uninstall the "Faster Whisper" Python version (but that's okay, as I use Purfview's Faster Whisper anyway).
>
> > Incidentally, https://winstxnhdw-nllb-api.hf.space/api/v2/translate still works for me in the previous beta.
>
> OK, so it's up again. I've added it back, but it's probably better to use a local version.
>
> The latest beta now has an "Auto start web server" option to improve the user experience:
>
> [image]
>
> Link: https://github.com/SubtitleEdit/subtitleedit/releases/download/4.0.1/SubtitleEditBeta.zip

Super! Absolutely amazing, @niksedk

I have been experimenting with SeamlessM4T, which opens up transcription and translation in many more languages, plus speech generation. A public model is hosted here with HTTP API access: https://replicate.com/cjwbw/seamless_communication/api

Anyone can use it for speech-to-text, text-to-text translation, and text-to-speech. That would cover the whole cycle for us: we could potentially let a user create fully dubbed audio through Subtitle Edit :)

Users would just need to plug in their replicate.com access key. They can use a paid account if they run out of the predictions quota on the free account.
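To illustrate the access-key flow: Replicate's public REST API creates a prediction with a single authenticated POST. This is only a hedged sketch - the auth header scheme and this particular model's input fields are assumptions to check against Replicate's documentation and the model page linked above:

```python
import json
import urllib.request

API_URL = "https://api.replicate.com/v1/predictions"


def build_prediction_request(token, version, model_input):
    # Assumed contract: JSON body with the model version hash and inputs,
    # authenticated with the user's replicate.com API token. The header
    # scheme ("Token ...") is an assumption; Replicate's docs are canonical.
    body = json.dumps({"version": version, "input": model_input}).encode()
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )
```

The returned prediction would then be polled until it completes; the version hash comes from the model's page.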

niksedk commented 9 months ago

I will look into the SeamlessM4T stuff, but do create a new issue/new issues. How well does it work compared to Whisper/Google Translate?

I'm trying to start the Docker container at the moment, but I only have 4G internet... it seems to be at least a 5 GB download!

sharadagg commented 9 months ago

> I will look into the SeamlessM4T stuff, but do create a new issue/new issues. How well does it work compared to Whisper/Google Translate?
>
> I'm trying to start the Docker container at the moment, but I only have 4G internet... it seems to be at least a 5 GB download!

Yes. It seems quite a bit better than the Whisper large-v2 + NLLB combination in various tests. See an extensive comparison here: https://youtu.be/x8w5cNJSTWY?si=7eQmG01WM22gi8gT&t=123

I didn't use Docker for this - I just used replicate.com directly to run a few tests. Apparently, overall it's smaller than Whisper large-v2 + NLLB combined.

darnn commented 9 months ago

Again, if you want to avoid self-hosting a server for the API, there's this: https://forum.opennmt.net/t/nllb-200-with-ctranslate2/5090 - which ostensibly only requires that the model be loaded and called in Python. And the file you have to download for 3.3B is much smaller, only 3 GB.

I basically managed to load things in Python based on the instructions there, but not to the point that I could figure out how to change the input and output language, so when I give it the input "Hello world!", what I get back is "eng_Latn Hello world!!!!!!!!!!!!!!!!!!!! [...]". But perhaps someone who knows the first thing about Python, unlike me, will find it easier to figure out.

There's also the Windows GUI for CTranslate2 I mentioned earlier, which should theoretically be able to use the files from that link, but when I load them and try to translate I either get nothing or an error.

But again, if any of this can actually be made to work, it should theoretically be easier than self-hosting a server and more reliable than an external API. I just wish I knew enough about this stuff to be of any help.
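For anyone retrying the CTranslate2 route: with NLLB models the output language is selected by forcing the target-language token as a generation prefix, which is likely why the output above comes back as untranslated English. Below is a minimal, untested sketch assuming a CTranslate2-converted NLLB model directory and its SentencePiece tokenizer file (both paths are placeholders), with `ctranslate2` and `sentencepiece` installed:

```python
# Sketch only: "nllb-200-3.3B-ct2" and "sentencepiece.model" are
# placeholder names for a converted NLLB model and its tokenizer.


def nllb_source_tokens(pieces, src_lang):
    # NLLB expects the FLORES-200 source-language code prepended to the
    # source tokens, with an end-of-sentence marker appended.
    return [src_lang] + list(pieces) + ["</s>"]


def translate(text, src_lang, tgt_lang, model_dir="nllb-200-3.3B-ct2"):
    import ctranslate2
    import sentencepiece as spm

    sp = spm.SentencePieceProcessor(model_file=f"{model_dir}/sentencepiece.model")
    translator = ctranslate2.Translator(model_dir, device="cpu")

    tokens = nllb_source_tokens(sp.encode(text, out_type=str), src_lang)
    # target_prefix forces the first generated token to be the target
    # language code (e.g. "heb_Hebr") - without it the model tends to
    # stay in the input language, as seen above.
    results = translator.translate_batch([tokens], target_prefix=[[tgt_lang]])
    hypothesis = results[0].hypotheses[0]
    # Drop the leading language token before detokenising.
    if hypothesis and hypothesis[0] == tgt_lang:
        hypothesis = hypothesis[1:]
    return sp.decode(hypothesis)
```

Usage would look like `translate("Hello world!", "eng_Latn", "heb_Hebr")`, using FLORES-200 codes from the NLLB documentation.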

As for how well this works compared to Whisper/Google Translate: from playing around with both NLLB and SeamlessM4T, in both English-Hebrew and Hebrew-English it's somewhat worse than Google Translate, but still worth having, because there are specific phrases it sometimes translates better - or at least differently enough to be helpful. There's essentially no point in comparing with Whisper, since so many errors are introduced when it fails to recognize something accurately. And, of course, Whisper can only translate into English.

winstxnhdw commented 4 months ago

Hey, nllb-api has recently undergone large changes. For the public API, there's now a caching layer that can improve performance on repeated requests. But due to a sudden influx of usage from China, rate limiting has also been implemented.

I am still tuning the parameters, so if anyone here is still using my API, do let me know if you are frequently hitting rate limits. Ideally, you should be batching your requests to avoid being rate-limited too early.
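"Batching" here means packing many subtitle lines into a single request instead of one call per line. A hedged sketch of the client-side pattern (the batch size is arbitrary, and joining with newlines assumes the API preserves line breaks in its output - verify both against the actual service):

```python
def chunk_lines(lines, size=25):
    # Split subtitle lines into batches of at most `size` lines each.
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def translate_batched(lines, translate_fn, size=25):
    # One API call per batch instead of one per line. translate_fn is
    # whatever function sends a single request and returns translated
    # text with the line breaks preserved.
    out = []
    for batch in chunk_lines(lines, size):
        out.extend(translate_fn("\n".join(batch)).split("\n"))
    return out
```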

However, if you are planning to self-host, especially on CPU, nllb-api is arguably the most CPU-optimised implementation right now.