abdeladim-s / subsai

🎞️ Subtitles generation tool (Web-UI + CLI + Python package) powered by OpenAI's Whisper and its variants 🎞️
https://abdeladim-s.github.io/subsai/
GNU General Public License v3.0
1.16k stars, 96 forks

Auto-translating? #48

Open mimbimbo opened 11 months ago

mimbimbo commented 11 months ago

First, thanks for this great tool! I've got it working via the command line, but for some reason some videos are being auto-translated from French (the original language) to English. This is what I'm typing into the command line:

C:\Users>subsai ./assets/video.mp4 --model openai/whisper --format srt

Any idea why some output srt files are auto-translated and some are in the original language? I'm trying to get files in the original language, which is French in this instance.

Thanks so much!!!

abdeladim-s commented 11 months ago

You are welcome, @mimbimbo. Glad you found it useful.

Have you tried setting the language explicitly instead of leaving it on auto? Depending on the model size, it may detect some words as English instead of French.

Try this:

subsai ./assets/test1.mp4 --model openai/whisper --model-configs '{"language": "fr"}' --format srt

mimbimbo commented 11 months ago

@abdeladim-s Thanks! I think you're right, as there are a few English words. I tried your suggestion but am getting: subsai: error: unrecognized arguments: 'fr'}' . Any idea? I also tried "french" and got the same error.

abdeladim-s commented 11 months ago

@mimbimbo, I think you didn't use the quotes correctly. The syntax should be JSON inside single quotes. Just copy and paste the command to be on the safe side.
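
The quoting rule described here can be checked without running subsai at all. The following is a minimal sketch, assuming a POSIX-style shell (it does not model Windows CMD): Python's `shlex.split` applies POSIX quoting rules, so it shows which token the `--model-configs` option would receive. The variable names are illustrative only.

```python
import json
import shlex

# The command suggested above, exactly as typed at a POSIX shell prompt.
command = (
    "subsai ./assets/test1.mp4 --model openai/whisper "
    """--model-configs '{"language": "fr"}' --format srt"""
)

# shlex.split follows POSIX quoting rules: the single quotes group the JSON
# into one token and leave the inner double quotes untouched.
tokens = shlex.split(command)
config_token = tokens[tokens.index("--model-configs") + 1]

print(config_token)              # {"language": "fr"}
print(json.loads(config_token))  # {'language': 'fr'}
```

Because the whole JSON object arrives as a single token, it parses cleanly. A shell that does not treat single quotes as quoting characters would split the same text differently.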

mimbimbo commented 11 months ago

Sorry to be dense, @abdeladim-s, but I get the same error (I'm a JSON newb):

Here's the full output:

C:\Users\Mike>subsai ./assets/video.mp4 --model openai/whisper --model-configs '{"language": "fr"}' --format srt
[10:05:04] WARNING  The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.  torch_audio_backend.py:19
           WARNING  torchvision is not available - cannot save figures  train_logger.py:227
           WARNING  The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.  torch_audio_backend.py:19

[Subs AI ASCII-art banner]

Subs AI: Subtitles generation tool powered by OpenAI's Whisper and its variants. Version: 1.1.1

usage: subsai [-h] [--version] [-m MODEL] [-mc MODEL_CONFIGS] [-f FORMAT] [-df DESTINATION_FOLDER]
              [-tm TRANSLATION_MODEL] [-tsl TRANSLATION_SOURCE_LANG] [-ttl TRANSLATION_TARGET_LANG]
              [-tc TRANSLATION_CONFIGS]
              media_file [media_file ...]
subsai: error: unrecognized arguments: fr}'
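
The error above can be reproduced with a small stand-in parser. This is a sketch under two assumptions, neither confirmed in the thread: that the CLI is built on `argparse` (the error wording matches argparse's), and that Windows CMD passes single quotes through literally while double quotes only group text, so the JSON is cut at the space. The parser below is hypothetical and built only from the usage line; the token list is hand-written to match the reported error.

```python
import argparse

# Hypothetical stand-in for the subsai CLI, reconstructed from the usage line.
parser = argparse.ArgumentParser(prog="subsai")
parser.add_argument("media_file", nargs="+")
parser.add_argument("-m", "--model")
parser.add_argument("-mc", "--model-configs")
parser.add_argument("-f", "--format")

# Assumed tokens as delivered under CMD: single quotes are kept as ordinary
# characters, double quotes are consumed, and the space splits the JSON.
argv = [
    "./assets/video.mp4",
    "--model", "openai/whisper",
    "--model-configs", "'{language:", "fr}'",
    "--format", "srt",
]

# parse_known_args returns the leftovers instead of exiting with an error.
known, extras = parser.parse_known_args(argv)
print(known.model_configs)  # '{language:
print(extras)               # ["fr}'"]
```

The leftover token matches the `unrecognized arguments: fr}'` message in the output above, which suggests the JSON never reached the option as a single argument.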

abdeladim-s commented 11 months ago

Oh wait, you are using Windows CMD! I thought it worked like a Linux terminal, sorry about that. It has been a long time since I used it, and I don't even remember how it handles quotes.

Could you please try these suggestions:

--model-configs {"language":"fr"}
--model-configs '{"""language""":"""fr"""}' 
--model-configs '{^"language^":^"fr^"}' 
--model-configs '{\"language\":\"fr\"}'

mimbimbo commented 11 months ago

@abdeladim-s Amazing! I didn't realize I WASN'T using PowerShell, which is why the JSON wasn't working. I downloaded the latest Windows PowerShell and it works perfectly.

Thanks again -- this is a truly AWESOME tool for language learning as now I can go over the subs in French and then practice listening to just the audio.

Cheers!

abdeladim-s commented 11 months ago

You are welcome, @mimbimbo. Great to hear that it finally works.

'Bon courage' with your learning journey :)

eartahhj commented 11 months ago

> @abdeladim-s Thanks! I think you're right as there are a few english words. I tried your suggestion but am getting: subsai: error: unrecognized arguments: 'fr'}' . Any idea? I also tried "french" and got the same error.

In case others are in the same situation, I had a similar issue with: unrecognized arguments: small}'

The solution for me, on Windows 11 with CMD, seems to be: subsai \path\to\file.mp4 --model openai/whisper --model-configs "{\"model_type\": \"small\"}" --format srt

Credits to this StackOverflow topic
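
The escaped form in the CMD solution above does not have to be written by hand. As a minimal sketch, Python's standard library has an undocumented helper, `subprocess.list2cmdline`, that joins an argument list using the Microsoft C runtime quoting rules. The path is the placeholder from the comment above, and `model_type` is the config key used there.

```python
import json
import subprocess

# Build the JSON with json.dumps so the inner quoting is always valid.
config = json.dumps({"model_type": "small"})

args = [
    "subsai", r"\path\to\file.mp4",
    "--model", "openai/whisper",
    "--model-configs", config,
    "--format", "srt",
]

# list2cmdline wraps arguments containing spaces in double quotes and
# backslash-escapes any embedded double quotes.
cmdline = subprocess.list2cmdline(args)
print(cmdline)
# subsai \path\to\file.mp4 --model openai/whisper --model-configs "{\"model_type\": \"small\"}" --format srt
```

Passing the same list directly to `subprocess.run(args)` would avoid manual shell quoting altogether, since no shell parses the string in that case.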

P.S. Thanks for the work! This is the first time I'm using it, and I will try to play around with it!

abdeladim-s commented 11 months ago

Thanks, @eartahhj, for posting your solution. It will certainly help anyone who runs into the same issue.