tomchang25 / whisper-auto-transcribe

Auto transcribe tool based on whisper
MIT License
220 stars 15 forks

v3.1 batch conversion #5

Closed Royhowtohack closed 1 year ago

Royhowtohack commented 1 year ago

Is it possible to convert multiple audio files at the same time?

amerodeh commented 1 year ago

Maybe this will fit your use case. I'm on Windows and have folders that each contain multiple video files in Japanese, and I want to generate subtitles for each of them. I therefore came up with this small command-line script, which runs the translations in sequence, one after the other:

FOR %i IN ("FILES_PATH\*.mp4") DO python PROGRAM_PATH\cli.py %i --output %~dpni.srt -lang ja --task translate --model large --device cuda

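Replace FILES_PATH with the path to your video/audio files (e.g. C:\j\ABC-123). Replace PROGRAM_PATH with the path to the repo (e.g. C:\tools\whisper-auto-transcribe). Replace the extension of the wildcard (*.mp4) as you see fit. The other parts of this command are: FOR %i IN (...) DO, which runs the command once for each matching file, with %i holding that file's path; and %~dpni, which expands to the drive, path, and name of that file without its extension, so the .srt file is written next to the video.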

The rest of the parameters come from the README.md and cli.py files; you can find their explanations there.

Royhowtohack commented 1 year ago

Thank you very much!! Since I'm using macOS, I converted your code into AppleScript:

set folderPath to "/path/to/your/video/files/"
set scriptPath to "/path/to/your/python/script/"

-- Finder expects a file reference, not a POSIX path string
set folderAlias to POSIX file folderPath as alias

tell application "Finder"
    set fileList to every file of folder folderAlias whose name extension is "mp4"
end tell

repeat with videoFile in fileList
    set videoPath to POSIX path of (videoFile as alias)
    -- drop the trailing "mp4" but keep the dot, then append "srt"
    set srtPath to (text 1 thru -4 of videoPath) & "srt"

    -- CUDA is generally unavailable on macOS; change --device if this fails
    do shell script "/usr/bin/python " & quoted form of (scriptPath & "cli.py") & " " & ¬
        quoted form of videoPath & " --output " & quoted form of srtPath & " -lang ja --task translate --model large --device cuda"
end repeat

amerodeh commented 1 year ago

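Nice. Since my comment, I've made two small improvements to the script. The complete command is now:

FOR %i IN ("FILES_PATH\*.mp4") DO IF NOT EXIST "%~dpni.srt" python PROGRAM_PATH\cli.py "%i" --output "%~dpni.srt" -lang ja --task translate --model large --device cuda

The two changes are: IF NOT EXIST "%~dpni.srt" skips any file that already has a subtitle file next to it, so the command can be rerun without redoing finished work; and the quotes around "%i" and "%~dpni.srt" keep paths that contain spaces intact.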
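
For anyone not on Windows, the same loop can be sketched in Python. This is only a rough sketch, not part of the repo: the function names build_command, pending_videos, and run_batch are made up for illustration, and the flags are copied as-is from the command in this thread, so check them against README.md and cli.py before relying on them.

```python
import subprocess
import sys
from pathlib import Path


def build_command(video: Path, cli_path: Path) -> list[str]:
    """Build the argument list for one file (flags copied from the thread, unverified)."""
    srt = video.with_suffix(".srt")  # same folder and name, .srt extension
    return [
        sys.executable, str(cli_path), str(video),
        "--output", str(srt),
        "-lang", "ja", "--task", "translate",
        "--model", "large", "--device", "cuda",
    ]


def pending_videos(folder: Path, pattern: str = "*.mp4") -> list[Path]:
    """Return matching files that do not yet have an .srt file next to them."""
    return sorted(
        v for v in folder.glob(pattern) if not v.with_suffix(".srt").exists()
    )


def run_batch(folder: Path, cli_path: Path) -> None:
    """Run cli.py on each pending file, one after the other."""
    for video in pending_videos(folder):
        # Passing a list (no shell) keeps paths with spaces intact.
        subprocess.run(build_command(video, cli_path), check=True)
```

Calling run_batch(Path("FILES_PATH"), Path("PROGRAM_PATH") / "cli.py") would then process the folder in sequence and stop at the first failure because of check=True. Files that already have subtitles are skipped, so it can be rerun safely.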

tomchang25 commented 1 year ago

The CLI supports batch processing now.