Closed martjay closed 1 year ago
@martjay Currently, there are two ways to implement batching.
- Use the command-line application main.exe from the cli.zip archive. It can accept many input files, and will transcribe them one by one.
- Write a custom application that consumes the DLL. I think the C# API is easier to use than the C++ COM API; use the WhisperNet NuGet package. Note that the package requires a modern version of .NET, 6.0 or newer. It’s not compatible with the legacy .NET Framework 4.
About the desktop, look for third-party software which can create a fake virtual microphone to capture audio output. However, the latency is not great in the current version; it’s several seconds.
main.exe closes immediately after I launch it. I can't program, sigh~
@martjay main.exe is a console application. Press Win+R, type cmd, press Enter.
Use the cd command to navigate to the directory where you have main.exe.
First run main.exe -h; it will print the list of supported command-line parameters with short descriptions. Then run main.exe once more, this time specifying the model, one or more input audio files, and some optional parameters. Example:
main.exe -m D:\Data\Whisper\ggml-medium.bin -otxt -nc -nt C:\Z\Fun\OpenAI\Whisper\SampleClips\jfk.wav
Attaching a screenshot.
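For anyone who finds typing a long list of files by hand cumbersome, here is a minimal Python sketch that builds a single main.exe command for every audio file in a folder. It assumes, based on the "one or more input audio files" wording above, that input files are passed as separate arguments (separated by spaces on the command line). The helper names `build_command` and `run_batch` are hypothetical; only the `-m` and `-otxt` flags come from the example in this thread.

```python
from pathlib import Path
import subprocess


def build_command(exe, model, inputs, flags=("-otxt",)):
    """Build one main.exe invocation covering every input file."""
    # Assumption: input files are plain positional arguments, one per file,
    # as in the single-file example shown in this thread.
    return [str(exe), "-m", str(model), *flags, *[str(p) for p in inputs]]


def run_batch(exe, model, folder, pattern="*.wav"):
    """Run main.exe once over every file in `folder` matching `pattern`."""
    inputs = sorted(Path(folder).glob(pattern))
    if not inputs:
        return None  # nothing to transcribe
    # Passing a list (no shell) avoids quoting problems with spaces in paths.
    return subprocess.run(build_command(exe, model, inputs), check=True)
```

Calling `run_batch(r".\main.exe", r"D:\Data\Whisper\ggml-medium.bin", r"D:\Clips")` would then launch one run over every .wav file in that folder; the folder path here is only an illustration.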
'main.exe' is not an internal or external command, nor is it a runnable program or batch file.
@martjay You should download that program from the Releases page of this repository, unpack cli.zip somewhere, then use the cd command to navigate to the folder which contains the unpacked main.exe.
I know. Maybe there is something wrong with my computer, or maybe it's a problem with the system environment variables.
@martjay Are you running a 64-bit version of Windows 10 or 11? Press Win+Pause/Break; you should see a window with the text “System Type: 64-bit operating system”.
If that’s correct, this means you are in the wrong directory in the cmd.exe shell. You can use the dir command to list files in the current directory; you should see the main.exe and Whisper.dll files in that directory.
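PowerShell:
PS D:\Downloads\SOFTWARE\字幕软件\WhisperDesktop> main.exe - h
main.exe : The term 'main.exe' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
At line:1 char:1
+ CategoryInfo : ObjectNotFound: (main.exe:String) [], CommandNotFoundException
+ FullyQualifiedErrorId : CommandNotFoundException
Suggestion [3,General]: The command main.exe was not found, but does exist in the current location. Windows PowerShell does not load commands from the current location by default. If you trust this command, instead type: ".\main.exe". See "get-help about_Command_Precedence" for more details.
PS D:\Downloads\SOFTWARE\字幕软件\WhisperDesktop>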
Git Bash:
Enya@DESKTOP-NIM9K91 MINGW64 /d/Downloads/SOFTWARE/字幕软件/WhisperDesktop
$ dir
Binary Whisper.dll main.exe output Include WhisperDesktop.exe model Library.zip WhisperDesktop.rar opt_01_vorproduktion.avi Linker cli.zip opt_01_vorproduktion.wav
Enya@DESKTOP-NIM9K91 MINGW64 /d/Downloads/SOFTWARE/字幕软件/WhisperDesktop
$ main.exe -h
bash: main.exe: command not found
Enya@DESKTOP-NIM9K91 MINGW64 /d/Downloads/SOFTWARE/字幕软件/WhisperDesktop
$
I am using Windows 11 PRO X64
@martjay If you insist on using PowerShell instead of cmd.exe, use ./main.exe instead of main.exe. See the screenshot.
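The listing worked, but:
main.exe : The term 'main.exe' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
At line:1 char:1
+ CategoryInfo : ObjectNotFound: (main.exe:String) [], CommandNotFoundException
+ FullyQualifiedErrorId : CommandNotFoundException
Suggestion [3,General]: The command main.exe was not found, but does exist in the current location. Windows PowerShell does not load commands from the current location by default. If you trust this command, instead type: ".\main.exe". See "get-help about_Command_Precedence" for more details.
PS D:\Downloads\SOFTWARE\字幕软件\WhisperDesktop>
Maybe it's a problem with the system environment variables.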
Now it works. One more question: what symbol is used to separate multiple files?
Another headache: sometimes parts of the audio are not recognized, and then the output contains a long stretch of repeated text.
I use the command line for batch tasks and find it too cumbersome. I still hope you can implement this in the GUI; it would be very convenient for everyone. Thank you, if you can.
@martjay That GUI is also going to be very cumbersome, and completely different from the current WhisperDesktop. Ideally, it needs a wizard-like GUI where the user populates a long list of input files by importing files or folders, then reviews the list and assigns unique output paths.
You can try the new PowerShell wrapper I’ve created in version 1.10. Might be easier to use for your use case, compared to the main.exe CLI.
ok i'm going to mention this again, might seem like double posting but see https://github.com/tigros/Whisperer. Oh and thank you Kosta, great coding!
What encoding do you use for your subtitles? I'm getting a mess when I choose to output Chinese subtitles. Also, why doesn't it automatically delete those .wav files after it finishes extracting the subtitles?
Selecting English subtitles gives the same mess. But when I put the multimedia file into a path without Chinese characters, the subtitles come out properly, so I hope you can fix this. Another problem is that it doesn't delete the .wav files automatically, and it also generates .wav files in the export directory whether or not the file I import is already a .wav.
ok i'm on it, i have an idea.
Hope you can add this feature too! Thank you man~
ok fixed unicode problem and delete wavs, but place file in same folder, will consider it.
i figured an easy way to do it, so same folder option is there now. thanks for suggestions!
Good job! Man, you are my hero. This software is very significant and will help many people to complete their studies. There are no words to describe how grateful I am to you all, you have done a very worthwhile job. Thank you again Const-me and Tigros!
I also have a bold idea: to implement dual language subtitles, with machine translated subtitles on top and subtitles in the source language on the bottom. This would eliminate misunderstandings caused by inaccurate machine translations. Possible process: generate machine translated subtitles and source language subtitles, possibly taking twice as long to recognise, and then synthesise a bilingual subtitle through some program.
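The last step of that process, combining the two subtitle files, could look roughly like the Python sketch below. It is only an illustration under a strong assumption: both files are in SRT format and the translated and source-language runs produced cues with identical timing lines, which separate recognition passes may not guarantee. The function names `parse_srt` and `merge_bilingual` are hypothetical and not part of Whisper, WhisperDesktop, or Whisperer.

```python
def parse_srt(text):
    """Parse SRT text into a list of (timing_line, cue_text) tuples."""
    cues = []
    normalized = text.replace("\r\n", "\n").strip()
    for block in normalized.split("\n\n"):
        lines = [ln for ln in block.split("\n") if ln.strip()]
        # A valid cue has an index line, a timing line, then the text.
        if len(lines) < 2 or "-->" not in lines[1]:
            continue
        cues.append((lines[1].strip(), "\n".join(lines[2:])))
    return cues


def merge_bilingual(translated_srt, source_srt):
    """Put translated text on top and source-language text below it."""
    top = parse_srt(translated_srt)
    # Assumption: cues are matched by an identical timing line.
    bottom = dict(parse_srt(source_srt))
    out = []
    for number, (timing, text) in enumerate(top, start=1):
        original = bottom.get(timing)
        # Cues with no matching source cue keep only the translation.
        body = text if original is None else f"{text}\n{original}"
        out.append(f"{number}\n{timing}\n{body}")
    return "\n\n".join(out) + "\n"
```

If the two runs segment the audio differently, matching by exact timing will miss cues, and some overlap-based matching would be needed instead.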
There are two bugs that need to be fixed.
it remembers paths now, v2.2. thanks.
Thank you man
I would like to request the ability to run batch file tasks. Also, if real-time speech recognition with low latency is implemented, could we get desktop captioning? That way we could watch videos with real-time translation.