dgm3333 closed this 10 months ago
I've updated some libraries on the build PC and upgraded to Windows 11, and it is now working well, so this was potentially an external issue.
Works in realtime using: cd C:\temp && C:\bin\stream.exe -m C:\bin\models\ggml-small.en.bin -c 0 -sa
I'm trying to get live transcription with stream.exe (and/or command.exe) to be as accurate as main.exe when processing the same audio input. Ideally I would also like the input to be transcribed and saved as a .wav file at the same time for future reprocessing, although at this point I'm not attempting both since even basic streaming transcription is not working.
If I record audio from the PC mic with a C++ SDL2 program, save it as a 16 kHz WAV file (AUDIO_FORMAT = AUDIO_S16LSB) and then load it into whisper main.exe, main.exe transcribes it slightly faster than real time with reasonable accuracy (implying processing time isn't the limiting factor). When the same audio is played through the same microphone (or I speak normally), the transcription quality is significantly worse with stream.exe or command.exe, and even on the high-spec machine there are chunks of audio which are totally ignored.
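For reference, the file-writing half of that recording step can be sketched as below. This is only a minimal illustration, assuming mono 16-bit little-endian PCM at 16 kHz (the layout AUDIO_S16LSB produces); the helper names put_le, build_wav and write_wav are made up for this example and are not taken from my recorder or from whisper.cpp.

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Append `value` to `out` as `bytes` little-endian bytes.
static void put_le(std::vector<char> &out, std::uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i) {
        out.push_back(static_cast<char>((value >> (8 * i)) & 0xFF));
    }
}

// Build a complete WAV image (44-byte canonical header + samples) for
// 16-bit PCM. Defaults assume 16 kHz mono.
std::vector<char> build_wav(const std::vector<std::int16_t> &samples,
                            std::uint32_t sample_rate = 16000,
                            std::uint16_t channels = 1) {
    const std::uint32_t data_bytes =
        static_cast<std::uint32_t>(samples.size() * sizeof(std::int16_t));
    const std::uint16_t block_align = static_cast<std::uint16_t>(channels * 2);

    std::vector<char> out;
    out.reserve(44 + data_bytes);
    out.insert(out.end(), {'R', 'I', 'F', 'F'});
    put_le(out, 36 + data_bytes, 4);            // RIFF chunk size
    out.insert(out.end(), {'W', 'A', 'V', 'E', 'f', 'm', 't', ' '});
    put_le(out, 16, 4);                         // fmt chunk size
    put_le(out, 1, 2);                          // format tag 1 = PCM
    put_le(out, channels, 2);
    put_le(out, sample_rate, 4);
    put_le(out, sample_rate * block_align, 4);  // bytes per second
    put_le(out, block_align, 2);                // bytes per sample frame
    put_le(out, 16, 2);                         // bits per sample
    out.insert(out.end(), {'d', 'a', 't', 'a'});
    put_le(out, data_bytes, 4);
    for (std::int16_t s : samples) {
        put_le(out, static_cast<std::uint16_t>(s), 2);
    }
    return out;
}

// Write the samples to `path`; returns false if the file cannot be written.
bool write_wav(const std::string &path,
               const std::vector<std::int16_t> &samples) {
    std::ofstream f(path, std::ios::binary);
    if (!f) return false;
    const std::vector<char> image = build_wav(samples);
    f.write(image.data(), static_cast<std::streamsize>(image.size()));
    return f.good();
}
```

Calling write_wav("capture.wav", samples) on the buffer collected from the SDL capture callback would give a file in the format described above (the file name is just an example).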
I've tried this on multiple Windows 10 PCs, including top-end desktops (12 cores + 64 GB + GPU) and relatively basic i5s with only 8 GB and no GPU, with the same difference in quality. I tested both ad-hoc voice and a track played from a speaker to the microphone, so that both inputs are identical. I've also had the same issue with every whisper version I've tried over the past year.
I've tried setting -keep-context = true
I've tried changing the following common-sdl.cpp settings with no success:
- changing format: AUDIO_F32; -> AUDIO_S16LSB
- changing buffer size: capture_spec_requested.samples = 1024; -> 16384;
- boosting SDL thread priority: SDL_SetHintWithPriority(SDL_HINT_AUDIO_RESAMPLING_MODE, "medium", -> SDL_HINT_OVERRIDE); SDL_SetThreadPriority(SDL_THREAD_PRIORITY_HIGH);
- setting C++ thread priority: SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_ABOVE_NORMAL);
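To put numbers on the first two changes, here is a rough sketch (not code from common-sdl.cpp). It assumes that 16-bit samples have to be scaled to 32-bit floats in [-1, 1) before being handed to whisper, and that the SDL samples field counts sample frames at the 16 kHz capture rate. The helper names s16_to_f32 and buffer_ms are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Scale signed 16-bit PCM to floats in [-1.0, 1.0).
// Dividing by 32768 maps INT16_MIN to exactly -1.0.
std::vector<float> s16_to_f32(const std::vector<std::int16_t> &pcm16) {
    std::vector<float> out(pcm16.size());
    for (std::size_t i = 0; i < pcm16.size(); ++i) {
        out[i] = static_cast<float>(pcm16[i]) / 32768.0f;
    }
    return out;
}

// Duration in milliseconds covered by one capture buffer of `frames`
// sample frames at `sample_rate` Hz.
double buffer_ms(std::uint32_t frames, std::uint32_t sample_rate) {
    return 1000.0 * frames / sample_rate;
}
```

Under those assumptions, buffer_ms(1024, 16000) is 64 ms and buffer_ms(16384, 16000) is 1024 ms, so the larger setting means each capture callback delivers roughly a second of audio at a time rather than a few hundredths of a second.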