Closed jayeheffernan closed 8 months ago
@jayeheffernan thanks for the PR and comprehensive debug of the issue.
I've tweaked it slightly so that `whisper_rec_cmd` can be fully customized. If you hit cropping issues with sox (which can happen if the recording device has high latency), you can go back to ffmpeg with a manually chosen device.
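For anyone landing here later, a minimal sketch of what that configuration might look like. The option name `whisper_rec_cmd` and the value `'sox'` come from this PR. The table form, its exact arguments, and the device index `:1` are assumptions for illustration only; check the plugin's README for the accepted shapes, and pick the index from your own `ffmpeg -f avfoundation -list_devices true -i ""` output.

```lua
-- Sketch only: the option name is from this PR, everything else is illustrative.
require("gp").setup({
	-- Simplest form: name the recorder and let the plugin supply its arguments.
	whisper_rec_cmd = "sox",

	-- Assumed alternative: spell out the full ffmpeg command with a manually
	-- chosen macOS audio device (":1" here is a placeholder index, not a default).
	-- whisper_rec_cmd = { "ffmpeg", "-y", "-f", "avfoundation", "-i", ":1", "-t", "3600", "rec.wav" },
})
```

Only one of the two forms should be active at a time; the second is left commented out so the snippet stays valid as written.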
Hello, and thanks for the plugin! I had a little issue getting whisper to work, so I'm submitting a fix for your consideration.
This PR adds a way to manually specify in the config which command (`sox`, `ffmpeg`, or `arecord`) should be used for recording by commands like `GpWhisper`. E.g. use it in `.setup()` like `whisper_rec_cmd = 'sox'`.

I had an issue trying to use `GpWhisper`, where the output would always be just "you". I found the recordings, `rec.wav`, were always the correct length, but contained only silence. I think the problem is that the options passed to `ffmpeg` select audio input device `:0`, which doesn't work in my case. Modifying `gp/init.lua` to always choose `rec_cmd = "sox"` works fine for me. There's probably some way to look into the audio input devices more and improve the autodetection, but I'm not sure how to do that well, and I suspect this may not be a common issue anyway.

Debugging my issue with audio devices...
Here's some info from a terminal session of me figuring out what was going on, if it helps.

## Screenshot with notes

## Raw text output

```txt
/tmp/gp_whisper ❯ ffmpeg -devices -v quiet | grep -i avfoundation | wc -l    11:47:11
1
/tmp/gp_whisper ❯ ffmpeg -devices -v quiet | grep -i avfoundation    11:50:53
 D  avfoundation    AVFoundation input device
/tmp/gp_whisper ❯ ffmpeg -devices -v quiet    11:50:59
Devices:
 D. = Demuxing supported
 .E = Muxing supported
 --
  E audiotoolbox    AudioToolbox output device
 D  avfoundation    AVFoundation input device
 D  lavfi           Libavfilter virtual input device
  E sdl,sdl2        SDL2 output device
 D  x11grab         X11 screen capture, using XCB
/tmp/gp_whisper ❯ ffmpeg -f avfoundation -list_devices true -i ""    11:51:10
ffmpeg version 6.0 Copyright (c) 2000-2023 the FFmpeg developers
  built with Apple clang version 14.0.3 (clang-1403.0.22.14.1)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/6.0_1 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libaribb24 --enable-libbluray --enable-libdav1d --enable-libjxl --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libsvtav1 --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-videotoolbox --enable-audiotoolbox
  libavutil      58.  2.100 / 58.  2.100
  libavcodec     60.  3.100 / 60.  3.100
  libavformat    60.  3.100 / 60.  3.100
  libavdevice    60.  1.100 / 60.  1.100
  libavfilter     9.  3.100 /  9.  3.100
  libswscale      7.  1.100 /  7.  1.100
  libswresample   4. 10.100 /  4. 10.100
  libpostproc    57.  1.100 / 57.  1.100
[AVFoundation indev @ 0x7fe4d6f04a00] AVFoundation video devices:
[AVFoundation indev @ 0x7fe4d6f04a00] [0] FaceTime HD Camera (Built-in)
[AVFoundation indev @ 0x7fe4d6f04a00] [1] LG UltraFine Display Camera
[AVFoundation indev @ 0x7fe4d6f04a00] [2] Snap Camera
[AVFoundation indev @ 0x7fe4d6f04a00] [3] Capture screen 0
[AVFoundation indev @ 0x7fe4d6f04a00] [4] Capture screen 1
[AVFoundation indev @ 0x7fe4d6f04a00] [5] Capture screen 2
[AVFoundation indev @ 0x7fe4d6f04a00] AVFoundation audio devices:
[AVFoundation indev @ 0x7fe4d6f04a00] [0] ZoomAudioDevice
[AVFoundation indev @ 0x7fe4d6f04a00] [1] MacBook Pro Microphone
[AVFoundation indev @ 0x7fe4d6f04a00] [2] LG UltraFine Display Audio
```

Screenshot of new error message in action
If you pick an invalid value, you'll find out when you try to record:
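As a rough text illustration of that kind of check, here is a minimal sketch. It is not the code in this PR: the helper name `check_rec_cmd`, the `KNOWN_RECORDERS` table, and the message wording are all hypothetical, and it only covers the three recorder names listed in the description.

```lua
-- Hypothetical helper, NOT the plugin's actual code: validates a recorder name
-- against the three commands mentioned in this PR.
local KNOWN_RECORDERS = { sox = true, ffmpeg = true, arecord = true }

local function check_rec_cmd(value)
	-- Accept only a string naming one of the known recorders.
	if type(value) == "string" and KNOWN_RECORDERS[value] then
		return true, nil
	end
	-- Otherwise return a message the caller could surface when recording starts.
	local msg = string.format(
		"unsupported whisper_rec_cmd: %s (expected sox, ffmpeg or arecord)",
		tostring(value)
	)
	return false, msg
end

-- Example calls: a valid name, then a typo.
print(check_rec_cmd("sox"))
print(check_rec_cmd("soxx"))
```

Checking at record time rather than in `setup()` matches the behaviour described above, where a bad value only shows up once you try to record.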