Closed oatmealm closed 1 year ago
PipeWire (or the audio server in general) should not be relevant for file operations. Your first command is indeed what whisper.el tries to do at first, but whether or not it works depends on whether ffmpeg was built with the correct codec libraries, I think.
So, was the file you were trying to transcribe also an mp3, or was it some other format? Does that command work in the terminal when you replace in.mp3 with the path to the actual file you are interested in?
Hi @natrys! Yes, it was mp3 (192 kbps, 44100 Hz, from what I can see). No, it doesn't work with either file....
> No, it doesn't work with either file....
Do you mean that for whisper.el or ffmpeg in the terminal? If it's the latter, is there anything distinctive in the error message?
> Do you mean that for whisper.el
Yes, for whisper.el. ffmpeg converts the file successfully.
BTW, why is it saying "FFmpeg command failed to record audio" ... does it refer to the conversion?
Yeah that's a mistake. Should be fixed now, thanks.
Unfortunately I don't really have any ideas here as it seems to work for me for any valid file.
The only way we can make sense of this is if we can see the ffmpeg error log, which unfortunately isn't retained by default, to avoid buffer clutter.
If you are still inclined to investigate, you could specify a buffer to hold the stderr log of the failing ffmpeg process, and then inspect its content. For that you would need to add one line to the whisper--record-audio function:
diff --git a/whisper.el b/whisper.el
index 3895a7a..2d8d770 100644
--- a/whisper.el
+++ b/whisper.el
 (defun whisper--record-audio ()
   "Start audio recording process in the background."
   (with-current-buffer whisper--point-buffer
     (setq whisper--marker (point-marker)))
   (if whisper--ffmpeg-input-file
       (message "[*] Pre-processing media file")
     (message "[*] Recording audio"))
   (setq whisper--recording-process
         (make-process
          :name "whisper-recording"
          :command (whisper--record-command whisper--temp-file)
+         :stderr (get-buffer-create "*whisper-test*")
          :connection-type nil
          :buffer nil
          :sentinel (lambda (_process event)
                      (cond ((or (string-equal "finished\n" event)
                                 ;; this is would be sane
                                 (string-equal "terminated\n" event)
                                 ;; but this is reality
                                 (string-equal "exited abnormally with code 255\n" event))
                             (whisper--transcribe-audio))
                            ((string-equal "exited abnormally with code 1\n" event)
                             (if whisper--ffmpeg-input-file
                                 (error "FFmpeg failed to convert given file")
                               (error "FFmpeg failed to record audio"))))))))
and then re-evaluate the function, run the test, and see what the *whisper-test* buffer says.
haaaa... No such file or directory ... that's strange, no?
Process whisper-recording stderr finished ffmpeg version 6.0 Copyright (c)
2000-2023 the FFmpeg developers built with gcc 13 (GCC) configuration:
--prefix=/usr --bindir=/usr/bin --datadir=/usr/share/ffmpeg
--docdir=/usr/share/doc/ffmpeg --incdir=/usr/include/ffmpeg --libdir=/usr/lib64
--mandir=/usr/share/man --arch=x86_64 --optflags='-O2 -flto=auto
-ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall
-Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3
-Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1
-fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64
-mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection
-fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
--extra-ldflags='-Wl,-z,relro -Wl,--as-needed -Wl,-z,now
-specs=/usr/lib/rpm/redhat/redhat-hardened-ld
-specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1
-specs=/usr/lib/rpm/redhat/redhat-package-notes ' --disable-htmlpages
--enable-pic --disable-stripping --enable-shared --disable-static --enable-gpl
--enable-version3 --enable-libsmbclient --disable-openssl --enable-bzlib
--enable-frei0r --enable-chromaprint --enable-gcrypt --enable-gnutls
--enable-ladspa --enable-lcms2 --enable-libshaderc --enable-vulkan
--disable-cuda-sdk --enable-libaom --enable-libass --enable-libbluray
--enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2
--enable-libdav1d --enable-libdc1394 --enable-libdrm --enable-libfdk-aac
--enable-libflite --enable-libfontconfig --enable-libfreetype
--enable-libfribidi --enable-libgme --enable-libgsm --enable-libiec61883
--enable-libilbc --enable-libjack --enable-libjxl --enable-libmodplug
--enable-libmp3lame --enable-libmysofa --enable-libopenh264-dlopen
--enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libplacebo
--enable-libpulse --enable-librabbitmq --enable-librav1e --enable-librist
--enable-librsvg --enable-librubberband --enable-libsnappy --enable-libsvtav1
--enable-libsoxr --enable-libspeex --enable-libssh --enable-libsrt
--enable-libtesseract --enable-libtheora --enable-libtwolame --enable-libvidstab
--enable-libvmaf --enable-libvorbis --enable-libv4l2 --enable-libvpx
--enable-libwebp --enable-libxml2 --enable-libzimg --enable-libzmq
--enable-libzvbi --enable-lto --enable-libvpl --enable-lv2 --enable-vaapi
--enable-vdpau --enable-libopencore-amrnb --enable-libopencore-amrwb
--enable-libvo-amrwbenc --enable-libxvid --enable-openal --enable-opencl
--enable-opengl --enable-pthreads --enable-vapoursynth --enable-muxers
--enable-demuxers --enable-hwaccels --disable-encoders --disable-decoders
--disable-decoder='h264,hevc,vc1'
--enable-encoder=',a64multi,a64multi5,aac,libfdk_aac,ac3,adpcm_adx,adpcm_argo,adpcm_g722,adpcm_g726,adpcm_g726le,adpcm_ima_alp,adpcm_ima_amv,adpcm_ima_apm,adpcm_ima_qt,adpcm_ima_ssi,adpcm_ima_wav,adpcm_ima_ws,adpcm_ms,adpcm_swf,adpcm_yamaha,alac,alias_pix,amv,anull,apng,ass,asv1,asv2,av1_amf,av1_nvenc,av1_qsv,ayuv,bitpacked,bmp,cinepak,cljr,dca,dfpwm,dnxhd,dpx,dvbsub,dvdsub,dvvideo,exr,ffv1,ffvhuff,flac,flashsv,flashsv2,flv,g723_1,gif,h261,h263,h263_v4l2m2m,h263p,h264_amf,h264_nvenc,h264_qsv,h264_v4l2m2m,h264_vaapi,hap,hdr,hevc_amf,hevc_nvenc,hevc_qsv,hevc_v4l2m2m,hevc_vaapi,huffyuv,ilbc,jpegls,jpeg2000,libaom,libaom_av1,libcodec2,libgsm,libgsm_ms,libilbc,libjxl,libmp3lame,libopencore_amrnb,libopenh264,libopenjpeg,libopus,librav1e,libschroedinger,libspeex,libsvtav1,libtheora,libtwolame,libvo_amrwbenc,libvorbis,libvpx_vp8,libvpx_vp9,libwebp,libwebp_anim,libxvid,mjpeg,mjpeg_qsv,mjpeg_vaapi,mlp,mp2,mp2fixed,mpeg1video,mpeg2video,mpeg2_qsv,mpeg2_vaapi,mpeg4,mpeg4_v4l2m2m,msmpeg4v2,msmpeg4v3,msvideo1,nellymoser,opus,pam,pbm,pcm_alaw,pcm_f32be,pcm_f32le,pcm_f64be,pcm_f64le,pcm_mulaw,pcm_s16be,pcm_s16be_planar,pcm_s16le,pcm_s16le_planar,pcm_s24be,pcm_s24le,pcm_s24le_planar,pcm_s32be,pcm_s32le,pcm_s32le_planar,pcm_s8,pcm_s8_planar,pcm_u16be,pcm_u16le,pcm_u24be,pcm_u24le,pcm_u32be,pcm_u32le,pcm_u8,pcx,pgm,pgmyuv,phm,png,ppm,qoi,qtrle,r10k,r210,ra_144,rawvideo,roq,roq_dpcm,rpza,rv10,rv20,s302m,sbc,sgi,smc,snow,sonic,sonic_ls,speedhq,srt,ssa,subrip,sunrast,svq1,targa,text,tiff,truehd,tta,ttml,utvideo,v210,v308,v408,v410,vc1_qsv,vc1_v4l2m2m,vc2,vnull,vorbis,vp8_qsv,vp8_v4l2m2m,vp8_vaapi,vp9_qsv,vp9_vaapi,wavpack,wbmp,webvtt,wmav1,wmav2,wmv1,wmv2,wrapped_avframe,xbm,xface,xsub,xwd,y41p,yuv4,zlib,zmbv,'
--enable-decoder=',aac,aasc,libfdk_aac,ac3,acelp_kelvin,adpcm_4xm,adpcm_adx,adpcm_afc,adpcm_agm,adpcm_aica,adpcm_argo,adpcm_ct,adpcm_dtk,adpcm_ea,adpcm_ea_maxis_xa,adpcm_ea_r1,adpcm_ea_r2,adpcm_ea_r3,adpcm_ea_xas,adpcm_g722,adpcm_g726,adpcm_g726le,adpcm_ima_acorn,adpcm_ima_alp,adpcm_ima_amv,adpcm_ima_apc,adpcm_ima_apm,adpcm_ima_cunning,adpcm_ima_dat4,adpcm_ima_dk3,adpcm_ima_dk4,adpcm_ima_ea_eacs,adpcm_ima_ea_sead,adpcm_ima_iss,adpcm_ima_moflex,adpcm_ima_mtf,adpcm_ima_oki,adpcm_ima_qt,adpcm_ima_qt_at,adpcm_ima_rad,adpcm_ima_smjpeg,adpcm_ima_ssi,adpcm_ima_wav,adpcm_ima_ws,adpcm_ms,adpcm_mtaf,adpcm_psx,adpcm_sbpro_2,adpcm_sbpro_3,adpcm_sbpro_4,adpcm_swf,adpcm_thp,adpcm_thp_le,adpcm_vima,adpcm_xa,adpcm_xmd,adpcm_yamaha,adpcm_zork,alac,alias_pix,amrnb,amrwb,amv,anm,ansi,anull,apac,ape,apng,arbc,argo,ass,asv1,asv2,atrac1,atrac3,atrac3al,atrac3p,atrac3pal,aura,aura2,av1,av1_qsv,ayuv,bethsoftvid,bfi,bink,binkaudio_dct,binkaudio_rdft,bintext,bitpacked,bmp,bmv_audio,bmv_video,bonk,brender_pix,c93,cbd2_dpcm,ccaption,cdgraphics,cdtoons,cdxl,cinepak,clearvideo,cljr,cook,cpia,cscd,cyuv,dca,dds,derf_dpcm,dfa,dfpwm,dirac,dnxhd,dolby_e,dpx,dsd_lsbf,dsd_msbf,dsicinaudio,dsicinvideo,dss_sp,dvaudio,dvbsub,dvdsub,dvvideo,dxa,dxtory,eacmv,eamad,eatgq,eatgv,eatqi,eightbps,eightsvx_exp,eightsvx_fib,escape124,escape130,evrc,exr,ffv1,ffvhuff,ffwavesynth,fits,flac,flashsv,flashsv2,flic,flv,fmvc,fourxm,ftr,g723_1,g729,gdv,gem,gif,gremlin_dpcm,gsm,gsm_ms,gsm_ms_at,h261,h263,h263_v4l2m2m,h263i,h263p,hap,hca,hcom,hdr,hnm4_video,hq_hqa,hqx,huffyuv,hymt,iac,idcin,idf,iff_ilbm,ilbc,imc,indeo2,indeo3,indeo4,indeo5,interplay_acm,interplay_dpcm,interplay_video,ipu,jacosub,jpeg2000,jpegls,jv,kgv1,kmvc,lagarith,libaom,libaom_av1,libcodec2,libdav1d,libgsm,libgsm_ms,libilbc,libjxl,libopencore_amrnb,libopencore_amrwb,libopenh264,libopenjpeg,libopus,librsvg,libschroedinger,libspeex,libvorbis,libvpx_vp8,libvpx_vp9,libzvbi_teletext,loco,lscr,m101,mace3,mace6,mdec,media100,metasound,microdvd,mimic,misc4,mjpeg,m
jpeg_qsv,mjpegb,mlp,mmvideo,motionpixels,mp1,mp1float,mp2,mp2float,mp3,mp3adu,mp3adufloat,mp3float,mp3on4,mp3on4float,mpc7,mpc8,mpeg1video,mpeg1_v4l2m2m,mpeg2video,mpeg2_qsv,mpeg2_v4l2m2m,mpeg4,mpeg4_v4l2m2m,mpegvideo,mpl2,msa1,mscc,msmpeg4v1,msmpeg4v2,msmpeg4v3,msnsiren,msp2,msrle,mss1,mss2,msvideo1,mszh,mts2,mv30,mvc1,mvc2,mvdv,mvha,mwsc,mxpeg,nellymoser,nuv,on2avc,opus,paf_audio,paf_video,pam,pbm,pcm_alaw,pcm_bluray,pcm_dvd,pcm_f16le,pcm_f24le,pcm_f32be,pcm_f32le,pcm_f64be,pcm_f64le,pcm_lxf,pcm_mulaw,pcm_s16be,pcm_s16be_planar,pcm_s16le,pcm_s16le_planar,pcm_s24be,pcm_s24daud,pcm_s24le,pcm_s24le_planar,pcm_s32be,pcm_s32le,pcm_s32le_planar,pcm_s64be,pcm_s64le,pcm_s8,pcm_s8_planar,pcm_sga,pcm_u16be,pcm_u16le,pcm_u24be,pcm_u24le,pcm_u32be,pcm_u32le,pcm_u8,pcm_vidc,pcx,pfm,pgm,pgmyuv,pgssub,pgx,phm,photocd,pictor,pjs,png,ppm,prosumer,psd,ptx,qcelp,qdm2,qdmc,qdraw,qoi,qpeg,qtrle,r10k,r210,ra_144,ra_288,rasc,rawvideo,realtext,rka,rl2,roq,roq_dpcm,rpza,rscc,rv10,rv20,s302m,sami,sanm,sbc,screenpresso,sdx2_dpcm,sgi,sgirle,shorten,simbiosis_imx,sipr,siren,smackaud,smacker,smc,smvjpeg,snow,sol_dpcm,sonic,sp5x,speedhq,speex,srgc,srt,ssa,stl,subrip,subviewer,subviewer1,sunrast,svq1,svq3,tak,targa,targa_y216,tdsc,text,theora,thp,tiertexseqvideo,tiff,tmv,truehd,truemotion1,truemotion2,truemotion2rt,truespeech,tscc,tscc2,tta,twinvq,txd,ulti,utvideo,v210,v210x,v308,v408,v410,vb,vble,vcr1,vmdaudio,vmdvideo,vmnc,vnull,vorbis,vp3,vp4,vp5,vp6,vp6a,vp6f,vp7,vp8,vp8_qsv,vp8_v4l2m2m,vp9,vp9_qsv,vp9_v4l2m2m,vplayer,vqa,vqc,wady_dpcm,wavarc,wavpack,wbmp,wcmv,webp,webvtt,wmav1,wmav2,wmavoice,wmv1,wmv2,wnv1,wrapped_avframe,ws_snd1,xan_dpcm,xan_wc3,xan_wc4,xbin,xbm,xface,xl,xpm,xsub,xwd,y41p,ylc,yop,yuv4,zero12v,zerocodec,zlib,zmbv,'
libavutil 58. 2.100 / 58. 2.100 libavcodec 60. 3.100 / 60. 3.100 libavformat 60.
3.100 / 60. 3.100 libavdevice 60. 1.100 / 60. 1.100 libavfilter 9. 3.100 / 9.
3.100 libswscale 7. 1.100 / 7. 1.100 libswresample 4. 10.100 / 4. 10.100
libpostproc 57. 1.100 / 57. 1.100 ~/Downloads/temp.mp3: No such
file or directory
Process whisper-recording stderr finished
ffprobe returns this (partial):
encoder : Lavf60.3.100
Duration: 00:51:38.77, start: 0.069063, bitrate: 48 kb/s
Stream #0:0: Audio: mp3, 16000 Hz, stereo, fltp, 48 kb/s
Ugh this was so stupid of me. I forgot that tilde expansion needs to be a thing ffs.
Basically, Linux deep down doesn't understand ~/Downloads/temp.mp3. The actual path is /home/<user>/Downloads/temp.mp3. High-level programs like shells use ~ as a convention to abbreviate the boring common part, and they implicitly expand the tilde before running any command. It started as a C shell (csh) thing, but was adopted by other shells, and even by non-shell programs.
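The difference between an expanded and an unexpanded tilde can be seen directly in a shell. This is only a small sketch: the path is the one from the error log above, and quoting is used to imitate what a program receives when it is started without a shell in between (as with make-process).

```shell
# Unquoted: the shell replaces the leading ~ with the home directory
# before the value is ever used.
expanded=~/Downloads/temp.mp3

# Quoted: no expansion happens, so the string keeps its literal tilde.
# A program launched without a shell receives the path in this form.
literal='~/Downloads/temp.mp3'

echo "expanded: $expanded"
echo "literal:  $literal"

# To the kernel, the literal form is a relative path whose first
# component is a directory that is actually named "~".
if [ -e "$literal" ]; then
  echo "literal path exists"
else
  echo "literal path: No such file or directory"
fi
```

Unless a directory literally named ~ exists in the current directory, the last check takes the "No such file or directory" branch, which matches what ffmpeg reported.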
The make-process API in Emacs is low level and expects paths to be already expanded. I knew that, but I assumed that the high-level read-file-name, which shows the file selection UI, does the necessary expansion. Clearly it doesn't. Showing the file selection menu in terms of tilde but not expanding it is a little counterintuitive.
But anyway, I never caught it in my tests because I have always tried it on a file in the /tmp/ dir, so there was no tilde in need of expanding.
I have pushed a fix that solves it. Thanks a lot again.
There's also file-truename, which I believe also resolves symbolic links...
Thank you!
Yep. Working! It'll take some time, I'm guessing :)
Hi. I'm able to run this from the command line on Fedora 38, so PipeWire with pipewire-pulse is installed and working.
whisper-file fails with:
whisper-run works amazingly well recording and transcribing.