Open catbabylon opened 11 months ago
Thanks for the report.
Could you please run the app with the --verbose option?
Look for "scanning for opencl devices". Is there anything suspicious there?
Here's the only output it provided; the last line is what came up when I went into settings and tried ticking the GPU option...
F: No installations directory in /etc/flatpak/installations.d. Skipping
F: Opening system flatpak installation at path /var/lib/flatpak
F: Opening user flatpak installation at path /home/catbabylon/.local/share/flatpak
F: Opening user flatpak installation at path /home/catbabylon/.local/share/flatpak
F: Opening system flatpak installation at path /var/lib/flatpak
F: Opening user flatpak installation at path /home/catbabylon/.local/share/flatpak
F: Opening system flatpak installation at path /var/lib/flatpak
F: /var/lib/flatpak/runtime/org.kde.Platform/x86_64/5.15-22.08/47c5cdec1996088a33fd192edd79b4169759f198d8663dcd55e3fa336f567085/files/lib32 does not exist
F: Allocated instance id 3556007332
F: Add defaults in dir /net/mkiol/SpeechNote/
F: Add locks in dir /net/mkiol/SpeechNote/
F: Allowing wayland access
F: Allowing pulseaudio access
F: Pulseaudio user configuration file '/home/catbabylon/.config/pulse/client.conf': Error opening file /home/catbabylon/.config/pulse/client.conf: No such file or directory
F: Running '/usr/bin/bwrap --args 45 /usr/bin/xdg-dbus-proxy --args=47'
F: Running '/usr/bin/bwrap --args 45 dsnote'
QSocketNotifier: Can only be used with threads started with QThread
qt.qpa.qgnomeplatform: Could not find color scheme ""
ALSA lib ../../oss/pcm_oss.c:397:(_snd_pcm_oss_open) Cannot open device /dev/dsp
ALSA lib ../../../src/pcm/pcm_direct.c:2045:(snd1_pcm_direct_parse_open_conf) The field ipc_gid must be a valid group (create group audio)
** (dsnote:2): WARNING **: 17:39:56.971: atk-bridge: get_device_events_reply: unknown signature
Please run with the --verbose option, like this:
flatpak run net.mkiol.SpeechNote --verbose
d'oh, my bad, I put --verbose in the wrong place. OK
[D] 17:54:58.421 0x7f88fa611d80 add_opencl_devices:303 - scanning for opencl devices
[D] 17:54:58.500 0x7f88fa611d80 add_opencl_devices:320 - opencl number of platforms: 2
[D] 17:54:58.500 0x7f88fa611d80 add_opencl_devices:345 - opencl platform: 0, name=AMD Accelerated Parallel Processing, vendor=Advanced Micro Devices, Inc.
[D] 17:54:58.500 0x7f88fa611d80 add_opencl_devices:359 - opencl number of devices: 0
[D] 17:54:58.500 0x7f88fa611d80 add_opencl_devices:345 - opencl platform: 1, name=Clover, vendor=Mesa
[D] 17:54:58.500 0x7f88fa611d80 add_opencl_devices:359 - opencl number of devices: 0
Thanks, that's much better :)
Before I answer, I have to admit I have never tested it with an Intel GPU. Unfortunately, all my computers have AMD GPUs :(
Clearly OpenCL is not detected. Likely something is missing inside the Flatpak runtime to support OpenCL for Intel. For example, to enable OpenCL for AMD, I had to pack additional AMD libraries; maybe the same is needed for Intel. Due to Flatpak isolation, the app cannot use libraries installed in your system; everything has to be packed into the Flatpak runtime or into the app package.
I will investigate what options are available.
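If you want to check something in the meantime: you could compare what OpenCL ICD loader configs are visible on the host versus inside the sandbox. A rough sketch (/etc/OpenCL/vendors is the conventional ICD loader path, and the app id comes from the command above):

```shell
#!/bin/sh
# Compare OpenCL ICD loader configs on the host vs. inside the Flatpak sandbox.
# Vendor drivers drop one .icd file per implementation into /etc/OpenCL/vendors.
check_icds() {
  ls "$1"/*.icd 2>/dev/null || echo "no ICDs in $1"
}

echo "host:"
check_icds /etc/OpenCL/vendors

if command -v flatpak >/dev/null 2>&1; then
  echo "sandbox:"
  # --command=sh starts a shell inside the same sandbox instead of the app.
  flatpak run --command=sh net.mkiol.SpeechNote -c \
    'ls /etc/OpenCL/vendors/*.icd 2>/dev/null || echo "no ICDs in sandbox"'
else
  echo "flatpak is not installed on this machine"
fi
```

If the host lists Intel's ICD but the sandbox does not, that would confirm the isolation theory.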
and the equivalent flatpak OpenCL packages
Just curious. What packages exactly?
I should pay more attention to what I'm doing... I had searched Flathub for OpenCL and ended up installing org.jellyfin.JellyfinServer.Plugin.IntelComputeRuntime because the name was truncated, so I hadn't seen that it was for Jellyfin...
So that flatpak basically contains what we Intel GPU users need to have proper OpenCL. Would it be possible to integrate it into Speech Note?
@barolo I'm not sure I understood the question correctly, but I believe yes. If Intel GPU supports OpenCL, acceleration should be possible (but only in Speech-to-Text). The problem here is Flatpak sandboxing. For this to work, all Intel OpenCL libraries must be placed in a Flatpak package.
I think something similar has already been done in Flatpak add-on for Jellyfin Server. I could copy this and make an add-on for Speech Note. That might work 🤔
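For reference, the Jellyfin add-on you found can be checked like this (note it only extends Jellyfin's sandbox, not Speech Note's; that's exactly why a dedicated add-on would be needed):

```shell
#!/bin/sh
# Check whether the Jellyfin Intel compute runtime add-on is installed.
# It extends only org.jellyfin.JellyfinServer's sandbox, not other apps'.
if command -v flatpak >/dev/null 2>&1; then
  status=$(flatpak info org.jellyfin.JellyfinServer.Plugin.IntelComputeRuntime 2>/dev/null \
    || echo "add-on not installed (flatpak install flathub org.jellyfin.JellyfinServer.Plugin.IntelComputeRuntime)")
else
  status="flatpak is not installed on this machine"
fi
echo "$status"
```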
That's exactly what I meant :) and I only care about speech-to-text.
Is this also related to speeding up processing on Intel GPUs?
https://github.com/oneapi-src/oneDNN
I have an Intel Xe DG1 GPU installed, with kernel 6.1 patched by the out-of-tree i915 driver from Intel.
https://github.com/intel-gpu/intel-gpu-i915-backports
The Flatpak version does not have these OOT-driver-related GL library files, so the dsnote flatpak does not start at all.
https://github.com/flathub/org.jellyfin.JellyfinServer.Plugin.IntelComputeRuntime/blob/master/org.jellyfin.JellyfinServer.Plugin.IntelComputeRuntime.yml does have a line for DG1 support; this Intel DG1 card seems to always need extra support.
Is this also related to speeding up processing on Intel GPUs? https://github.com/oneapi-src/oneDNN
In general yes, but not for Whisper right now. Speech Note uses two engines: whisper.cpp, which doesn't use oneDNN, and ctranslate2 (via faster-whisper), which supports oneDNN but in a very limited way, so the speed increase would be minimal.
The Flatpak version does not have these OOT-driver-related GL library files, so the dsnote flatpak does not start at all.
Only Speech Note flatpak or any flatpak app?
https://github.com/flathub/org.jellyfin.JellyfinServer.Plugin.IntelComputeRuntime/blob/master/org.jellyfin.JellyfinServer.Plugin.IntelComputeRuntime.yml does have a line for DG1 support, this intel DG1 card seems to always need extra support.
This is something that I have on my TO-DO list.
whisper.cpp can use OpenVino though, which can use Intel's GPUs
whisper.cpp can use OpenVino though, which can use Intel's GPUs
Have you tried using whisper.cpp with OpenVINO? Is it working? Do you need any special configuration on your system to make it work?
Thanks for adding Intel GPU support to your to-do list!
Here is the log when I start dsnote with --verbose; it exits without the GUI showing up, on Debian 12 with kernel 6.1 patched with the Intel out-of-tree i915 driver.
CPU: Ryzen 5 1500 Pro GPU: Intel Xe DG1 RAM: 16GB ECC
QIBusPlatformInputContext: invalid portal bus.
[I] 13:07:09.515 0x7f9335667d00 init:49 - logging to stderr enabled
[D] 13:07:09.515 0x7f9335667d00 () - version: 4.4.0
[D] 13:07:09.516 0x7f9335667d00 parse_cpuinfo:117 - cpu flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca sev
[D] 13:07:09.516 0x7f9335667d00 parse_cpuinfo:125 - cpuinfo: processor-count=8, flags=[avx, avx2, fma, f16c, ]
[D] 13:07:09.516 0x7f9335667d00 () - translation: "en_US"
[W] 13:07:09.516 0x7f9335667d00 () - failed to install translation
[D] 13:07:09.516 0x7f9335667d00 () - starting standalone app
[D] 13:07:09.518 0x7f9335667d00 () - app: net.mkiol dsnote
[D] 13:07:09.518 0x7f9335667d00 () - config location: "/home/user/.var/app/net.mkiol.SpeechNote/config"
[D] 13:07:09.518 0x7f9335667d00 () - data location: "/home/user/.var/app/net.mkiol.SpeechNote/data/net.mkiol/dsnote"
[D] 13:07:09.518 0x7f9335667d00 () - cache location: "/home/user/.var/app/net.mkiol.SpeechNote/cache/net.mkiol/dsnote"
[D] 13:07:09.518 0x7f9335667d00 () - settings file: "/home/user/.var/app/net.mkiol.SpeechNote/config/net.mkiol/dsnote/settings.conf"
[D] 13:07:09.518 0x7f9335667d00 () - platform: "xcb"
[D] 13:07:09.518 0x7f9335667d00 () - enforcing num threads: 0
[D] 13:07:09.657 0x7f9335667d00 () - supported audio input devices:
ALSA lib ../../oss/pcm_oss.c:397:(_snd_pcm_oss_open) Cannot open device /dev/dsp
[D] 13:07:09.662 0x7f9335667d00 () - "pulse"
[D] 13:07:09.665 0x7f9335667d00 () - "default"
ALSA lib ../../../src/pcm/pcm_direct.c:2045:(snd1_pcm_direct_parse_open_conf) The field ipc_gid must be a valid group (create group audio)
[D] 13:07:09.852 0x7f9335667d00 () - "auto_null.monitor"
[D] 13:07:10.101 0x7f9335667d00 () - starting service: app-standalone
[D] 13:07:10.107 0x7f9335667d00 () - mbrola dir: "/app/bin"
[D] 13:07:10.107 0x7f9335667d00 () - espeak dir: "/app/bin"
[D] 13:07:10.107 0x7f9335667d00 () - module checksum missing, need to unpack: "rhvoicedata"
[D] 13:07:10.107 0x7f9335667d00 () - unpacking module: "rhvoicedata"
[D] 13:07:10.107 0x7f931f716600 loop:75 - py executor loop started
[D] 13:07:10.125 0x7f9335667d00 () - extracting xz archive: "/app/share/dsnote/rhvoicedata.tar.xz"
[D] 13:07:10.127 0x7f931ff17600 () - config version: 65 65
[D] 13:07:10.197 0x7f931ff17600 () - models changed
[D] 13:07:10.427 0x7f9335667d00 () - xz decoded, stats: size= 5650836 , duration= 301 , threads= 6
[D] 13:07:10.427 0x7f9335667d00 () - extracting archive: "/home/user/.var/app/net.mkiol.SpeechNote/data/net.mkiol/dsnote/rhvoicedata.tar"
[D] 13:07:10.599 0x7f9335667d00 () - module successfully unpacked: "rhvoicedata"
[D] 13:07:10.602 0x7f9335667d00 () - module already unpacked: "rhvoicedata"
[D] 13:07:10.602 0x7f9335667d00 () - module checksum missing, need to unpack: "rhvoiceconfig"
[D] 13:07:10.602 0x7f9335667d00 () - unpacking module: "rhvoiceconfig"
[D] 13:07:10.611 0x7f9335667d00 () - extracting xz archive: "/app/share/dsnote/rhvoiceconfig.tar.xz"
[D] 13:07:10.613 0x7f9335667d00 () - xz decoded, stats: size= 2396 , duration= 1 , threads= 6
[D] 13:07:10.613 0x7f9335667d00 () - extracting archive: "/home/user/.var/app/net.mkiol.SpeechNote/data/net.mkiol/dsnote/rhvoiceconfig.tar"
[D] 13:07:10.613 0x7f9335667d00 () - module successfully unpacked: "rhvoiceconfig"
[D] 13:07:10.614 0x7f9335667d00 () - module already unpacked: "rhvoiceconfig"
[D] 13:07:10.614 0x7f9335667d00 () - module checksum missing, need to unpack: "espeakdata"
[D] 13:07:10.614 0x7f9335667d00 () - unpacking module: "espeakdata"
[D] 13:07:10.635 0x7f9335667d00 () - extracting xz archive: "/app/share/dsnote/espeakdata.tar.xz"
[D] 13:07:11.26 0x7f9335667d00 () - xz decoded, stats: size= 6720012 , duration= 391 , threads= 6
[D] 13:07:11.26 0x7f9335667d00 () - extracting archive: "/home/user/.var/app/net.mkiol.SpeechNote/data/net.mkiol/dsnote/espeakdata.tar"
[D] 13:07:11.79 0x7f9335667d00 () - module successfully unpacked: "espeakdata"
[D] 13:07:11.82 0x7f9335667d00 () - module already unpacked: "espeakdata"
[D] 13:07:11.92 0x7f9335667d00 () - default stt model not found: "en"
[D] 13:07:11.92 0x7f9335667d00 () - default tts model not found: "en"
[D] 13:07:11.92 0x7f9335667d00 () - default mnt lang not found: "en"
[D] 13:07:11.92 0x7f9335667d00 () - new default mnt lang: "en"
[D] 13:07:11.92 0x7f9335667d00 () - service refresh status, new state: busy
[D] 13:07:11.92 0x7f9335667d00 () - service state changed: unknown => busy
[D] 13:07:11.93 0x7f9335667d00 () - features availability ready
[W] 13:07:11.93 0x7f9335667d00 has_lib:477 - failed to open libcudart.so: libcudart.so: cannot open shared object file: No such file or directory
[W] 13:07:11.93 0x7f9335667d00 has_lib:477 - failed to open libcudnn.so: libcudnn.so: cannot open shared object file: No such file or directory
[W] 13:07:11.93 0x7f9335667d00 has_lib:477 - failed to open libcudnn.so.8: libcudnn.so.8: cannot open shared object file: No such file or directory
sh: line 1: perl: command not found
[W] 13:07:11.106 0x7f9335667d00 has_cuda:56 - failed to open whisper-cublas lib: libwhisper-cublas.so: cannot open shared object file: No such file or directory
[W] 13:07:11.106 0x7f9335667d00 has_hip:80 - failed to open whisper-hipblas lib: libwhisper-hipblas.so: cannot open shared object file: No such file or directory
[D] 13:07:11.126 0x7f9335667d00 () - updating model using availability
[D] 13:07:11.126 0x7f9335667d00 () - updating model using availability internal
[D] 13:07:11.127 0x7f9335667d00 () - default stt model not found: "en"
[D] 13:07:11.127 0x7f9335667d00 () - default tts model not found: "en"
[D] 13:07:11.127 0x7f9335667d00 () - default mnt lang not found: "en"
[D] 13:07:11.127 0x7f9335667d00 () - new default mnt lang: "en"
[D] 13:07:11.127 0x7f9335667d00 () - service refresh status, new state: not-configured
[D] 13:07:11.127 0x7f9335667d00 () - service state changed: busy => not-configured
[D] 13:07:11.127 0x7f9335667d00 () - scan cuda: true
[D] 13:07:11.127 0x7f9335667d00 () - scan hip: true
[D] 13:07:11.127 0x7f9335667d00 () - scan opencl: true false
[D] 13:07:11.127 0x7f9335667d00 add_cuda_devices:281 - scanning for cuda devices
[W] 13:07:11.127 0x7f9335667d00 cuda_api:168 - failed to open cudart lib: libcudart.so: cannot open shared object file: No such file or directory
[D] 13:07:11.128 0x7f9335667d00 add_hip_devices:318 - scanning for hip devices
[W] 13:07:11.128 0x7f9335667d00 hip_api:223 - failed to open hip lib: libamdhip64.so: cannot open shared object file: No such file or directory
[D] 13:07:11.128 0x7f9335667d00 add_opencl_devices:357 - scanning for opencl devices
I use the Intel out-of-tree i915 driver because the upstream kernel does not have complete support for the Intel Xe DG1 GPU; the upstream kernel works on X11, but hardware acceleration is missing. Originally I did this for ffmpeg QSV hardware encoding; ffmpeg and Jellyfin are confirmed working.
Only Speech Note flatpak or any flatpak app?
This is what I meant by the missing Mesa GL-related library files.
Jellyfin, as you can see, added a line for DG1 support in their flatpak, so the Jellyfin flatpak works. I also tried the OBS Studio flatpak, and it showed the same exit-without-GUI behavior as dsnote. I guess both OBS Studio and dsnote currently rely on the standard Flatpak runtime, which does not have the patched GL library files.
For example, Intel maintains a page where you can see they compiled special Mesa GL-related packages:
https://dgpu-docs.intel.com/releases/LTS_803.45_20240426.html
The crucial package is libgl1-mesa-dri. A couple of months ago, when I was playing with Debian (not wanting to switch to Ubuntu), I just installed this package, pulled in the related deps, and DG1 hardware encoding on Debian started working.
Edit: Another path would be skipping the i915 OOT driver altogether and relying on the standard runtime. Since 6.8, the Linux kernel has introduced a new Xe driver for Intel GPUs, and Debian currently has 6.8.x in the unstable branch; I have not tried the unstable Debian kernel yet, though.
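To see which kernel driver actually binds the GPU (the OOT i915 vs. the new Xe driver), I look at sysfs. A rough sketch that degrades gracefully on machines without DRM devices:

```shell
#!/bin/sh
# Show which kernel driver is bound to each GPU (i915, xe, amdgpu, ...).
found=0
for dev in /sys/class/drm/card[0-9]; do
  [ -e "$dev/device/driver" ] || continue
  found=1
  # The driver symlink points into /sys/bus/pci/drivers/<name>.
  drv=$(basename "$(readlink -f "$dev/device/driver")")
  echo "$(basename "$dev"): driver=$drv"
done
[ "$found" -eq 1 ] || echo "no DRM render devices found"
```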
whisper.cpp can use OpenVino though, which can use Intel's GPUs
Have you tried using whisper.cpp with OpenVINO? Is it working? Do you need any special configuration on your system to make it work?
I googled OpenVINO + Intel GPU and found these; would they help with dsnote integration?
https://community.intel.com/t5/Blogs/Tech-Innovation/Artificial-Intelligence-AI/Accelerate-Workloads-with-OpenVINO-and-OneDNN/post/1503867
https://blog.openvino.ai/blog-posts/techniques-for-faster-ai-inference-throughput-with-openvino-on-intel-gpus
https://blog.openvino.ai/blog-posts/optimizing-whisper-and-distil-whisper-for-speech-recognition-with-openvino-and-nncf
https://huggingface.co/Intel/whisper.cpp-openvino-models/tree/main
Intel's original models are huge, and someone did some work to trim them down a little bit (still huge):
https://github.com/zhuzilin/whisper-openvino
https://huggingface.co/twdragon/whisper.cpp-openvino/tree/main
whisper.cpp can use OpenVino though, which can use Intel's GPUs
Have you tried using whisper.cpp with OpenVINO? Is it working? Do you need any special configuration on your system to make it work?
Yes, I have it working. I'm using it to live-transcribe podcasts in the MPV player, among other things, using the GPU; it's much more efficient. It's been some time since I set it up: I had to compile Whisper with OpenVINO support, then there's a command-line switch to pick the device, and that was it.
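Roughly, the steps were as follows (from memory; the setupvars.sh path and model file names depend on your OpenVINO install, and the build flag and the -oved switch are the ones whisper.cpp documents for OpenVINO):

```shell
#!/bin/sh
# Rough sketch of the whisper.cpp + OpenVINO build described above.
# Guarded: it only builds if a whisper.cpp checkout is already present here.
if [ -d whisper.cpp ]; then
  built=attempted
  cd whisper.cpp
  # . /opt/intel/openvino/setupvars.sh  # location is an assumption; depends on your install
  cmake -B build -DWHISPER_OPENVINO=1
  cmake --build build -j
  # Then pick the OpenVINO device for the encoder at runtime, e.g.:
  # ./build/bin/main -m models/ggml-base.en.bin -oved GPU -f samples/jfk.wav
else
  built=skipped
  echo "no whisper.cpp checkout here; commands shown for reference only"
fi
```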
Yes, I have it working.
Thanks for the confirmation. I will try to add support for OpenVINO.
Since whisper.cpp is already in Speech Note in 4 different versions (OpenBLAS, OpenBLAS+AVX, CUDA and hipBLAS), I don't expect that adding another one would be very challenging 😄
The only potential problem I see is Flatpak sandboxing. That needs to be verified, but OpenVINO might just not work due to the isolation that Flatpak provides.
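One way to check which whisper.cpp variants actually ship inside the package is to look into the sandbox directly. A sketch, assuming the libraries land under /app/lib (the logs above only confirm /app/bin and /app/share):

```shell
#!/bin/sh
# List which whisper.cpp variants are shipped inside the Speech Note flatpak.
# /app/lib is an assumption based on the /app/bin and /app/share paths in the logs.
if command -v flatpak >/dev/null 2>&1; then
  out=$(flatpak run --command=sh net.mkiol.SpeechNote -c \
    'ls /app/lib/libwhisper* 2>/dev/null' 2>/dev/null \
    || echo "could not enter the Speech Note sandbox")
else
  out="flatpak is not installed on this machine"
fi
echo "$out"
```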
Hey, just letting you know about related changes added in version 4.6.0 Beta 1:
- I don't have a system with a discrete Intel GPU, so I haven't been able to test this, but with an integrated Intel GPU the speed results are similar to the Intel CPU, or worse. It seems that the iGPU does not provide anything of value for STT when using OpenVINO. At least, these are my observations.
The value of using the GPU is to make use of it and to offload the CPU; I don't care about speed in that case, that's not the point. It should be more energy-efficient too, which matters on mobile devices.
The value of using GPU is to make use of it and to offload CPU
Fair point. Indeed, it may be an advantage.
Just letting you know that v4.6.0 is out with "OpenVINO CPU-only" support. I will try to add GPU support in the next version(s).
The new version should be available in Flathub tomorrow.
On stable and beta versions, it says that a suitable GPU isn't available. I've installed OpenCL packages on Fedora 38 and the equivalent flatpak OpenCL packages, but it still says not available.
I get that it might not be useful given it isn't a powerful discrete GPU, but I wondered whether a bug might be causing it to report as unavailable.