mkiol / dsnote

Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.
Mozilla Public License 2.0

Should OpenCL work on an Ice Lake v11 intel processor? #49

Open catbabylon opened 11 months ago

catbabylon commented 11 months ago

On both the stable and beta versions, the app says that a suitable GPU isn't available. I've installed the OpenCL packages on Fedora 38 and the equivalent flatpak OpenCL packages but it still says not available.

I get that it might not be useful, given that it isn't a powerful discrete GPU, but I wondered whether a bug might be causing it to report as unavailable.

mkiol commented 11 months ago

Thanks for the report.

Could you please run the app with the --verbose option?

Look for "scanning for opencl devices". Is there anything suspicious there?

catbabylon commented 11 months ago

Here's the only output it provided. The last line is what came up when I went into settings and tried ticking the GPU option...

F: No installations directory in /etc/flatpak/installations.d. Skipping
F: Opening system flatpak installation at path /var/lib/flatpak
F: Opening user flatpak installation at path /home/catbabylon/.local/share/flatpak
F: Opening user flatpak installation at path /home/catbabylon/.local/share/flatpak
F: Opening system flatpak installation at path /var/lib/flatpak
F: Opening user flatpak installation at path /home/catbabylon/.local/share/flatpak
F: Opening system flatpak installation at path /var/lib/flatpak
F: /var/lib/flatpak/runtime/org.kde.Platform/x86_64/5.15-22.08/47c5cdec1996088a33fd192edd79b4169759f198d8663dcd55e3fa336f567085/files/lib32 does not exist
F: Allocated instance id 3556007332
F: Add defaults in dir /net/mkiol/SpeechNote/
F: Add locks in dir /net/mkiol/SpeechNote/
F: Allowing wayland access
F: Allowing pulseaudio access
F: Pulseaudio user configuration file '/home/catbabylon/.config/pulse/client.conf': Error opening file /home/catbabylon/.config/pulse/client.conf: No such file or directory
F: Running '/usr/bin/bwrap --args 45 /usr/bin/xdg-dbus-proxy --args=47'
F: Running '/usr/bin/bwrap --args 45 dsnote'
QSocketNotifier: Can only be used with threads started with QThread
qt.qpa.qgnomeplatform: Could not find color scheme  ""
ALSA lib ../../oss/pcm_oss.c:397:(_snd_pcm_oss_open) Cannot open device /dev/dsp
ALSA lib ../../../src/pcm/pcm_direct.c:2045:(snd1_pcm_direct_parse_open_conf) The field ipc_gid must be a valid group (create group audio)

** (dsnote:2): WARNING **: 17:39:56.971: atk-bridge: get_device_events_reply: unknown signature

mkiol commented 11 months ago

Please run with --verbose option. Like this:

flatpak run net.mkiol.SpeechNote --verbose

catbabylon commented 11 months ago

d'oh, my bad, I put --verbose in the wrong place. OK

[D] 17:54:58.421 0x7f88fa611d80 add_opencl_devices:303 - scanning for opencl devices
[D] 17:54:58.500 0x7f88fa611d80 add_opencl_devices:320 - opencl number of platforms: 2
[D] 17:54:58.500 0x7f88fa611d80 add_opencl_devices:345 - opencl platform: 0, name=AMD Accelerated Parallel Processing, vendor=Advanced Micro Devices, Inc.
[D] 17:54:58.500 0x7f88fa611d80 add_opencl_devices:359 - opencl number of devices: 0
[D] 17:54:58.500 0x7f88fa611d80 add_opencl_devices:345 - opencl platform: 1, name=Clover, vendor=Mesa
[D] 17:54:58.500 0x7f88fa611d80 add_opencl_devices:359 - opencl number of devices: 0

mkiol commented 11 months ago

Thanks, that's much better :)

Before I answer, I have to admit I have never tested it with an Intel GPU. Unfortunately, all my computers have AMD hardware :(

Clearly, OpenCL is not detected. Most likely something is missing inside the Flatpak runtime to support OpenCL on Intel. For example, to enable OpenCL for AMD, I had to pack additional AMD libraries. Maybe the same is needed for Intel. Due to Flatpak isolation, the app cannot use libraries installed on your system; everything has to be packed into the Flatpak runtime or into the app package.

I will investigate what options are available.
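
The "add_opencl_devices" debug lines quoted earlier in this thread follow a regular pattern, so they can be summarized mechanically. The following is a minimal sketch, assuming the log format stays as shown here; the function name summarize_opencl is hypothetical and not part of Speech Note.

```python
import re

# Patterns match the debug lines shown in this thread (format assumed stable).
PLATFORM_RE = re.compile(r"opencl platform: (\d+), name=(.*?), vendor=(.*)$")
DEVICES_RE = re.compile(r"opencl number of devices: (\d+)")


def summarize_opencl(log_text):
    """Return a list of (platform_name, vendor, device_count) tuples."""
    results = []
    current = None  # platform seen most recently, waiting for its device count
    for line in log_text.splitlines():
        match = PLATFORM_RE.search(line)
        if match:
            current = (match.group(2), match.group(3))
            continue
        match = DEVICES_RE.search(line)
        if match and current is not None:
            results.append((current[0], current[1], int(match.group(1))))
            current = None
    return results


# Sample copied from the verbose output posted above.
SAMPLE = """\
[D] 17:54:58.500 0x7f88fa611d80 add_opencl_devices:345 - opencl platform: 0, name=AMD Accelerated Parallel Processing, vendor=Advanced Micro Devices, Inc.
[D] 17:54:58.500 0x7f88fa611d80 add_opencl_devices:359 - opencl number of devices: 0
[D] 17:54:58.500 0x7f88fa611d80 add_opencl_devices:345 - opencl platform: 1, name=Clover, vendor=Mesa
[D] 17:54:58.500 0x7f88fa611d80 add_opencl_devices:359 - opencl number of devices: 0
"""

for name, vendor, count in summarize_opencl(SAMPLE):
    print(f"{name} ({vendor}): {count} device(s)")
```

In the sample, neither platform is an Intel one and both report zero devices, which is consistent with the Intel OpenCL driver not being visible inside the sandbox.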

and the equivalent flatpak OpenCL packages

Just curious. What packages exactly?

catbabylon commented 11 months ago

I should pay more attention to what I'm doing... I had searched flathub for opencl and ended up installing org.jellyfin.JellyfinServer.Plugin.IntelComputeRuntime because it had truncated the name so I hadn't seen that it was for Jellyfin...

barolo commented 7 months ago

So that flatpak contains basically what we Intel GPU users need to have proper OpenCL. Would it be possible to integrate it into Speech Note?

mkiol commented 7 months ago

@barolo I'm not sure I understood the question correctly, but I believe yes. If the Intel GPU supports OpenCL, acceleration should be possible (but only in Speech-to-Text). The problem here is Flatpak sandboxing. For this to work, all Intel OpenCL libraries must be placed in a Flatpak package.

I think something similar has already been done in the Flatpak add-on for Jellyfin Server. I could copy this and make an add-on for Speech Note. That might work 🤔
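
One way to see the effect of the sandboxing described above is to compare which OpenCL vendor drivers are registered on the host and inside the sandbox. On Linux, the OpenCL ICD loader conventionally discovers drivers through .icd files in /etc/OpenCL/vendors. The sketch below lists them; the function name is hypothetical, and the directory may differ inside a Flatpak runtime, so it is a parameter rather than a fixed path.

```python
import glob
import os


def list_opencl_icds(vendors_dir="/etc/OpenCL/vendors"):
    """Map each .icd file name to the driver library it points to.

    The default path is the conventional host location; inside a sandbox
    the loader may look elsewhere (assumption: pass that path explicitly).
    """
    icds = {}
    for path in sorted(glob.glob(os.path.join(vendors_dir, "*.icd"))):
        with open(path, encoding="utf-8") as handle:
            icds[os.path.basename(path)] = handle.read().strip()
    return icds


if __name__ == "__main__":
    found = list_opencl_icds()
    if not found:
        print("no OpenCL ICD files found")
    for icd_name, library in found.items():
        print(f"{icd_name} -> {library}")
```

If an Intel entry shows up on the host but not in the environment the app runs in, the driver is installed but simply not reachable from the sandbox.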

barolo commented 7 months ago

That's exactly what I meant :) and I only care about speech-to-text.

h9j6k commented 4 months ago

Is this also related to speeding up processing on Intel GPUs?

https://github.com/oneapi-src/oneDNN

I have an Intel Xe DG1 GPU installed, with kernel 6.1 patched with Intel's out-of-tree (OOT) i915 driver.

https://github.com/intel-gpu/intel-gpu-i915-backports

The flatpak version does not have the GL library files related to this OOT driver, so flatpak dsnote does not start at all.

https://github.com/flathub/org.jellyfin.JellyfinServer.Plugin.IntelComputeRuntime/blob/master/org.jellyfin.JellyfinServer.Plugin.IntelComputeRuntime.yml

does have a line for DG1 support; this Intel DG1 card seems to always need extra support.

mkiol commented 4 months ago

Is this also related to speeding up processing on Intel GPUs? https://github.com/oneapi-src/oneDNN

In general yes, but not for Whisper right now. Speech Note uses two engines: whisper.cpp, which doesn't use oneDNN, and ctranslate2 (via faster-whisper), which supports oneDNN but only in a very limited way, so the speed increase would be minimal.

The flatpak version does not have the GL library files related to this OOT driver, so flatpak dsnote does not start at all.

Only Speech Note flatpak or any flatpak app?

https://github.com/flathub/org.jellyfin.JellyfinServer.Plugin.IntelComputeRuntime/blob/master/org.jellyfin.JellyfinServer.Plugin.IntelComputeRuntime.yml does have a line for DG1 support, this intel DG1 card seems to always need extra support.

This is something that I have on my TO-DO list.

barolo commented 4 months ago

whisper.cpp can use OpenVINO though, which can use Intel's GPUs

mkiol commented 4 months ago

whisper.cpp can use OpenVINO though, which can use Intel's GPUs

Have you tried using whisper.cpp with OpenVINO? Is it working? Do you need any special configuration on your system to make it work?

h9j6k commented 4 months ago

Thanks for adding Intel GPU support to your to-do list!

Here is the log from starting dsnote with --verbose. It exits without the GUI showing up, on Debian 12 with kernel 6.1 patched with Intel's OOT i915 driver.

CPU: Ryzen 5 1500 Pro
GPU: Intel Xe DG1
RAM: 16GB ECC

QIBusPlatformInputContext: invalid portal bus.
[I] 13:07:09.515 0x7f9335667d00 init:49 - logging to stderr enabled
[D] 13:07:09.515 0x7f9335667d00 () - version: 4.4.0
[D] 13:07:09.516 0x7f9335667d00 parse_cpuinfo:117 - cpu flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca sev
[D] 13:07:09.516 0x7f9335667d00 parse_cpuinfo:125 - cpuinfo: processor-count=8, flags=[avx, avx2, fma, f16c, ]
[D] 13:07:09.516 0x7f9335667d00 () - translation: "en_US"
[W] 13:07:09.516 0x7f9335667d00 () - failed to install translation
[D] 13:07:09.516 0x7f9335667d00 () - starting standalone app
[D] 13:07:09.518 0x7f9335667d00 () - app: net.mkiol dsnote
[D] 13:07:09.518 0x7f9335667d00 () - config location: "/home/user/.var/app/net.mkiol.SpeechNote/config"
[D] 13:07:09.518 0x7f9335667d00 () - data location: "/home/user/.var/app/net.mkiol.SpeechNote/data/net.mkiol/dsnote"
[D] 13:07:09.518 0x7f9335667d00 () - cache location: "/home/user/.var/app/net.mkiol.SpeechNote/cache/net.mkiol/dsnote"
[D] 13:07:09.518 0x7f9335667d00 () - settings file: "/home/user/.var/app/net.mkiol.SpeechNote/config/net.mkiol/dsnote/settings.conf"
[D] 13:07:09.518 0x7f9335667d00 () - platform: "xcb"
[D] 13:07:09.518 0x7f9335667d00 () - enforcing num threads: 0
[D] 13:07:09.657 0x7f9335667d00 () - supported audio input devices:
ALSA lib ../../oss/pcm_oss.c:397:(_snd_pcm_oss_open) Cannot open device /dev/dsp
[D] 13:07:09.662 0x7f9335667d00 () - "pulse"
[D] 13:07:09.665 0x7f9335667d00 () - "default"
ALSA lib ../../../src/pcm/pcm_direct.c:2045:(snd1_pcm_direct_parse_open_conf) The field ipc_gid must be a valid group (create group audio)
[D] 13:07:09.852 0x7f9335667d00 () - "auto_null.monitor"
[D] 13:07:10.101 0x7f9335667d00 () - starting service: app-standalone
[D] 13:07:10.107 0x7f9335667d00 () - mbrola dir: "/app/bin"
[D] 13:07:10.107 0x7f9335667d00 () - espeak dir: "/app/bin"
[D] 13:07:10.107 0x7f9335667d00 () - module checksum missing, need to unpack: "rhvoicedata"
[D] 13:07:10.107 0x7f9335667d00 () - unpacking module: "rhvoicedata"
[D] 13:07:10.107 0x7f931f716600 loop:75 - py executor loop started
[D] 13:07:10.125 0x7f9335667d00 () - extracting xz archive: "/app/share/dsnote/rhvoicedata.tar.xz"
[D] 13:07:10.127 0x7f931ff17600 () - config version: 65 65
[D] 13:07:10.197 0x7f931ff17600 () - models changed
[D] 13:07:10.427 0x7f9335667d00 () - xz decoded, stats: size= 5650836 , duration= 301 , threads= 6
[D] 13:07:10.427 0x7f9335667d00 () - extracting archive: "/home/user/.var/app/net.mkiol.SpeechNote/data/net.mkiol/dsnote/rhvoicedata.tar"
[D] 13:07:10.599 0x7f9335667d00 () - module successfully unpacked: "rhvoicedata"
[D] 13:07:10.602 0x7f9335667d00 () - module already unpacked: "rhvoicedata"
[D] 13:07:10.602 0x7f9335667d00 () - module checksum missing, need to unpack: "rhvoiceconfig"
[D] 13:07:10.602 0x7f9335667d00 () - unpacking module: "rhvoiceconfig"
[D] 13:07:10.611 0x7f9335667d00 () - extracting xz archive: "/app/share/dsnote/rhvoiceconfig.tar.xz"
[D] 13:07:10.613 0x7f9335667d00 () - xz decoded, stats: size= 2396 , duration= 1 , threads= 6
[D] 13:07:10.613 0x7f9335667d00 () - extracting archive: "/home/user/.var/app/net.mkiol.SpeechNote/data/net.mkiol/dsnote/rhvoiceconfig.tar"
[D] 13:07:10.613 0x7f9335667d00 () - module successfully unpacked: "rhvoiceconfig"
[D] 13:07:10.614 0x7f9335667d00 () - module already unpacked: "rhvoiceconfig"
[D] 13:07:10.614 0x7f9335667d00 () - module checksum missing, need to unpack: "espeakdata"
[D] 13:07:10.614 0x7f9335667d00 () - unpacking module: "espeakdata"
[D] 13:07:10.635 0x7f9335667d00 () - extracting xz archive: "/app/share/dsnote/espeakdata.tar.xz"
[D] 13:07:11.26 0x7f9335667d00 () - xz decoded, stats: size= 6720012 , duration= 391 , threads= 6
[D] 13:07:11.26 0x7f9335667d00 () - extracting archive: "/home/user/.var/app/net.mkiol.SpeechNote/data/net.mkiol/dsnote/espeakdata.tar"
[D] 13:07:11.79 0x7f9335667d00 () - module successfully unpacked: "espeakdata"
[D] 13:07:11.82 0x7f9335667d00 () - module already unpacked: "espeakdata"
[D] 13:07:11.92 0x7f9335667d00 () - default stt model not found: "en"
[D] 13:07:11.92 0x7f9335667d00 () - default tts model not found: "en"
[D] 13:07:11.92 0x7f9335667d00 () - default mnt lang not found: "en"
[D] 13:07:11.92 0x7f9335667d00 () - new default mnt lang: "en"
[D] 13:07:11.92 0x7f9335667d00 () - service refresh status, new state: busy
[D] 13:07:11.92 0x7f9335667d00 () - service state changed: unknown => busy
[D] 13:07:11.93 0x7f9335667d00 () - features availability ready
[W] 13:07:11.93 0x7f9335667d00 has_lib:477 - failed to open libcudart.so: libcudart.so: cannot open shared object file: No such file or directory
[W] 13:07:11.93 0x7f9335667d00 has_lib:477 - failed to open libcudnn.so: libcudnn.so: cannot open shared object file: No such file or directory
[W] 13:07:11.93 0x7f9335667d00 has_lib:477 - failed to open libcudnn.so.8: libcudnn.so.8: cannot open shared object file: No such file or directory
sh: line 1: perl: command not found
[W] 13:07:11.106 0x7f9335667d00 has_cuda:56 - failed to open whisper-cublas lib: libwhisper-cublas.so: cannot open shared object file: No such file or directory
[W] 13:07:11.106 0x7f9335667d00 has_hip:80 - failed to open whisper-hipblas lib: libwhisper-hipblas.so: cannot open shared object file: No such file or directory
[D] 13:07:11.126 0x7f9335667d00 () - updating model using availability
[D] 13:07:11.126 0x7f9335667d00 () - updating model using availability internal
[D] 13:07:11.127 0x7f9335667d00 () - default stt model not found: "en"
[D] 13:07:11.127 0x7f9335667d00 () - default tts model not found: "en"
[D] 13:07:11.127 0x7f9335667d00 () - default mnt lang not found: "en"
[D] 13:07:11.127 0x7f9335667d00 () - new default mnt lang: "en"
[D] 13:07:11.127 0x7f9335667d00 () - service refresh status, new state: not-configured
[D] 13:07:11.127 0x7f9335667d00 () - service state changed: busy => not-configured
[D] 13:07:11.127 0x7f9335667d00 () - scan cuda: true
[D] 13:07:11.127 0x7f9335667d00 () - scan hip: true
[D] 13:07:11.127 0x7f9335667d00 () - scan opencl: true false
[D] 13:07:11.127 0x7f9335667d00 add_cuda_devices:281 - scanning for cuda devices
[W] 13:07:11.127 0x7f9335667d00 cuda_api:168 - failed to open cudart lib: libcudart.so: cannot open shared object file: No such file or directory
[D] 13:07:11.128 0x7f9335667d00 add_hip_devices:318 - scanning for hip devices
[W] 13:07:11.128 0x7f9335667d00 hip_api:223 - failed to open hip lib: libamdhip64.so: cannot open shared object file: No such file or directory
[D] 13:07:11.128 0x7f9335667d00 add_opencl_devices:357 - scanning for opencl devices

I use Intel's OOT i915 driver because the upstream kernel does not have complete support for the Intel Xe DG1 GPU: the upstream kernel works with X11, but hardware acceleration is missing. I originally did this for ffmpeg QSV hardware encoding; ffmpeg and Jellyfin are confirmed working.

Only Speech Note flatpak or any flatpak app?

This is what I meant by the missing Mesa GL-related library files.

As you can see, Jellyfin added a line for DG1 support in their flatpak, so the Jellyfin flatpak works. I also tried the OBS Studio flatpak, and it exits without the GUI showing up, just like dsnote. I guess this is because both OBS Studio and dsnote currently rely on the standard flatpak runtime, which does not have the patched GL library files.

For example, Intel maintains a page listing the special Mesa GL-related packages they compile:

https://dgpu-docs.intel.com/releases/LTS_803.45_20240426.html

The crucial package is libgl1-mesa-dri. A couple of months ago, when I was experimenting with Debian because I didn't want to switch to Ubuntu, I installed just this package, pulled in the related dependencies, and DG1 hardware encoding started working on Debian.

Edit: Another path would be to skip i915 and the OOT driver altogether and rely on the standard runtime. Linux kernel 6.8 introduced a new Xe driver for Intel GPUs, and Debian currently has 6.8.x in the unstable branch. I have not tried the unstable Debian kernel yet, though.

https://www.kernel.org/doc/html//next/gpu/rfc/xe.html
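
To tell which of the two paths a given machine is on, it can help to check the running kernel version and which Intel GPU kernel module is loaded. A minimal sketch follows, assuming loaded modules appear as directories under /sys/module; the helper names are hypothetical. It only reports what is loaded and does not confirm that the Xe driver actually drives a DG1 card.

```python
import os
import platform
import re


def kernel_at_least(release, major, minor):
    """Return True if a release string such as '6.8.9-amd64' is >= major.minor."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        return False
    return (int(match.group(1)), int(match.group(2))) >= (major, minor)


def loaded_intel_gpu_modules(sys_module_dir="/sys/module"):
    """Return which of the i915 / xe modules have an entry under /sys/module."""
    return [
        name
        for name in ("i915", "xe")
        if os.path.isdir(os.path.join(sys_module_dir, name))
    ]


if __name__ == "__main__":
    release = platform.release()
    print(f"kernel release: {release}")
    print(f"kernel is 6.8 or newer: {kernel_at_least(release, 6, 8)}")
    print(f"Intel GPU modules loaded: {loaded_intel_gpu_modules() or 'none'}")
```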

h9j6k commented 4 months ago

whisper.cpp can use OpenVINO though, which can use Intel's GPUs

Have you tried using whisper.cpp with OpenVINO? Is it working? Do you need any special configuration on your system to make it work?

I googled OpenVINO + Intel GPU and found these. Would they help with dsnote integration?

https://community.intel.com/t5/Blogs/Tech-Innovation/Artificial-Intelligence-AI/Accelerate-Workloads-with-OpenVINO-and-OneDNN/post/1503867
https://blog.openvino.ai/blog-posts/techniques-for-faster-ai-inference-throughput-with-openvino-on-intel-gpus
https://blog.openvino.ai/blog-posts/optimizing-whisper-and-distil-whisper-for-speech-recognition-with-openvino-and-nncf
https://huggingface.co/Intel/whisper.cpp-openvino-models/tree/main

Intel's original models are huge, and someone did some work to trim them down a little (they are still huge):

https://github.com/zhuzilin/whisper-openvino
https://huggingface.co/twdragon/whisper.cpp-openvino/tree/main

barolo commented 4 months ago

whisper.cpp can use OpenVINO though, which can use Intel's GPUs

Have you tried using whisper.cpp with OpenVINO? Is it working? Do you need any special configuration on your system to make it work?

Yes, I have it working. I'm using it to live-transcribe podcasts in the MPV player, among other things, using the GPU. It's much more efficient. I set it up some time ago; I had to compile Whisper with support for OV. Then there's a command-line switch to pick the device, and that was it.
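
For readers who want to try the same setup, here is a rough sketch of what such an invocation might look like. It is an assumption on my part, not something stated in this thread, that the switch in question is whisper.cpp's --ov-e-device option and that the build was configured with WHISPER_OPENVINO enabled. The binary, model and audio paths are hypothetical placeholders; check the binary's --help output before relying on any of this.

```python
import shlex


def whisper_openvino_cmd(binary, model, audio, device="GPU"):
    """Build the argument list for a whisper.cpp run with an OpenVINO device.

    Assumption: the binary was built with OpenVINO support and accepts
    --ov-e-device to select the device used for the encoder.
    """
    return [binary, "-m", model, "-f", audio, "--ov-e-device", device]


# Hypothetical paths, for illustration only.
cmd = whisper_openvino_cmd("./main", "models/ggml-base.en.bin", "samples/jfk.wav")
print(shlex.join(cmd))
```

The list form can be passed directly to subprocess.run, which avoids shell quoting problems with paths that contain spaces.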

mkiol commented 4 months ago

Yes, I have it working.

Thanks for the confirmation. I will try to add support for OpenVINO.

Since whisper.cpp is already in Speech Note in 4 different versions (OpenBLAS, OpenBLAS+AVX, CUDA and hipBLAS), I don't expect that adding another one would be very challenging 😄

The only potential problem I see is Flatpak sandboxing. That needs to be verified, but OpenVINO might simply not work due to the isolation that Flatpak provides.

mkiol commented 3 months ago

Hey, letting you know about related changes added in version 4.6.0 Beta 1:

image

barolo commented 3 months ago

  • I don't have a system with a discrete Intel GPU, so I haven't been able to test this, but with an integrated Intel GPU the speed results are similar to the Intel CPU or worse. It seems that the iGPU does not provide anything of value for STT when using OpenVINO. At least these are my observations.

The value of using GPU is to make use of it and to offload CPU; I don't care about speed in that case, that's not the point. It should be more energy efficient too, which matters on mobile devices.

mkiol commented 3 months ago

The value of using GPU is to make use of it and to offload CPU

Fair point. Indeed, it may be an advantage.

mkiol commented 1 month ago

Just letting you know that v4.6.0 is out with "OpenVINO CPU-only" support. I will try to add GPU support in the next version(s).

The new version should be available in Flathub tomorrow.