mkiol / dsnote

Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.
Mozilla Public License 2.0

Crashes when clicking listen with any whisper model #133

Closed by 31337-4554551n 1 month ago

31337-4554551n commented 2 months ago

After downloading any of the Whisper models and clicking Listen, the app crashes. The DeepSpeech, april-asr, and Vosk models run fine. I would really appreciate some help, as the Whisper models are the high-quality ones with punctuation support.

Speech Note 4.4.0 (Flatpak)
OS: Pop!_OS 22.04 LTS x86_64
DE: GNOME 42.9
GPU: AMD ATI 76:00.0 Rembrandt
GPU: AMD ATI Radeon RX 6700/6700 XT

31337-4554551n commented 2 months ago

ggml_opencl: selecting platform: 'Clover'
ggml_opencl: selecting device: 'AMD Radeon RX 6700 XT (radeonsi, navi22, LLVM 15.0.7, DRM 3.57, 6.8.0-76060800daily20240311-generic)'
ggml_opencl: device FP16 support: false
ggml_opencl: kernel compile error:

fatal error: cannot open file '/usr/lib/x86_64-linux-gnu/GL/default/share/clc/gfx1031-amdgcn-mesa-mesa3d.bc': No such file or directory

[W] 01:39:57.399 0x79bffa200600 () - QObject::killTimer: Timers cannot be stopped from another thread
[W] 01:39:57.399 0x79bffa200600 () - QObject::~QObject: Timers cannot be stopped from another thread
[W] 01:39:57.400 0x79bffa200600 () - QObject::~QObject: Timers cannot be stopped from another thread
[W] 01:39:57.400 0x79bffa200600 () - QHotkeyPrivate destroyed with registered shortcuts!
[D] 01:39:58.5 0x79c1f3c60d00 () - using audio input: "alsa_input.usb-046d_C922_Pro_Stream_Webcam_E37978DF-02.iec958-stereo"
[D] 01:39:58.91 0x79c1f3c60d00 () - audio state: IdleState
[D] 01:39:58.91 0x79c1f3c60d00 set_speech_started:515 - speech started: false => true
[D] 01:39:58.91 0x79c1f3c60d00 () - service refresh status, new state: listening-auto
[D] 01:39:58.91 0x79c1f3c60d00 () - service state changed: idle => listening-auto
[W] 01:39:58.91 0x79c1f3c60d00 () - ignore TaskStatePropertyChanged signal
[D] 01:39:58.91 0x79c1f3c60d00 () - app current task: -1 => 0
[D] 01:39:58.91 0x79c1f3c60d00 () - app speech state: idle => initializing
Segmentation fault (core dumped)
[📦 net.mkiol.SpeechNote ~]$

mkiol commented 2 months ago

Hi. Thanks for the report.

The app crashes because GPU acceleration with OpenCL (Clover implementation) doesn't work with your graphics card. This is absolutely not a surprise, because only a few older AMD cards support it.

If you want to use GPU acceleration with Whisper, install the "Speech Note AMD" add-on, which enables direct AMD ROCm support and a better OpenCL implementation. If you don't want to use the add-on, you can just disable GPU acceleration in the settings ("Speech to text" tab) and use Whisper models with the CPU only.

To install the AMD add-on, use the following command:

flatpak install net.mkiol.SpeechNote.Addon.amd
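
Editor's note: when reading a log like the one posted above, it can help to check mechanically which OpenCL platform was selected and whether kernel compilation failed. The following is only a minimal sketch, not part of Speech Note: the function name `diagnose_opencl` and the keys of the returned dictionary are hypothetical, and the only assumption about the log format is the `ggml_opencl:` lines exactly as they appear in this thread.

```python
import re

# Patterns copied from the ggml_opencl log lines posted in this thread.
PLATFORM_RE = re.compile(r"ggml_opencl: selecting platform: '([^']*)'")
COMPILE_ERR = "ggml_opencl: kernel compile error"


def diagnose_opencl(log_text: str) -> dict:
    """Report the selected OpenCL platform and whether kernel compilation failed."""
    match = PLATFORM_RE.search(log_text)
    platform = match.group(1) if match else None
    return {
        "platform": platform,
        "kernel_compile_error": COMPILE_ERR in log_text,
        # Clover is the Mesa OpenCL implementation involved in this report.
        "uses_clover": platform == "Clover",
    }


# Sample input: a shortened excerpt of the log from this issue.
sample = """ggml_opencl: selecting platform: 'Clover'
ggml_opencl: device FP16 support: false
ggml_opencl: kernel compile error:
"""
print(diagnose_opencl(sample))
```

If `uses_clover` and `kernel_compile_error` are both true, the log matches the situation described in this comment.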
31337-4554551n commented 2 months ago

Hi, thank you for getting back to me.

The add-on was installed from the start; the issue persists.

Speech Note net.mkiol.SpeechNote 4.4.0 stable user
Speech Note AMD net.mkiol.SpeechNote.Addon.amd 1.0.0 stable user

mkiol commented 2 months ago

Could you please paste the full log from app start here?

flatpak run net.mkiol.SpeechNote --verbose

In the settings, on the "Speech to Text" tab, you should see a "Graphics card" combo box. What options do you see? Try setting "ROCm" or "OpenCL", but not "OpenCL, Clover".

31337-4554551n commented 2 months ago

[Screenshot-20240517235724-1237x176]

I see this; it's been set to Auto this whole time.

31337-4554551n commented 2 months ago
Gtk-Message: 23:59:52.880: Failed to load module "canberra-gtk-module"
Gtk-Message: 23:59:52.881: Failed to load module "canberra-gtk-module"
qt.qpa.qgnomeplatform: Could not find color scheme  ""
Qt: Session management error: Could not open network socket
[I] 23:59:52.911 0x77cbb2751d00 init:49 - logging to stderr enabled
[D] 23:59:52.911 0x77cbb2751d00 () - version: 4.4.0
[D] 23:59:52.912 0x77cbb2751d00 parse_cpuinfo:117 - cpu flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm debug_swap
[D] 23:59:52.912 0x77cbb2751d00 parse_cpuinfo:125 - cpuinfo: processor-count=16, flags=[avx, avx2, fma, f16c, ]
[D] 23:59:52.912 0x77cbb2751d00 () - failed to load translation: "C" ":/translations"
[W] 23:59:52.912 0x77cbb2751d00 () - failed to install translation
[D] 23:59:52.912 0x77cbb2751d00 () - starting standalone app
[D] 23:59:52.914 0x77cbb2751d00 () - app: net.mkiol dsnote
[D] 23:59:52.914 0x77cbb2751d00 () - config location: "[redacted].var/app/net.mkiol.SpeechNote/config"
[D] 23:59:52.914 0x77cbb2751d00 () - data location: "[redacted].var/app/net.mkiol.SpeechNote/data/net.mkiol/dsnote"
[D] 23:59:52.914 0x77cbb2751d00 () - cache location: "[redacted].var/app/net.mkiol.SpeechNote/cache/net.mkiol/dsnote"
[D] 23:59:52.914 0x77cbb2751d00 () - settings file: "[redacted].var/app/net.mkiol.SpeechNote/config/net.mkiol/dsnote/settings.conf"
[D] 23:59:52.914 0x77cbb2751d00 () - platform: "xcb"
[D] 23:59:52.914 0x77cbb2751d00 () - amd addon exists
[D] 23:59:52.914 0x77cbb2751d00 () - enforcing num threads: 0
[D] 23:59:53.387 0x77cbb2751d00 () - supported audio input devices:
ALSA lib ../../oss/pcm_oss.c:397:(_snd_pcm_oss_open) Cannot open device /dev/dsp
[D] 23:59:53.401 0x77cbb2751d00 () - "pulse"
[D] 23:59:53.481 0x77cbb2751d00 () - "default"
ALSA lib ../../../src/pcm/pcm_direct.c:2045:(snd1_pcm_direct_parse_open_conf) The field ipc_gid must be a valid group (create group audio)
ALSA lib ../../../src/pcm/pcm_direct.c:2045:(snd1_pcm_direct_parse_open_conf) The field ipc_gid must be a valid group (create group audio)
ALSA lib ../../../src/pcm/pcm_direct.c:2045:(snd1_pcm_direct_parse_open_conf) The field ipc_gid must be a valid group (create group audio)
ALSA lib ../../../src/pcm/pcm_direct.c:2045:(snd1_pcm_direct_parse_open_conf) The field ipc_gid must be a valid group (create group audio)
[D] 23:59:53.507 0x77cbb2751d00 () - "alsa_input.usb-046d_C922_Pro_Stream_Webcam_E37978DF-02.iec958-stereo"
[D] 23:59:53.507 0x77cbb2751d00 () - "alsa_output.usb-Kingston_HyperX_Virtual_Surround_Sound_00000000-00.iec958-stereo.monitor"
[D] 23:59:53.507 0x77cbb2751d00 () - "alsa_input.usb-Kingston_HyperX_Virtual_Surround_Sound_00000000-00.iec958-stereo"
[D] 23:59:53.507 0x77cbb2751d00 () - "alsa_output.pci-0000_76_00.6.analog-stereo.monitor"
[D] 23:59:53.507 0x77cbb2751d00 () - "alsa_input.pci-0000_76_00.6.analog-stereo"
[D] 23:59:53.507 0x77cbb2751d00 () - "alsa_output.usb-0d8c_USB_Sound_Device-00.analog-surround-21.monitor"
[D] 23:59:53.507 0x77cbb2751d00 () - "alsa_input.usb-0d8c_USB_Sound_Device-00.iec958-stereo"
[D] 23:59:53.507 0x77cbb2751d00 () - "alsa_output.pci-0000_03_00.1.hdmi-stereo.monitor"
[D] 23:59:53.526 0x77cbb2751d00 () - starting service: app-standalone
[D] 23:59:53.528 0x77cbb2751d00 () - mbrola dir: "/app/bin"
[D] 23:59:53.528 0x77cbb2751d00 () - espeak dir: "/app/bin"
[D] 23:59:53.528 0x77cba6a00600 loop:75 - py executor loop started
[D] 23:59:53.533 0x77cbb2751d00 () - module already unpacked: "rhvoicedata"
[D] 23:59:53.533 0x77cbb2751d00 () - module already unpacked: "rhvoiceconfig"
[D] 23:59:53.538 0x77cbb2751d00 () - module already unpacked: "espeakdata"
[D] 23:59:53.538 0x77cbb2751d00 () - default stt model not found: "en_whisper_tiny"
[D] 23:59:53.538 0x77cbb2751d00 () - default tts model not found: "en_piper_us_ryan_high"
[D] 23:59:53.538 0x77cbb2751d00 () - default mnt lang not found: "en"
[D] 23:59:53.538 0x77cbb2751d00 () - new default mnt lang: "en"
[D] 23:59:53.538 0x77cbb2751d00 () - service refresh status, new state: busy
[D] 23:59:53.538 0x77cbb2751d00 () - service state changed: unknown => busy
[D] 23:59:53.538 0x77cbb2751d00 () - delaying features availability
[D] 23:59:53.541 0x77cbb2751d00 () - runtime prefix: "/app"
[D] 23:59:53.541 0x77cbb2751d00 () - available styles: ("Default", "Fusion", "Imagine", "Material", "org.kde.breeze", "org.kde.desktop", "Plasma", "Universal")
[D] 23:59:53.541 0x77cbb2751d00 () - style paths: ("/usr/lib/qml/QtQuick/Controls.2")
[D] 23:59:53.541 0x77cbb2751d00 () - import paths: ("/usr/lib/qml", "/app/bin", "qrc:/qt-project.org/imports")
[D] 23:59:53.541 0x77cbb2751d00 () - library paths: ("/usr/share/runtime/lib/plugins", "/usr/lib/plugins", "/app/bin")
[D] 23:59:53.541 0x77cba7400600 () - config version: 65 65
[D] 23:59:53.541 0x77cbb2751d00 () - switching to style: "Fusion"
[W] 23:59:53.543 0x77cba7400600 () - checksum mismatch: "edb3bae5" (expected: "9876206e" ) "en_fasterwhisper_distil_large2"
[D] 23:59:53.544 0x77cba6a00600 libs_availability:61 - checking: torch cuda
[D] 23:59:53.579 0x77cba7400600 () - models changed
[D] 23:59:54.247 0x77cbb2751d00 () - starting app: app-standalone
[D] 23:59:54.248 0x77cbb2751d00 () - app service state: unknown => busy
logger error: invalid format string
qrc:/qml/main.qml:340:5: QML Connections: Implicitly defined onFoo properties in Connections are deprecated. Use this syntax instead: function onFoo(<arguments>) { ... }
logger error: invalid format string
qrc:/qml/main.qml:331:5: QML Connections: Implicitly defined onFoo properties in Connections are deprecated. Use this syntax instead: function onFoo(<arguments>) { ... }
logger error: invalid format string
qrc:/qml/Notepad.qml:24:5: QML Connections: Implicitly defined onFoo properties in Connections are deprecated. Use this syntax instead: function onFoo(<arguments>) { ... }
logger error: invalid format string
qrc:/qml/Translator.qml:29:5: QML Connections: Implicitly defined onFoo properties in Connections are deprecated. Use this syntax instead: function onFoo(<arguments>) { ... }
logger error: invalid format string
qrc:/qml/MainToolBar.qml:282:13: QML Connections: Implicitly defined onFoo properties in Connections are deprecated. Use this syntax instead: function onFoo(<arguments>) { ... }
[D] 23:59:54.431 0x77cbb2751d00 onCompleted:180 - default font pixel size: 13
[D] 23:59:54.444 0x77cbb2751d00 () - service refresh status, new state: busy
[D] 23:59:54.444 0x77cbb2751d00 () - service refresh status, new state: busy
[D] 23:59:54.451 0x77cbb2751d00 () - stt models changed
[D] 23:59:54.451 0x77cbb2751d00 () - update listen
[D] 23:59:54.451 0x77cbb2751d00 () - app stt configured: false => true
[D] 23:59:54.453 0x77cbb2751d00 () - app active stt model: "" => "en_whisper_tiny"
[D] 23:59:54.453 0x77cbb2751d00 () - update listen
[D] 23:59:54.453 0x77cbb2751d00 () - tts models changed
[D] 23:59:54.453 0x77cbb2751d00 () - update listen
[D] 23:59:54.453 0x77cbb2751d00 () - app tts configured: false => true
[D] 23:59:54.453 0x77cbb2751d00 () - app active tts model: "" => "en_piper_us_ryan_high"
[D] 23:59:54.453 0x77cbb2751d00 () - update listen
[W] 23:59:54.453 0x77cbb2751d00 () - no available tts models for in mnt
[W] 23:59:54.453 0x77cbb2751d00 () - no available tts models for out mnt
[D] 23:59:54.453 0x77cbb2751d00 () - ttt models changed
[D] 23:59:54.453 0x77cbb2751d00 () - app ttt configured: false => true
[D] 23:59:54.459 0x77cbb2751d00 () - mnt langs changed
[D] 23:59:54.459 0x77cbb2751d00 () - update listen
[D] 23:59:54.459 0x77cbb2751d00 () - app mnt configured: false => true
[D] 23:59:54.460 0x77cbb2751d00 () - app active mnt lang: "" => "en"
[D] 23:59:54.460 0x77cbb2751d00 () - app mnt available out langs: 0 => 1
[D] 23:59:54.460 0x77cbb2751d00 () - app tts available models for in mnt: 0 => 3
[D] 23:59:54.460 0x77cbb2751d00 () - app active tts model for in mnt: "" => "en_piper_us_ryan_high"
[D] 23:59:54.460 0x77cbb2751d00 () - app active mnt out lang: "" => "ru"
[W] 23:59:54.460 0x77cbb2751d00 () - no available tts models for out mnt
[D] 23:59:54.643 0x77cbb2751d00 () - trying features availability update: false
[D] 23:59:55.194 0x77cba6a00600 libs_availability:69 - checking: coqui tts
[D] 23:59:55.194 0x77cba6a00600 libs_availability:77 - checking: faster-whisper
[D] 23:59:55.641 0x77cbb2751d00 () - trying features availability update: false
[D] 23:59:55.648 0x77cba6a00600 libs_availability:85 - checking: transformers
[D] 23:59:55.648 0x77cba6a00600 libs_availability:87 - checking: accelerate
[D] 23:59:56.122 0x77cba6a00600 libs_availability:95 - checking: unikud
[D] 23:59:56.123 0x77cba6a00600 libs_availability:106 - checking: mimic3 tts
[D] 23:59:56.641 0x77cbb2751d00 () - trying features availability update: false
[D] 23:59:56.800 0x77cba6a00600 libs_availability:114 - checking: gruut
[D] 23:59:56.800 0x77cba6a00600 libs_availability:118 - checking: gruut-de
[D] 23:59:56.800 0x77cba6a00600 libs_availability:126 - checking: gruut-es
[D] 23:59:56.800 0x77cba6a00600 libs_availability:134 - checking: gruut-fr
[D] 23:59:56.801 0x77cba6a00600 libs_availability:142 - checking: gruut-it
[D] 23:59:56.801 0x77cba6a00600 libs_availability:150 - checking: gruut-ru
[D] 23:59:56.801 0x77cba6a00600 libs_availability:158 - checking: gruut-fa
[D] 23:59:56.801 0x77cba6a00600 libs_availability:166 - checking: gruut-sw
[D] 23:59:56.801 0x77cba6a00600 libs_availability:174 - checking: gruut-nl
[D] 23:59:56.802 0x77cba6a00600 libs_availability:185 - checking: mecab
[D] 23:59:56.804 0x77cba6a00600 libs_availability:187 - checking: unidic-lite
[D] 23:59:56.805 0x77cba6a00600 libs_availability:194 - py libs availability: [coqui-tts=true, faster-whisper=true, mimic3-tts=true, transformers=true, unikud=true, gruut_de=true, gruut_es=true, gruut_fa=true, gruut_fr=true, gruut_nl=true, gruut_it=true, gruut_ru=true, gruut_sw=true, mecab=true, torch-cuda=false]
[D] 23:59:57.642 0x77cbb2751d00 () - trying features availability update: true
[D] 23:59:57.642 0x77cbb2751d00 () - features availability ready
[W] 23:59:57.643 0x77cbb2751d00 has_lib:477 - failed to open libcudart.so: libcudart.so: cannot open shared object file: No such file or directory
[W] 23:59:57.643 0x77cbb2751d00 has_lib:477 - failed to open libcudnn.so: libcudnn.so: cannot open shared object file: No such file or directory
[W] 23:59:57.643 0x77cbb2751d00 has_lib:477 - failed to open libcudnn.so.8: libcudnn.so.8: cannot open shared object file: No such file or directory
[W] 23:59:57.671 0x77cbb2751d00 has_cuda:56 - failed to open whisper-cublas lib: libwhisper-cublas.so: cannot open shared object file: No such file or directory
[D] 23:59:57.679 0x77cbb2751d00 () - updating model using availability
[D] 23:59:57.679 0x77cbb2751d00 () - updating model using availability internal
[D] 23:59:57.681 0x77cbb2751d00 () - service refresh status, new state: idle
[D] 23:59:57.681 0x77cbb2751d00 () - service state changed: busy => idle
[D] 23:59:57.681 0x77cbb2751d00 () - scan cuda: false
[D] 23:59:57.681 0x77cbb2751d00 () - scan hip: true
[D] 23:59:57.681 0x77cbb2751d00 () - scan opencl: true false
[D] 23:59:57.681 0x77cbb2751d00 add_hip_devices:318 - scanning for hip devices
[D] 23:59:57.685 0x77cbb2751d00 add_hip_devices:327 - hip version: driver=50631062, runtime=50631062
[W] 23:59:57.685 0x77cbb2751d00 add_hip_devices:332 - hipGetDeviceCount returned: 100
[D] 23:59:57.686 0x77cbb2751d00 add_opencl_devices:357 - scanning for opencl devices
[D] 23:59:57.722 0x77cbb2751d00 add_opencl_devices:374 - opencl number of platforms: 2
[D] 23:59:57.722 0x77cbb2751d00 add_opencl_devices:399 - opencl platform: 0, name=Clover, vendor=Mesa
[D] 23:59:57.722 0x77cbb2751d00 add_opencl_devices:413 - opencl number of devices: 2
[D] 23:59:57.722 0x77cbb2751d00 add_opencl_devices:437 - opencl device: 0, platform name=Clover, device name=AMD Radeon RX 6700 XT (radeonsi, navi22, LLVM 15.0.7, DRM 3.57, 6.8.0-76060800daily20240311-generic), types=[GPU, ]
[D] 23:59:57.722 0x77cbb2751d00 add_opencl_devices:437 - opencl device: 1, platform name=Clover, device name=AMD Radeon Graphics (radeonsi, rembrandt, LLVM 15.0.7, DRM 3.57, 6.8.0-76060800daily20240311-generic), types=[GPU, ]
[D] 23:59:57.722 0x77cbb2751d00 add_opencl_devices:399 - opencl platform: 1, name=AMD Accelerated Parallel Processing, vendor=Advanced Micro Devices, Inc.
[D] 23:59:57.722 0x77cbb2751d00 add_opencl_devices:413 - opencl number of devices: 0
[D] 23:59:57.722 0x77cbb2751d00 () - service refresh status, new state: idle
[D] 23:59:57.722 0x77cbb2751d00 () - app service state: busy => idle
[W] 23:59:57.727 0x77cbb2751d00 () - no available tts models for out mnt
[W] 23:59:57.727 0x77cbb2751d00 () - invalid task, reseting task state
[D] 23:59:57.727 0x77cbb2751d00 () - app busy: true => false
[D] 23:59:57.727 0x77cbb2751d00 () - stt models changed
[D] 23:59:57.728 0x77cbb2751d00 () - update listen
[D] 23:59:57.728 0x77cbb2751d00 () - tts models changed
[D] 23:59:57.728 0x77cbb2751d00 () - update listen
[W] 23:59:57.728 0x77cbb2751d00 () - no available tts models for out mnt
[D] 23:59:57.728 0x77cbb2751d00 () - ttt models changed
[D] 23:59:57.731 0x77cbb2751d00 () - mnt langs changed
[D] 23:59:57.731 0x77cbb2751d00 () - update listen
[D] 00:00:00.736 0x77cbb2751d00 () - stt start listen
[D] 00:00:00.737 0x77cbb2751d00 () - choosing model for id: "en_whisper_tiny" "en"
[D] 00:00:00.737 0x77cbb2751d00 () - found ttt model for stt: "en_hftc_kredor"
[D] 00:00:00.737 0x77cbb2751d00 () - gpu device str: ("OpenCL", " Clover", " AMD Radeon RX 6700 XT (radeonsi", " navi22", " LLVM 15.0.7", " DRM 3.57", " 6.8.0-76060800daily20240311-generic)")
[D] 00:00:00.737 0x77cbb2751d00 () - restart stt engine config: "lang=en, lang_code=, model-files=[model-file=[redacted].var/app/net.mkiol.SpeechNote/cache/net.mkiol/dsnote/speech-models/en_whisper_tiny.ggml, scorer-file=, ttt-model-file=[redacted].var/app/net.mkiol.SpeechNote/cache/net.mkiol/dsnote/speech-models/multilang_hftc_kredor], speech-mode=automatic, vad-mode=aggressiveness-3, speech-started=0, text-format=raw, options=, use-gpu=1, gpu-device=[id=0, api=opencl, name=AMD Radeon RX 6700 XT (radeonsi, platform-name=Clover], sub-config=[min-segment-dur=4, min-line-length=0, max-line-length=0]"
[D] 00:00:00.737 0x77cbb2751d00 () - new stt engine required
[D] 00:00:00.737 0x77cbb2751d00 open_whisper_lib:135 - using whisper-clblast
[D] 00:00:00.738 0x77cbb2751d00 make_wparams:429 - cpu info: arch=x86_64, cores=16
[D] 00:00:00.738 0x77cbb2751d00 make_wparams:431 - using threads: 5/16
[D] 00:00:00.738 0x77cbb2751d00 make_wparams:433 - system info: AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | METAL = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 | 
[D] 00:00:00.738 0x77cbb2751d00 start:225 - starting engine
[D] 00:00:00.738 0x77cbb2751d00 start:234 - engine started
[D] 00:00:00.738 0x77cbb2751d00 () - creating audio source
[D] 00:00:00.738 0x77cbb2751d00 () - mic source created
[D] 00:00:00.739 0x77c9b1600600 start_processing:271 - processing started
[D] 00:00:00.739 0x77c9b1600600 set_processing_state:457 - processing state: idle => initializing
[D] 00:00:00.739 0x77c9b1600600 set_processing_state:464 - speech detection status: no-speech => initializing (no-speech)
[D] 00:00:00.739 0x77c9b1600600 () - service refresh status, new state: idle
[D] 00:00:00.739 0x77c9b1600600 () - task state changed: 0 => 3
[D] 00:00:00.739 0x77c9b1600600 create_whisper_model:239 - creating whisper model
whisper_init_from_file_with_params_no_state: loading model from '[redacted].var/app/net.mkiol.SpeechNote/cache/net.mkiol/dsnote/speech-models/en_whisper_tiny.ggml'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 384
whisper_model_load: n_audio_head  = 6
whisper_model_load: n_audio_layer = 4
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 384
whisper_model_load: n_text_head   = 6
whisper_model_load: n_text_layer  = 4
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 9
whisper_model_load: qntvr         = 2
whisper_model_load: type          = 1 (tiny)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: n_langs       = 99
ggml_opencl: selecting platform: 'Clover'
ggml_opencl: selecting device: 'AMD Radeon RX 6700 XT (radeonsi, navi22, LLVM 15.0.7, DRM 3.57, 6.8.0-76060800daily20240311-generic)'
ggml_opencl: device FP16 support: false
ggml_opencl: kernel compile error:

fatal error: cannot open file '/usr/lib/x86_64-linux-gnu/GL/default/share/clc/gfx1031-amdgcn-mesa-mesa3d.bc': No such file or directory

[W] 00:00:00.780 0x77c9b1600600 () - QObject::killTimer: Timers cannot be stopped from another thread
[W] 00:00:00.780 0x77c9b1600600 () - QObject::~QObject: Timers cannot be stopped from another thread
[W] 00:00:00.780 0x77c9b1600600 () - QObject::~QObject: Timers cannot be stopped from another thread
[W] 00:00:00.780 0x77c9b1600600 () - QHotkeyPrivate destroyed with registered shortcuts!
[D] 00:00:01.247 0x77cbb2751d00 () - using audio input: "alsa_input.usb-046d_C922_Pro_Stream_Webcam_E37978DF-02.iec958-stereo"
[D] 00:00:01.402 0x77cbb2751d00 () - audio state: IdleState
[D] 00:00:01.402 0x77cbb2751d00 set_speech_started:515 - speech started: false => true
[D] 00:00:01.402 0x77cbb2751d00 () - service refresh status, new state: listening-auto
[D] 00:00:01.402 0x77cbb2751d00 () - service state changed: idle => listening-auto
[W] 00:00:01.402 0x77cbb2751d00 () - ignore TaskStatePropertyChanged signal
[D] 00:00:01.402 0x77cbb2751d00 () - app current task: -1 => 0
[D] 00:00:01.402 0x77cbb2751d00 () - app speech state: idle => initializing
mkiol commented 2 months ago

Thanks for the log.

[D] 23:59:57.681 0x77cbb2751d00 add_hip_devices:318 - scanning for hip devices
[D] 23:59:57.685 0x77cbb2751d00 add_hip_devices:327 - hip version: driver=50631062, runtime=50631062
[W] 23:59:57.685 0x77cbb2751d00 add_hip_devices:332 - hipGetDeviceCount returned: 100

I don't know why, but the ROCm runtime reports an error. Error code 100 means "No device detected".
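
Editor's note: as a small illustration of reading that status, the sketch below extracts the number from the `hipGetDeviceCount returned:` log line and looks it up in a table. The helper name `explain_hip_status` is hypothetical, and the table deliberately lists only two codes (0 as `hipSuccess` and 100 as `hipErrorNoDevice`); anything else is reported as unknown rather than guessed.

```python
import re
from typing import Optional

# Deliberately tiny lookup table: only codes stated with confidence.
HIP_STATUS_NAMES = {
    0: "hipSuccess",
    100: "hipErrorNoDevice",
}

HIP_COUNT_RE = re.compile(r"hipGetDeviceCount returned: (\d+)")


def explain_hip_status(log_line: str) -> Optional[str]:
    """Return '<code>: <name>' for a hipGetDeviceCount log line, or None if absent."""
    match = HIP_COUNT_RE.search(log_line)
    if match is None:
        return None
    code = int(match.group(1))
    return f"{code}: {HIP_STATUS_NAMES.get(code, 'unknown status')}"


# Log line copied from this thread.
line = "[W] 23:59:57.685 0x77cbb2751d00 add_hip_devices:332 - hipGetDeviceCount returned: 100"
print(explain_hip_status(line))  # prints "100: hipErrorNoDevice"
```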

I'm planning to push a new release today (v4.5.0). The updated version comes with a newer ROCm that may resolve the problem.

I will let you know when the new version is available on Flathub.

31337-4554551n commented 2 months ago

Hi, I downloaded your latest release. The models no longer crash, but it's now worse, as it's not detecting the GPU at all.

[image]

This is in contrast to my last screenshot, where you could see the GPU being detected.

Unfortunately this makes it a lot less usable with the crazy new processing times :(

31337-4554551n commented 2 months ago

> Hi, thank you for getting back to me.
>
> The add-on was installed from the start; the issue persists.
>
> Speech Note net.mkiol.SpeechNote 4.4.0 stable user
> Speech Note AMD net.mkiol.SpeechNote.Addon.amd 1.0.0 stable user

It may be add-on related. The above is what I had previously; now I have:

Speech Note net.mkiol.SpeechNote 4.5.0 stable user
amd net.mkiol.SpeechNote.Addon.amd stable user

The add-on is now called "amd" rather than "Speech Note AMD", it has no version number, and I can no longer find it in the Pop store (which searches Flathub, I believe).

Flathub/pop store lists jupil as another app by you, but no mention of the addons. Have they somehow dropped off flathub?

flatpak remove net.mkiol.SpeechNote.Addon.amd
flatpak install net.mkiol.SpeechNote.Addon.amd

This works, even though a Flathub search for "speechnote" finds only the primary application.

Reinstalling gives the same result. Shows up as amd, no version, doesn't work.

mkiol commented 2 months ago

> Flathub/pop store lists jupil as another app by you, but no mention of the addons. Have they somehow dropped off flathub?

OMG!!! I messed up something with the new update. Add-ons are no longer discoverable. Now I have to fix it.

Thank you for noticing this.

mkiol commented 2 months ago

Issue for tracking "missing add-ons": https://github.com/mkiol/dsnote/issues/139

31337-4554551n commented 2 months ago

Thank you. Are the two issues related: the add-ons not being found, and the add-on not being usable by the main software?

mkiol commented 2 months ago

> Hi, I downloaded your latest image. The models no longer crash, but it's now worse as it's not detecting the GPU at all [image]. In contrast to my last screenshot where you could see the GPU being detected. Unfortunately this makes it a lot less usable with the crazy new processing times :(

This is actually intended. The new version has Clover OpenCL disabled by default, and this is a good thing because Speech Note doesn't crash. Unfortunately, the initial problem, i.e. "AMD ROCm doesn't detect your GPU", still exists and I don't have any idea how to resolve it :(

> Thank you, are the issues of them not being found

The problem is only in discoverability via the Flathub webpage and the flatpak search tool. The add-on exists and works as intended, but you have to install it manually with flatpak install net.mkiol.SpeechNote.Addon.amd.
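
Editor's note: since the add-on cannot currently be found through search, one way to confirm it is really installed is to look for its application ID in the list of installed Flatpaks. The sketch below is only an illustration under stated assumptions: it expects the text produced by `flatpak list --columns=application` (one application ID per line) to be passed in as a string, and the helper name `addon_installed` is hypothetical.

```python
# Application ID of the AMD add-on, as given in this thread.
ADDON_ID = "net.mkiol.SpeechNote.Addon.amd"


def addon_installed(flatpak_list_output: str, app_id: str = ADDON_ID) -> bool:
    """Check for an exact application ID in `flatpak list --columns=application` output."""
    # Compare whole lines so that "net.mkiol.SpeechNote" alone does not
    # count as a match for the add-on ID.
    installed = {line.strip() for line in flatpak_list_output.splitlines()}
    return app_id in installed


# Illustrative input only; real output would come from running the command.
listing = """net.mkiol.SpeechNote
net.mkiol.SpeechNote.Addon.amd
"""
print(addon_installed(listing))
```

Matching whole lines rather than substrings keeps the main application from being mistaken for the add-on.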

31337-4554551n commented 2 months ago

OK.

So it used to crash for some models but worked for others. How can I re-enable Clover so that I can keep using the GPU with the models that do work with it?

mkiol commented 2 months ago

Oh, I didn't catch that it works for some models.

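To enable it, turn on the "Use OpenCL (Clover)" option in the settings ("Other" tab).

image: https://github.com/mkiol/dsnote/assets/1420902/7d7d4f94-788b-4b93-842b-bcc162ad3069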

31337-4554551n commented 2 months ago

Thank you, I'll try that after work.

So what is the solution to getting the whisper models/opencl working?

31337-4554551n commented 2 months ago

Thank you.

That takes things back to how they were originally. Everything works except for the Whisper models, which crash the program.

mkiol commented 2 months ago

Unfortunately, I don't think I can do anything more to attack this problem :/

It looks like your GPU is not supported in ROCm. In fact, this is not too surprising, as AMD ROCm officially supports only a few graphics cards. AMD is mainly focused on the enterprise market in their machine learning offerings. They don't care about consumer hardware when it comes to ML. This is my personal opinion.
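
Editor's note: the log posted earlier in this thread shows the same picture from the OpenCL side, where the "AMD Accelerated Parallel Processing" platform reports zero devices while Clover reports two. The sketch below tallies devices per platform from such log lines. It is only an illustration: the function name `opencl_device_counts` is hypothetical, and it assumes each `opencl platform:` line is followed by its own `opencl number of devices:` line, as in the posted log.

```python
import re

# Line formats taken from the add_opencl_devices entries in the posted log.
PLATFORM_RE = re.compile(r"opencl platform: (\d+), name=(.+?), vendor=(.+)$")
DEVICES_RE = re.compile(r"opencl number of devices: (\d+)")


def opencl_device_counts(log_text: str) -> dict:
    """Map each OpenCL platform name to the device count logged right after it."""
    counts = {}
    current = None  # name of the platform whose device count is still pending
    for line in log_text.splitlines():
        platform = PLATFORM_RE.search(line)
        if platform:
            current = platform.group(2)
            continue
        devices = DEVICES_RE.search(line)
        if devices and current is not None:
            counts[current] = int(devices.group(1))
            current = None
    return counts


# Log lines from this thread, shortened by dropping timestamps and thread IDs.
log = """add_opencl_devices:399 - opencl platform: 0, name=Clover, vendor=Mesa
add_opencl_devices:413 - opencl number of devices: 2
add_opencl_devices:399 - opencl platform: 1, name=AMD Accelerated Parallel Processing, vendor=Advanced Micro Devices, Inc.
add_opencl_devices:413 - opencl number of devices: 0
"""
print(opencl_device_counts(log))
```

A platform with a count of 0 was found but exposes no usable device, which is consistent with the ROCm runtime not detecting the card.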

31337-4554551n commented 2 months ago

Thank you. Could you please explain why other models work fine, but the Whisper ones don't? I guess I'm using some ROCm alternative to get the non-Whisper models working; how does that work?

mkiol commented 1 month ago

Sorry for the late reply.

> Could you please explain why other models work fine, but whisper ones don't?

In Speech Note, GPU acceleration for AMD cards is implemented only for Whisper models (via the whisper.cpp engine). Therefore, once it is enabled, only Whisper models are affected.

31337-4554551n commented 1 month ago

Right, so basically, with an AMD card I can't use GPU acceleration with Speech Note; is that what it comes down to? And if you enabled it for other models, it would just crash there too?

mkiol commented 1 month ago

> Right, so basically with an AMD card, I can't use GPU acceleration with speech note is what it comes down to?

Unfortunately, with GPU acceleration enabled, you can't. You can still use Speech Note with the CPU only. The majority of the functionality doesn't use GPU acceleration.

> And if you enabled it for other models, it would just crash there too?

For AMD, GPU acceleration speeds up only “Whisper STT”, “Coqui TTS” and “WhisperSpeech TTS”. With GPU acceleration enabled, the application will most likely crash when using these models. When GPU acceleration is disabled, these models will work, but slowly.

31337-4554551n commented 1 month ago

Yeah, that's unfortunate, as Whisper is the best model but takes forever compared to the others, presumably because it's running without a GPU.

Thank you so much for all your help, guess there's nothing else to be done here :)