intel / openvino-plugins-ai-audacity

A set of AI-enabled effects, generators, and analyzers for Audacity®.
GNU General Public License v3.0

Can't use Whisper or separation effects #135

Status: Closed (closed by mk-360 5 months ago)

mk-360 commented 5 months ago

Hi. First of all, please excuse my bad English. I used OpenVINO with the previous version of Audacity without trouble, but now I'm running 3.5.0 and it fails most of the time. I also tried a portable copy of the previous version, and it failed too. As a demonstration, I loaded a WAV file called "Sesión Nº3 prof. Carlos Pizarro - Zoom.wav", which is about 443 MB with a duration of 02:01:04.480. I select all, go to Analyze > Whisper, select the base model, choose CPU as the device and Spanish as the source language. When I click Apply, it shows this message: "Whisper Transcription failed. See details in Help->Diagnostics->Show Log..."

The log says:

23:24:46: Audacity 3.5.0
23:24:46: Error: Failed to load shared library 'avformat-60.dll' (error 126: No se puede encontrar el módulo especificado.)
23:24:46: FFmpeg libraries loaded successfully from: C:\Program Files\FFmpeg For Audacity\avformat-59.dll
23:24:48: Error: Failed to load shared library 'avformat-60.dll' (error 126: No se puede encontrar el módulo especificado.)
23:24:48: FFmpeg libraries loaded successfully from: C:\Program Files\FFmpeg For Audacity\avformat-59.dll
23:24:51: File name is D:\Documents\Derecho\Diplomado Construccion\Clase_No3_prof__Carlos_Pizarro\Sesión Nº3 prof. Carlos Pizarro - Zoom.wav
23:24:51: Mime type is *
23:24:51: Opening with libsndfile
23:24:51: Open(D:\Documents\Derecho\Diplomado Construccion\Clase_No3_prof__Carlos_Pizarro\Sesión Nº3 prof. Carlos Pizarro - Zoom.wav) succeeded
23:24:54: Operation 'Importando WAV (Microsoft)' took 2,128000 seconds. Poll was called 888 times and took 0,198368 seconds. Yield was called 32 times and took 0,034488 seconds.
23:24:54: Operation 'Recuperación de información musical' took 0,000000 seconds. Poll was called 0 times and took 0,000000 seconds. Yield was called 0 times and took 0,000000 seconds.
23:25:11: Operation 'Preprocesamiento' took 0,000000 seconds. Poll was called 0 times and took 0,000000 seconds. Yield was called 0 times and took 0,000000 seconds.
23:25:16: Error: In Whisper Transcription Effect, exception: whisper.cpp context creation / initialization failed
23:25:27: Operation 'OpenVINO Whisper Transcription' took 4,649000 seconds. Poll was called 3551 times and took 0,323108 seconds. Yield was called 62 times and took 0,055880 seconds.
23:25:16: Error: whisper_ctx_init_openvino_encoder failed for device = GPU

The device details are:
CPU = Intel(R) Core(TM) i7-10700KF CPU @ 3.80GHz
GPU = NVIDIA GeForce GTX 1650 SUPER (dGPU)

Note that if I try to use CPU it fails too. If you need more info ask me.
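When a log like the one above arrives as a single run of text, it can help to split it back into entries and pull out only the error lines. The following is a minimal sketch, assuming every entry begins with an `HH:MM:SS: ` timestamp as in the pasted log; the function names `split_log` and `errors` are hypothetical helpers for illustration and are not part of Audacity or the OpenVINO plugins.

```python
import re

# Every entry in the pasted log starts with "HH:MM:SS: ".
# The lookbehind avoids matching in the middle of a longer digit run.
TIMESTAMP = re.compile(r"(?<!\d)(\d{2}:\d{2}:\d{2}): ")


def split_log(text):
    """Split a flattened log string into (timestamp, message) pairs."""
    matches = list(TIMESTAMP.finditer(text))
    entries = []
    for i, match in enumerate(matches):
        # A message runs until the next timestamp, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        entries.append((match.group(1), text[match.end():end].strip()))
    return entries


def errors(entries):
    """Keep only the entries whose message starts with 'Error:'."""
    return [(ts, msg) for ts, msg in entries if msg.startswith("Error:")]


if __name__ == "__main__":
    # Shortened excerpt of the log quoted above.
    sample = (
        "23:24:46: Audacity 3.5.0 "
        "23:24:46: Error: Failed to load shared library 'avformat-60.dll' "
        "(error 126: No se puede encontrar el módulo especificado.) "
        "23:25:16: Error: whisper_ctx_init_openvino_encoder failed for device = GPU"
    )
    for ts, msg in errors(split_log(sample)):
        print(ts, msg)
```

Run on the full log, this would list the `avformat-60.dll` load failures alongside the Whisper errors, so the two kinds of error still need to be told apart by reading the messages.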

Giovani93 commented 5 months ago

I had a similar problem, but after setting the device to CPU it works.

RyanMetcalfeInt8 commented 5 months ago

Hi @mk-360,

Can you post the failure log for when you have device set to CPU?

Thanks, Ryan

mk-360 commented 5 months ago

Yes, here is the log with the same file:

20:29:01: Audacity 3.5.0
20:29:01: Error: Failed to load shared library 'avformat-60.dll' (error 126: No se puede encontrar el módulo especificado.)
20:29:01: FFmpeg libraries loaded successfully from: C:\Program Files\FFmpeg For Audacity\avformat-59.dll
20:29:03: Error: Failed to load shared library 'avformat-60.dll' (error 126: No se puede encontrar el módulo especificado.)
20:29:03: FFmpeg libraries loaded successfully from: C:\Program Files\FFmpeg For Audacity\avformat-59.dll
20:29:03: File name is D:\Documents\Derecho\Diplomado Construccion\Clase_No3_prof__Carlos_Pizarro\3.wav
20:29:03: Mime type is *
20:29:03: Opening with libsndfile
20:29:03: Open(D:\Documents\Derecho\Diplomado Construccion\Clase_No3_prof__Carlos_Pizarro\3.wav) succeeded
20:29:06: Operation 'Importando WAV (Microsoft)' took 2,145000 seconds. Poll was called 888 times and took 0,242104 seconds. Yield was called 31 times and took 0,085677 seconds.
20:29:06: Operation 'Recuperación de información musical' took 0,000000 seconds. Poll was called 0 times and took 0,000000 seconds. Yield was called 0 times and took 0,000000 seconds.
20:29:37: Operation 'Preprocesamiento' took 0,000000 seconds. Poll was called 0 times and took 0,000000 seconds. Yield was called 0 times and took 0,000000 seconds.
20:29:41: Error: In Whisper Transcription Effect, exception: whisper.cpp context creation / initialization failed
20:29:43: Operation 'OpenVINO Whisper Transcription' took 4,268000 seconds. Poll was called 3550 times and took 0,366527 seconds. Yield was called 63 times and took 0,085294 seconds.
20:29:41: Error: whisper_init_from_file_with_params(C:\Program Files\Audacity\openvino-models\ggml-base.bin, ...) failed
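The last error in this log names the model file that whisper.cpp tried to load. One quick first check is whether that file exists, is non-empty, and can be read. This is only one possible cause among many, and nothing in the thread confirms it was the cause here. The sketch below assumes the path shown in the log; `check_model_file` is a hypothetical helper and not part of the plugin.

```python
from pathlib import Path


def check_model_file(path):
    """Return a list of problems found with the file (empty if none found)."""
    p = Path(path)
    problems = []
    if not p.is_file():
        problems.append(f"not found: {p}")
        return problems
    if p.stat().st_size == 0:
        problems.append(f"file is empty: {p}")
    try:
        # Reading a few bytes confirms the file can be opened by this user.
        with p.open("rb") as f:
            f.read(4)
    except OSError as exc:
        problems.append(f"cannot read {p}: {exc}")
    return problems


if __name__ == "__main__":
    # Path copied from the log above; adjust it for your own installation.
    model = r"C:\Program Files\Audacity\openvino-models\ggml-base.bin"
    found = check_model_file(model)
    for line in found or ["no obvious problem with the file itself"]:
        print(line)
```

An empty result only rules out the simplest file-level problems; it says nothing about whether the model contents are valid.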

mk-360 commented 5 months ago

OK, I have now tried Audacity 3.5.1 with the corresponding version of the OpenVINO plugins, and the problem persists. Here is the log, using CPU:

19:09:20: Audacity 3.5.1
19:09:20: Error: Failed to load shared library 'avformat-60.dll' (error 126: No se puede encontrar el módulo especificado.)
19:09:20: FFmpeg libraries loaded successfully from: C:\Program Files\FFmpeg For Audacity\avformat-59.dll
19:09:24: Error: Failed to load shared library 'avformat-60.dll' (error 126: No se puede encontrar el módulo especificado.)
19:09:24: FFmpeg libraries loaded successfully from: C:\Program Files\FFmpeg For Audacity\avformat-59.dll
19:09:52: File name is D:\Documents\Derecho\Diplomado Construccion\Clase_No3_prof__Carlos_Pizarro\3.wav
19:09:52: Mime type is *
19:09:52: Opening with libsndfile
19:09:52: Open(D:\Documents\Derecho\Diplomado Construccion\Clase_No3_prof__Carlos_Pizarro\3.wav) succeeded
19:09:55: Operation 'Importando WAV (Microsoft)' took 3,566000 seconds. Poll was called 888 times and took 0,273124 seconds. Yield was called 61 times and took 0,056741 seconds.
19:09:56: Operation 'Recuperación de información musical' took 0,000000 seconds. Poll was called 0 times and took 0,000000 seconds. Yield was called 0 times and took 0,000000 seconds.
19:18:15: Operation 'Preprocesamiento' took 0,000000 seconds. Poll was called 0 times and took 0,000000 seconds. Yield was called 0 times and took 0,000000 seconds.
19:18:21: Error: In Whisper Transcription Effect, exception: invalid map<K, T> key
19:18:25: Operation 'OpenVINO Whisper Transcription' took 6,412000 seconds. Poll was called 3554 times and took 0,403608 seconds. Yield was called 66 times and took 0,096114 seconds.

mk-360 commented 5 months ago

I don't know why the issue was closed; I probably pressed the wrong button. However, the problem continues, as I said in the previous comment.

RyanMetcalfeInt8 commented 5 months ago

Hi @mk-360,

Can you try the latest release?

https://github.com/intel/openvino-plugins-ai-audacity/releases/tag/v3.5.1-R2.1

This has a fix for the "19:18:21: Error: In Whisper Transcription Effect, exception: invalid map<K, T> key" issue that you are experiencing.

Thanks, Ryan

mk-360 commented 5 months ago

Hi, I'm testing the new release using GPU, and although the process hasn't finished yet, it is running without trouble. I'll test with CPU too and then close this. Thanks for your help; OpenVINO is really impressive.

mk-360 commented 5 months ago

OK, everything is working fine now, with both CPU and GPU. Many thanks!

DataJuggler commented 2 months ago

I selected GPU and got these errors today. Should I uninstall, reinstall, and select CPU?

11:27:33: Audacity 3.5.1
11:27:33: Error: Failed to load shared library 'avformat-60.dll' (error 126: The specified module could not be found.)
11:27:33: Error: Failed to load shared library 'avformat-59.dll' (error 126: The specified module could not be found.)
11:27:33: Error: Failed to load shared library 'avformat-58.dll' (error 126: The specified module could not be found.)
11:27:33: Error: Failed to load shared library 'avformat-57.dll' (error 126: The specified module could not be found.)
11:27:33: Error: Failed to load shared library 'avformat-55.dll' (error 126: The specified module could not be found.)
11:27:35: Error: Failed to load shared library 'avformat-60.dll' (error 126: The specified module could not be found.)
11:27:35: Error: Failed to load shared library 'avformat-59.dll' (error 126: The specified module could not be found.)
11:27:35: Error: Failed to load shared library 'avformat-58.dll' (error 126: The specified module could not be found.)
11:27:35: Error: Failed to load shared library 'avformat-57.dll' (error 126: The specified module could not be found.)
11:27:35: Error: Failed to load shared library 'avformat-55.dll' (error 126: The specified module could not be found.)
11:27:53: File name is C:\Temp\LetsOurHaleOurSeniorPresident.wav
11:27:53: Mime type is *
11:27:53: Opening with libsndfile
11:27:53: Open(C:\Temp\LetsOurHaleOurSeniorPresident.wav) succeeded
11:27:53: Operation 'Importing WAV (Microsoft)' took 0.028000 seconds. Poll was called 7 times and took 0.000005 seconds. Yield was called 0 times and took 0.000000 seconds.
11:27:53: Operation 'Music Information Retrieval' took 0.000000 seconds. Poll was called 0 times and took 0.000000 seconds. Yield was called 0 times and took 0.000000 seconds.
11:32:03: Operation 'Pre-processing' took 0.000000 seconds. Poll was called 0 times and took 0.000000 seconds. Yield was called 0 times and took 0.000000 seconds.
11:32:04: Error: In Music Separation, exception: Exception from src\inference\src\cpp\core.cpp:126: Exception from src\inference\src\dev\plugin.cpp:54: Check 'false' failed at src\plugins\intel_gpu\src\plugin\program_builder.cpp:179: [GPU] ProgramBuilder build failed! Exception from src\plugins\intel_gpu\src\graph\include\primitive_type_base.h:58: [GPU] Can't choose implementation for convert:crosstransformer.layers.2.self_attn.out_proj.weight node (type=reorder) [GPU] Original name: crosstransformer.layers.2.self_attn.out_proj.weight [GPU] Original type: Convert [GPU] Reason: Check '!kernels.empty()' failed at src\plugins\intel_gpu\src\kernel_selector\kernel_selector.cpp:70: [GPU] Couldn't find a suitable kernel for convert:crosstransformer.layers.2.self_attn.out_proj.weight params raw string: F16_BFYX_v1_p0_0_v1_p0_0_v512_p0_0_v512_p0_0;F32_BFYX_v1_p0_0_v1_p0_0_v512_p0_0_v512_p0_0
11:32:11: Operation 'OpenVINO Music Separation' took 1.348000 seconds. Poll was called 4 times and took 0.024128 seconds. Yield was called 3 times and took 0.019667 seconds.

DataJuggler commented 2 months ago

I solved my issue by installing FFmpeg for Audacity. I had FFmpeg installed for use in C# projects, but I didn't have the Audacity version installed. Perhaps a message during installation or use, informing the user when FFmpeg for Audacity is not installed, would be helpful.

Once it was installed, it worked perfectly, and fast.

RyanMetcalfeInt8 commented 2 months ago

Hmm, our plugins don't have any dependencies on FFmpeg -- I don't think it's possible that installing it would have fixed those GPU errors that I see in your log. Perhaps you ran it with CPU and it worked?

DataJuggler commented 2 months ago

One of the people above had this line in their log file:

19:09:24: FFmpeg libraries loaded successfully from: C:\Program Files\FFmpeg For Audacity\avformat-59.dll

I read that, realized I didn't have that program installed, installed it, and it worked. So yes, I believe it does matter, or at least it solved my issue.