Which notebook under "https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks" are you referring to?
Hello, I am referring to the recipes, specifically custom_ai_assistant. I'm still having issues with this solution. The speech recognition Distil-Whisper model is very inaccurate. It does not recognize simple words; it completely misses them. How do I improve its quality? I am using Meta-Llama-3-8B-Instruct from Hugging Face, downloaded into the root (custom_ai_assistant) directory.
Can you share a link to the mentioned recipes, please? And please share the steps you took, so the dev team and community can reproduce and analyze this further.
Here is the link to the recipe: https://github.com/openvinotoolkit/openvino_notebooks/tree/recipes/recipes/custom_ai_assistant
Steps to reproduce:
I got an error in step 9: PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\Users\User\AppData\Local\Temp\tmp2oxvro82\openvino_decoder_model.bin'. Despite the error, the model gets created, and the command in step 10 executes fine. However, as I mentioned before, the speech recognition "Distil-Whisper" model is very inaccurate. It does not recognize simple words; it completely misses them. I'm sending: "Hello. I am having migraines". It transcribes: "Hello. the theirinds." And then it replies: "You're experiencing the hives..."
My Python version is 3.11.9. Thank you for following up!
Hi @LeonMatch, thanks for reporting this. I transferred this issue to the OpenVINO Build & Deploy repository, as it is the new home for all AI Ref Kits. We recently identified this issue and fixed it. Could you try the latest version of the recipe from that repo?
Hi @adrianboguszewski, can you please share a link to the "OpenVINO Build & Deploy" repository? Which repo should I try again from: this one [openvino_notebooks.git] or the [openvino_build_deploy] repo you mentioned?
Thank you.
Starting from now, you should use only the Build & Deploy repo for recipes.
The link: https://github.com/openvinotoolkit/openvino_build_deploy/tree/master/ai_ref_kits/custom_ai_assistant
Thanks @adrianboguszewski. You're saying that you identified and fixed the issue, but I see the last commit in the Build & Deploy repo was only 2 weeks ago. Is this correct? Is the fix already there? Can you also elaborate on what the issue was?
The fix was committed here: https://github.com/openvinotoolkit/openvino_build_deploy/commit/bbd5411fda07694f25054ca76452fc63b6ab0db7
We found that sometimes Whisper doesn't transcribe correctly when running with AUTO, so use GPU or CPU directly :)
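For reference, a minimal sketch of what pinning the device looks like with optimum-intel, assuming the ASR model was already exported by convert_and_optimize_asr.py into a local directory (the directory name and audio file below are assumptions, not the recipe's exact values):

```python
# Hypothetical sketch, not the recipe's exact code: load the exported
# Distil-Whisper model on an explicit OpenVINO device instead of AUTO.
from transformers import AutoProcessor, pipeline
from optimum.intel import OVModelForSpeechSeq2Seq

# Assumed output directory of convert_and_optimize_asr.py
asr_model_dir = "model/distil-whisper-large-v2-int8"

processor = AutoProcessor.from_pretrained(asr_model_dir)
# "CPU" or "GPU" pins the device; AUTO is what produced the bad transcripts.
asr_model = OVModelForSpeechSeq2Seq.from_pretrained(asr_model_dir, device="CPU")

asr_pipeline = pipeline(
    "automatic-speech-recognition",
    model=asr_model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
)

# Audio file path is just a placeholder for a quick sanity check.
print(asr_pipeline("sample.wav")["text"])
```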
@adrianboguszewski,
I still get an error when running python convert_and_optimize_asr.py --precision int8
in "Model Conversion and Optimization" step of the instructions:
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\Users\User\AppData\Local\Temp\tmpwuvuq7ey\openvino_decoder_model.bin'
I checked the system and don't see openvino_decoder_model.bin being used by another process. I also deleted the temporary files, but it did not help.
Is this error critical?
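In case it helps with debugging, here is roughly how that kind of check can be scripted with psutil (an extra dependency, not part of the recipe; the path is the one from the error above):

```python
# Rough diagnostic sketch: find which process has the temp file open.
# Requires psutil (pip install psutil); it is not part of the recipe.
import psutil

target = r"C:\Users\User\AppData\Local\Temp\tmpwuvuq7ey\openvino_decoder_model.bin"

for proc in psutil.process_iter(["pid", "name"]):
    try:
        for open_file in proc.open_files():
            if open_file.path.lower() == target.lower():
                print(f"{proc.info['name']} (pid {proc.info['pid']}) holds the file")
    except (psutil.AccessDenied, psutil.NoSuchProcess):
        continue  # some system processes cannot be inspected without admin rights
```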
I was able to complete the remaining installation steps and run the app. The transcription is much more accurate now. However, inference is very slow: about 16 s to transcribe a simple phrase and 32 s to respond. Is that normal? My machine has an Intel i7 processor with 32 GB of RAM and an Nvidia Quadro GPU. I ran other models locally using Ollama, with just text-to-text inference, and it was much faster.
@AnishaUdayakumar have you ever encountered the error above during your tests?
@LeonMatch what exact CPU model do you use?
Here is a simple Colab example for converting Llama 3 to INT4: https://colab.research.google.com/drive/1WtoA3aq3lZ88rUWIFltBckXpyvy3IIpH?usp=sharing. You can download the converted model from there, and it should work ;)
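Not the exact content of that Colab, but a rough sketch of an equivalent INT4 export with optimum-intel (the model ID and output directory are assumptions; the Llama 3 weights are gated and need a Hugging Face token):

```python
# Rough sketch of an INT4 weight-compression export with optimum-intel.
# The model ID and output path are assumptions, not the Colab's exact values.
from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig
from transformers import AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # gated model; requires an HF token
output_dir = "llama-3-8b-instruct-int4"

# export=True converts the checkpoint to OpenVINO IR on the fly,
# and the quantization config compresses the weights to 4 bit.
model = OVModelForCausalLM.from_pretrained(
    model_id,
    export=True,
    quantization_config=OVWeightQuantizationConfig(bits=4),
)
model.save_pretrained(output_dir)

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.save_pretrained(output_dir)
```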
Closing this because of lack of activity. Please reopen if still needed.
Discussed in https://github.com/openvinotoolkit/openvino_notebooks/discussions/2191
When I run "Model Conversion and Optimization" step of the instructions, first command:
python convert_and_optimize_asr.py --precision int8
I get an error: PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\Users\User\AppData\Local\Temp\tmp2oxvro82\openvino_decoder_model.bin'
Despite the error, the model gets created. The second command in the "Model Conversion and Optimization" step runs fine: python convert_and_optimize_chat.py --chat_model_type llama3-8B --precision int4
However, I'm having an issue when running the app: the speech recognition "Distil-Whisper" model is very inaccurate. It does not recognize simple words; it completely misses them. How do I improve its quality? I am using Meta-Llama-3-8B-Instruct from Hugging Face, downloaded into the root (custom_ai_assistant) directory.
Thanks, Leon