thewh1teagle / vibe

Transcribe on your own!
https://thewh1teagle.github.io/vibe/
MIT License

Failed to get segment with ivrit model #34

Closed. Danthig closed this issue 5 months ago.

Danthig commented 6 months ago

What happened?

A bug happened!

Steps to reproduce

  1. step one...
  2. step two...

What OS are you seeing the problem on?

Windows

Relevant log output

Error in desktop\src-tauri\src\main.rs at line 59: failed to get segment

Caused by:
    Invalid UTF-8 detected in a string from Whisper. Index: 726.

App Version: 0.0.7
Arch: x86_64
Platform: windows
Kernel Version: 10.0.22631
OS: windows
OS Version: 10.0.22631
Models: ivrit-ai--whisper-large-v2-tuned-ggml-model.bin
Default Mode: "C:\\Users\\1234\\AppData\\Local\\github.com.thewh1teagle.vibe\\ivrit-ai--whisper-large-v2-tuned-ggml-model.bin"
thewh1teagle commented 6 months ago

Thanks for reporting!

I see that you're using the latest version of Vibe. Did you install it manually? Does it work for you with the regular Whisper model?

Danthig commented 6 months ago

Thank you!!! I installed it from this link: https://github.com/thewh1teagle/vibe/releases/download/v0.0.7/vibe_0.0.7_x64-setup.exe. I haven't tested another model, because the transcription is in Hebrew.

thewh1teagle commented 6 months ago

I'm working on a fix. If you can share a sample audio file that triggers the problem, that would be useful.

In the meantime, you can try Vibe's default model: open Settings and choose "open models folder", then delete all the models from there (or just move them to another folder) and restart the app. It will download the default model, which should work with almost any language.
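For anyone who prefers to script the "move them to another folder" step, here is a minimal sketch. It assumes the models are `.bin` files sitting directly in the models folder, as the log above suggests. The function name `move_models` and the backup location are hypothetical and not part of Vibe. Opening the folder from Settings and moving the files by hand does the same thing.

```python
import shutil
from pathlib import Path
from typing import List


def move_models(models_dir, backup_dir, pattern: str = "*.bin") -> List[str]:
    """Move model files out of models_dir so the app finds no models on restart.

    Hypothetical helper, not part of Vibe. Returns the names of the moved files.
    """
    models_dir = Path(models_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    # Non-recursive glob: only files directly inside the models folder.
    for model in sorted(models_dir.glob(pattern)):
        if model.is_file():
            shutil.move(str(model), str(backup_dir / model.name))
            moved.append(model.name)
    return moved


# Example call (paths are assumptions; the first one is taken from the log above):
# move_models(
#     r"C:\Users\1234\AppData\Local\github.com.thewh1teagle.vibe",
#     r"C:\Users\1234\vibe-models-backup",
# )
```

Moving rather than deleting keeps the large model file around, so it can be restored later without downloading it again.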

Danthig commented 6 months ago

It's worth noting that with a 4-minute file, the previous version 0.0.6 transcribed excellently with the Hebrew AI model. The earlier error only appeared with a 50-minute file. Could the error be caused by the length of the file? This is the file I tried to transcribe: https://www2.kolhalashon.com/#/regularSite/playShiur/37455600/1/0/false You can download it directly from here.

Thank you very much

thewh1teagle commented 6 months ago

I'll check that audio file. I don't think it's related to its length; rather, it's because the audio is in Hebrew, and Whisper libraries have problems with the encoding of Hebrew.
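As a general illustration of how an "Invalid UTF-8" error can arise with Hebrew text: each Hebrew letter takes two bytes in UTF-8, so any byte string that is cut in the middle of a letter is no longer valid on its own. The sketch below is not Vibe's or Whisper's code, and whether this is the exact cause here is an assumption. The split point and the helper names `strict_decode` and `lossy_decode` are hypothetical.

```python
from typing import Optional

# Illustration only: not Vibe's or Whisper's actual code.
text = "שלום"  # four Hebrew letters
data = text.encode("utf-8")  # each letter is encoded as 2 bytes

# Hypothetical split in the middle of the second letter.
first, second = data[:3], data[3:]


def strict_decode(chunk: bytes) -> Optional[str]:
    """Return the decoded text, or None if the chunk is not valid UTF-8 by itself."""
    try:
        return chunk.decode("utf-8")
    except UnicodeDecodeError:
        return None


def lossy_decode(chunk: bytes) -> str:
    """Decode, replacing invalid byte sequences with U+FFFD instead of failing."""
    return chunk.decode("utf-8", errors="replace")


# Joining the pieces before decoding restores the original text.
joined = (first + second).decode("utf-8")
```

Here both pieces fail strict decoding, while the joined bytes decode cleanly. Two common ways to handle this are to buffer bytes until they form complete characters, or to fall back to lossy decoding and accept a replacement character at the cut.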

Y-PLONI commented 6 months ago

It is related to the length of the file. Files of 3-4 minutes are transcribed successfully. Files of 20 minutes or longer fail; I didn't have time to check 10-minute files... @thewh1teagle

thewh1teagle commented 5 months ago

Fixed in the latest version (0.0.8). You can update from the main window in Vibe or download it from thewh1teagle.github.io/vibe. I also tested it on a 1-hour video, and it worked. Feel free to reopen the issue, and please let me know if it's fixed :)