Open architeck opened 1 week ago
I've been facing the same issue for the last three days. I'm transcribing WAV files that are 1 to 1.5 hours long. Compared to last week, I can no longer select the model (the only option now is 'large-v3-turbo'), and the UI remains stuck on "transcribing". I checked the folders after my files had been running for 10+ hours, but there is no progress.
Ok, we released the new version 1.2 three days ago, so this is probably a bug that we did not catch in testing. I cannot reproduce it on my end right now. We will have to take a look at the new way we track progress across the new steps.
Large-v3 turbo was freshly released by OpenAI and offers a great balance between speed and accuracy. As for the other models, you now have a new menu item on the left: Model Manager. There you can download all the available Whisper models, including large-v2 and large-v3.
Could you try other models and let me know if the UI is also stuck with those?
Same here. It randomly gets stuck with files over 23 minutes long, while files under 20 minutes pose no issues. The file extension doesn't seem to matter.
Ok, I will try to reproduce it.
In the meantime, you can revert to v1.1 using the installer here: https://business-analytics.uni-graz.at/de/forschung/atrain/download/
Hello, did you release v1.2? Where can developers download the release?
Hi, we released the new 1.2 installer on the Microsoft Store last week; I have now also published a release on GitHub.
Hi @JuergenFleiss, thanks for the amazing software. Just want to report that I got similar issues using both aTrain 1.2 and aTrain core 1.2 (using this command line: --model "large-v3" --language "en" --speaker_detection --num_speakers "auto-detect" --device "GPU" --compute_type "float16"). In my case, changing the model (turbo, large-v3) did not matter. It seemed that 1.1 was working fine.
UPDATE: 1.1 crashed too. I'm investigating whether it is an issue with the MP3 file. It was a 2:13 h audio file; splitting it worked for some files. Some chunks show this error:
Command '['aTrain_core', 'transcribe', 'C:\Users\Data analysis\Downloads\segments\output_003.mp3', '--model', 'large-v3', '--language', 'en', '--device', 'GPU', '--compute_type', 'float16', '--speaker_detection', '--num_speakers', 'auto-detect']' returned non-zero exit status 3221226505.
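For reference, the exit status in the error above is easier to read in hexadecimal. On Windows, large exit codes like this are usually NTSTATUS values: 3221226505 is 0xC0000409, which Windows defines as STATUS_STACK_BUFFER_OVERRUN. That usually points to a crash in native code rather than a Python exception, though it does not say which library is at fault. A minimal sketch of the conversion follows; the helper name and the small lookup table are illustrative and not part of aTrain.

```python
# Small, illustrative subset of Windows NTSTATUS codes (not exhaustive).
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
    0xC000013A: "STATUS_CONTROL_C_EXIT",
}


def describe_exit_status(code: int) -> str:
    """Format a process exit status as an unsigned 32-bit hex value with a name."""
    # Some tools report the status as a negative signed integer,
    # so normalise it to the unsigned 32-bit representation first.
    unsigned = code & 0xFFFFFFFF
    name = KNOWN_NTSTATUS.get(unsigned, "unknown")
    return f"0x{unsigned:08X} ({name})"


print(describe_exit_status(3221226505))  # 0xC0000409 (STATUS_STACK_BUFFER_OVERRUN)
```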
Ok, that's quite strange. It might be related to faster-whisper itself; they also have reports of videos >20 min crashing: https://github.com/SYSTRAN/faster-whisper/issues/687
Could one of you share the files that crash aTrain so that we can investigate (as we are unable to reproduce the problem)? You can reach me at bandas@uni-graz.at
I'm transcribing 2+ hour long WAV files and notice that the transcription is completed rather quickly on an Nvidia 3070 using the large-v3-turbo model. I can see the completed transcript in the output folder, but the UI remains stuck on "transcribing".
The image below shows less time, but luckily I checked the folders after my long file had been running for 5+ hours without any progress.
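To narrow down which files trigger the crash before sharing them, something like the following could run each chunk separately and record the exit status. This is only a sketch: the command mirrors the aTrain_core call reported above, while the helper names and the folder passed in are placeholders, and it assumes aTrain_core is on the PATH.

```python
import subprocess
from pathlib import Path


def build_command(audio_file):
    """Build the aTrain_core call with the same options as in the report above."""
    return [
        "aTrain_core", "transcribe", str(audio_file),
        "--model", "large-v3",
        "--language", "en",
        "--device", "GPU",
        "--compute_type", "float16",
        "--speaker_detection",
        "--num_speakers", "auto-detect",
    ]


def find_failing_files(folder, run=subprocess.run):
    """Transcribe every .mp3 in `folder` and return {file name: exit status} for failures.

    `run` is injectable so the loop can be exercised without aTrain_core installed.
    """
    failures = {}
    for audio_file in sorted(Path(folder).glob("*.mp3")):
        # No check=True here: a crash should be recorded, not abort the loop.
        result = run(build_command(audio_file))
        if result.returncode != 0:
            failures[audio_file.name] = result.returncode
    return failures
```

Calling `find_failing_files("segments")` on a folder of split chunks would then return only the files that exited with a non-zero status, which are the ones worth sending.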
Edit: I'm actually noticing the stuck UI thing on smaller files as well now.
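Since the transcript can already be in the output folder while the UI still says "transcribing", one way to tell a stuck UI from a stuck transcription is to look for recently written files. A rough sketch follows, assuming you know where your aTrain output folder is; the function name is hypothetical, the folder is whatever path you pass in, and nothing here is part of aTrain.

```python
import time
from pathlib import Path


def recently_modified(folder, within_seconds=3600, now=None):
    """Return files under `folder` that were modified within the last `within_seconds`."""
    now = time.time() if now is None else now
    recent = []
    for path in Path(folder).rglob("*"):
        # Compare each file's last-modified time against the cutoff.
        if path.is_file() and now - path.stat().st_mtime <= within_seconds:
            recent.append(path)
    return sorted(recent)
```

If this returns transcript files written shortly after the job started, the transcription itself finished and only the progress display is stuck.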