ga-it closed 4 months ago
I managed to get one transcription of a smaller WAV file.
With a larger WAV file, a temp file is created (145 MB in tmp versus the 800 MB original WAV file), and the txt file I redirect the results to is created as well.
After a while the temp file and the text file disappear, but the processes continue to run.
Hi, long processing times are expected for audio files this large. The 145 MB file you see in tmp is the original 800 MB file converted to a 16 kHz WAV, since that is the format whisper supports. The converted file was deleted when tmp was cleared out automatically, but whisper had already loaded the audio into memory, so the transcription was still being worked on and should, in theory, complete.
And I don't think the text file was ever created, since the process did not complete (if you're talking about shell redirection with >).
If you need results more quickly, try faster-whisper (not a Nextcloud app). Alternatively, you could split the file into chunks and run inference on each chunk.
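The chunking suggestion above could be sketched with ffmpeg roughly as follows. This is a minimal sketch and not part of stt_whisper: split_audio and DRY_RUN are hypothetical names, ffmpeg is assumed to be installed, and 16 kHz mono 16-bit PCM is assumed as the target format. Fixed-length chunks can cut through the middle of a word, so transcripts may need some cleanup at chunk boundaries.

```shell
#!/bin/sh
# split_audio INPUT CHUNK_SECONDS OUTDIR
# Hypothetical helper: resamples INPUT to 16 kHz mono 16-bit PCM and
# writes fixed-length WAV chunks to OUTDIR using ffmpeg's segment muxer.
# With DRY_RUN set, it only prints the ffmpeg command it would run.
split_audio() {
    input=$1
    seconds=$2
    outdir=$3
    # Build the ffmpeg command as the positional parameters
    set -- ffmpeg -nostdin -i "$input" -ar 16000 -ac 1 -c:a pcm_s16le \
        -f segment -segment_time "$seconds" "$outdir/chunk_%03d.wav"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"                       # show the command, do nothing else
    else
        mkdir -p "$outdir" && "$@"      # create the output dir, then split
    fi
}
```

For example, split_audio recording.wav 600 chunks would produce ten-minute files named chunks/chunk_000.wav, chunks/chunk_001.wav, and so on, each of which can then be transcribed separately.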
This app is now deprecated in favor of https://github.com/nextcloud-releases/stt_whisper2 which is documented at https://docs.nextcloud.com/server/latest/admin_manual/ai/index.html
Trying to run stt_whisper against files from the command line
Attempting the following gives no output and continues to run at elevated memory and CPU usage until interrupted.
sudo -u www-data -E /var/www/html/occ stt_whisper:transcribe /mnt/ncdata/admin/files/Documents/audio-sample-1.mp3 -vvv
ps -ef shows:
php /var/www/html/occ stt_whisper:transcribe /mnt/ncdata/admin/files/Documents/audio-sample-1.mp3 -vvv
/var/www/html/apps/stt_whisper/lib/Service/../../bin/main-musl -m ../../models/medium -t 4 -l auto --no-timestamps /tmp/oc_tmp_EFLAoA-.wav
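To narrow down whether the occ wrapper or the whisper binary itself is stalling, one option is to run the binary by hand with the same flags that appear in the ps output. The sketch below is an assumption-laden example: whisper_direct, APP_DIR, and DRY_RUN are hypothetical names, and the binary and model paths are guessed from the relative paths in the process listing, so they may need adjusting. The input is assumed to be a WAV file already converted to 16 kHz.

```shell
#!/bin/sh
# Assumed install location, derived from the ps output; adjust as needed.
APP_DIR=${APP_DIR:-/var/www/html/apps/stt_whisper}

# whisper_direct WAV [THREADS]
# Hypothetical helper: calls the bundled whisper binary with the same
# flags seen in the process listing (-m, -t, -l auto, --no-timestamps).
# With DRY_RUN set, it only prints the command it would run.
whisper_direct() {
    wav=$1
    threads=${2:-4}     # 4 threads is the value shown in the ps output
    set -- "$APP_DIR/bin/main-musl" -m "$APP_DIR/models/medium" \
        -t "$threads" -l auto --no-timestamps "$wav"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"       # print the command instead of executing it
    else
        "$@"
    fi
}
```

Running it on a short test file first, for example whisper_direct /path/to/short-16khz.wav 8, gives a rough idea of how long the larger file is likely to take.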
stt_whisper was downloaded from GitHub and compiled within Nextcloud AIO.
Nextcloud AIO v7.4.1
Nextcloud 27.1.2
Operating System: Linux 6.5.0-1-amd64 x86_64
CPU: Intel(R) Xeon(R) CPU E5-4620 0 @ 2.20GHz (60 cores)
Memory: 245.77 GB
stt_whisper and stt_helper installed as follows:
apk add make which tar curl npm gcc nano g++
git clone https://github.com/nextcloud/stt_whisper
chown -R www-data:www-data stt_whisper
cd stt_whisper
git clone https://github.com/ggerganov/whisper.cpp.git
make
cd ..
git clone https://github.com/nextcloud/stt_helper
chown -R www-data:www-data stt_helper
cd stt_helper
make