Open russell-dot-js opened 7 months ago
I can confirm that with CoreML it freezes immediately on large files, but rebuilding without CoreML does not completely solve the problem (the freeze just occurs later with large files).
But this helped me: https://github.com/ggerganov/whisper.cpp/issues/896#issuecomment-1569586018
I tried the above but still ran into some issues. As a workaround, I'm using a script to split large audio files into smaller chunks (here 1200 seconds, i.e. 20 minutes). It uses ffprobe to read the duration and ffmpeg to split the m4a file.
Usage:
chmod u+x split_audio.sh
./split_audio.sh <path to your m4a file>
After using whisper to transcribe the parts, I append the transcripts of the later parts to the first one with cat, so everything ends up in one file:
cat file2.txt file3.txt ... >> file1.txt
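With many parts, listing the files by hand gets tedious, and a glob such as `*_part*.txt` would sort part10 before part2. Below is a minimal sketch of a helper that concatenates the parts in numeric order. It assumes the transcripts are named `<base>_part1.txt`, `<base>_part2.txt`, and so on with no gaps; both that naming and the `merge_transcripts` function are hypothetical, not something whisper produces by default.

```bash
#!/bin/bash
# Hypothetical helper: concatenate <base>_part1.txt, <base>_part2.txt, ...
# into one output file, in numeric order. Stops at the first missing part.
merge_transcripts() {
    local base="$1" out="$2" i=1
    : > "$out"   # create or truncate the output file
    while [[ -f "${base}_part${i}.txt" ]]; do
        cat "${base}_part${i}.txt" >> "$out"
        i=$((i + 1))
    done
}
```

For example, `merge_transcripts recording recording_full.txt` would collect `recording_part1.txt`, `recording_part2.txt`, ... into `recording_full.txt`.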
Script:
#!/bin/bash

# Function to get the duration of the audio file in seconds
get_audio_duration() {
    local duration
    duration=$(ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 "$1")
    echo "$duration"
}

# Function to split the audio file into n-minute parts
split_audio() {
    local file_path="$1"
    local segment_duration=1200 # seconds
    local duration file_name file_ext num_parts start_time output_file i
    duration=$(get_audio_duration "$file_path")
    file_name="${file_path%.*}"
    file_ext="${file_path##*.}"

    # Round the part count up so the final, shorter segment is included
    num_parts=$(echo "$duration / $segment_duration" | bc)
    if (( $(echo "$duration % $segment_duration > 0" | bc) )); then
        num_parts=$((num_parts + 1))
    fi

    for ((i = 0; i < num_parts; i++)); do
        start_time=$(echo "$i * $segment_duration" | bc)
        output_file="${file_name}_part$((i + 1)).${file_ext}"
        ffmpeg -i "$file_path" -ss "$start_time" -t "$segment_duration" -c copy "$output_file"
    done
}

# Main script execution
if [[ $# -ne 1 ]]; then
    echo "Usage: $0 <path_to_m4a_file>"
    exit 1
fi

file_path="$1"
split_audio "$file_path"
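For reference, the part count in the script above is a ceiling division of the duration by the segment length. Here is a minimal sketch of the same arithmetic using awk instead of bc; `count_parts` is a hypothetical helper name and is not part of the script.

```bash
#!/bin/bash
# Hypothetical helper: number of segments needed to cover DURATION seconds
# with parts of SEGMENT seconds each (ceiling division, fractional input ok).
count_parts() {
    local duration="$1" segment="$2"
    awk -v d="$duration" -v s="$segment" 'BEGIN {
        n = int(d / s)          # number of full segments
        if (d > n * s) n++      # one extra part for any remainder
        print n
    }'
}
```

For example, `count_parts 3725.5 1200` prints `4`: three full 20-minute parts plus one shorter final part.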
See #612: the error seems to be prevalent when using a CoreML model. Rebuilding without CoreML resolves the issue.