Closed syue87 closed 2 months ago
I'd rather fix the underlying problem than mask it behind a process. If I know what the issue is, I can improve the error handling.
I've been running this personally for over a year and haven't had it crash on a single video file. Not to say that it can’t happen, but we’re approaching fringe cases.
For me, subgen.py crashed with SIGSEGV (signal 11) at result = model.transcribe_stable(file_path, language=force_language, task=transcription_type, **args). So whisper, or a library used by whisper, is trying to access an invalid memory address. There is no easy way to handle this error other than masking it behind a process. It works for 99.9% of my files, but crashing on the other 0.1% is just annoying. If you don't like "mask it behind a process", you can close this issue, and I'm totally fine with that. I can just patch the file myself.
Are you running it in Docker? Previous SIGSEGV 11 issues were tied to unintentionally loading duplicate libraries or, in rare cases, to running out of memory or having bad memory.
Is there anything special about the file? Is it a massive 4K with DD or something?
Yes, it's running in Docker on Unraid. I think the problem is the audio codec of the file. The video file is in AVI format, and the audio codec is vo1+. Even VLC doesn't support this codec, so I'm not expecting whisper to support it. I just hope it won't crash the program.
Yeah vorbis files in AVI are notoriously bad and typically unsupported. What's ffprobe show so I can catch them/ignore them?
I have a libvorbis file that appears to work, but I would guess vor1, vo1+, vor2, vo2+, vor3, vo3+ won't.
ffprobe version 6.1-full_build-www.gyan.dev Copyright (c) 2007-2023 the FFmpeg developers
  built with gcc 12.2.0 (Rev10, Built by MSYS2 project)
  configuration: --enable-gpl --enable-version3 --enable-static --pkg-config=pkgconf --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libaribcaption --enable-libdav1d --enable-libdavs2 --enable-libuavs3d --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libaom --enable-libjxl --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-dxva2 --enable-d3d11va --enable-libvpl --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libcodec2 --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
  libavutil 58. 29.100 / 58. 29.100
  libavcodec 60. 31.102 / 60. 31.102
  libavformat 60. 16.100 / 60. 16.100
  libavdevice 60. 3.100 / 60. 3.100
  libavfilter 9. 12.100 / 9. 12.100
  libswscale 7. 5.100 / 7. 5.100
  libswresample 4. 12.100 / 4. 12.100
  libpostproc 57. 3.100 / 57. 3.100
[avi @ 0000029da9249700] Could not find codec parameters for stream 1 (Audio: none (og[0][0] / 0x676F), 48000 Hz, 2 channels, 128 kb/s): unknown codec
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
Input #0, avi, from 'file.avi':
  Duration: 01:31:46.70, start: 0.000000, bitrate: 1303 kb/s
  Stream #0:0: Video: mpeg4 (DX50 / 0x30355844), yuv420p, 640x480 [SAR 1:1 DAR 4:3], 1164 kb/s, 30 fps, 30 tbr, 30 tbn
  Stream #0:1: Audio: none (og[0][0] / 0x676F), 48000 Hz, 2 channels, 128 kb/s
Unsupported codec with id 0 for input stream 1
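The ffprobe output above reports the audio stream's codec as `none`. One way such files could be caught before transcription is to ask ffprobe for JSON and inspect each audio stream's `codec_name`. This is only a sketch: the helper names are hypothetical, and treating a missing, `none`, or `unknown` codec name as unusable is an assumption, since how ffprobe reports an unrecognized codec in JSON may differ between versions.

```python
import json
import subprocess

# Assumption: any of these values means ffprobe could not identify the codec.
UNUSABLE_CODEC_NAMES = {None, "", "none", "unknown"}


def usable_audio_codecs(ffprobe_json: str) -> list[str]:
    """Return codec names of audio streams that ffprobe could identify."""
    data = json.loads(ffprobe_json)
    return [
        stream.get("codec_name")
        for stream in data.get("streams", [])
        if stream.get("codec_type") == "audio"
        and stream.get("codec_name") not in UNUSABLE_CODEC_NAMES
    ]


def probe_streams(path: str) -> str:
    """Run ffprobe on a file and return its stream description as JSON text."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_streams", "-of", "json", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

A caller would skip the file when `usable_audio_codecs(probe_streams(path))` comes back empty. Keeping the parsing separate from the ffprobe call makes the check easy to exercise without a problem file on hand.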
I updated the check to ignore files with 'none' as the audio. Update=True if you want to give it a shot.
def has_audio(file_path):
    try:
        if has_image_extension(file_path):
            logging.debug(f"{file_path} is an image, skipping processing")
            return False
        with av.open(file_path) as container:
            # Check for an audio stream and ensure it has a valid codec
            for stream in container.streams:
                if stream.type == 'audio':
                    # Check if the codec is supported (not 'none')
                    if stream.codec.name != 'none':
                        return True
                    else:
                        logging.debug(f"Unsupported codec for audio stream in {file_path}")
                        return False
    except (av.AVError, UnicodeDecodeError):
        logging.debug(f"Error processing file {file_path}")
        return False
I can't replicate that file though, so I don't have a great way to test it.
Launching subgen.py
INFO:root:Subgen v2024.9.8.110
INFO:root:Starting Subgen with listening webhooks!
INFO:root:Transcriptions are limited to running 1 at a time
INFO:root:Running 4 threads per transcription
INFO:root:Using cuda to encode
INFO:root:Using faster-whisper
INFO:root:Starting to search folders to see if we need to create subtitles.
Traceback (most recent call last):
File "/subgen/subgen.py", line 1130, in
If you’re willing, try it again. Made an update to check one more thing. If this doesn’t work, I think you’re better off with your solution for now. It’s nearly impossible to test this without a file having that problem.
It still crashes, but it looks like you are just using the wrong attribute name. The stream object does not have a codec attribute; I think you meant to use codec_context. On your latest version, after replacing codec with codec_context, the program no longer crashes. For your reference, when I print dir(stream), this is what I get: ['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattr__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__pyx_vtable__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', 'average_rate', 'base_rate', 'codec_context', 'container', 'decode', 'display_aspect_ratio', 'duration', 'encode', 'frames', 'guessed_rate', 'id', 'index', 'language', 'metadata', 'nb_side_data', 'profile', 'sample_aspect_ratio', 'side_data', 'start_time', 'time_base', 'type']
Thanks for the help. I pushed it.
if stream.codec_context and stream.codec_context.name != 'none':
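Since the attribute available on a stream turned out to differ from what the check expected, a more defensive variant might look attributes up with `getattr` and consider every audio stream rather than stopping at the first one. The sketch below is illustrative only: the function names are hypothetical, and the `SimpleNamespace` objects are stand-ins for PyAV streams, not real ones.

```python
from types import SimpleNamespace


def is_decodable_audio(stream) -> bool:
    """True if the stream is audio and its codec context names a real codec."""
    if getattr(stream, "type", None) != "audio":
        return False
    # A missing codec_context is treated like an unknown codec.
    ctx = getattr(stream, "codec_context", None)
    return ctx is not None and getattr(ctx, "name", "none") != "none"


def any_decodable_audio(streams) -> bool:
    """True if at least one stream in the iterable is decodable audio."""
    return any(is_decodable_audio(s) for s in streams)


# Stand-in streams for illustration (assumed shapes, not PyAV objects).
video = SimpleNamespace(type="video", codec_context=SimpleNamespace(name="mpeg4"))
broken_audio = SimpleNamespace(type="audio", codec_context=SimpleNamespace(name="none"))
good_audio = SimpleNamespace(type="audio", codec_context=SimpleNamespace(name="aac"))
```

With these stand-ins, a file containing only `video` and `broken_audio` would be skipped, while one that also carries `good_audio` would still be processed.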
subgen.py crashes when processing certain files. I'm aware that you have a list of extensions to exclude from processing; however, the file it crashed on for me is an actual video file.
Here is a simple fix that works well for me: I created a new function that runs gen_subtitles in a separate process. If whisper crashes, it only takes down that process, not the program or the Docker container. After such a crash the program keeps working as normal, since each gen_subtitles call runs in a new process.
import multiprocessing

def gen_subtitles(file_path: str, transcription_type: str, force_language=None):
    # Run the real work in a child process so a crash cannot take down the parent
    process = multiprocessing.Process(target=gen_subtitles_process, args=(file_path, transcription_type, force_language))
    process.start()
    process.join()

# The original gen_subtitles function
def gen_subtitles_process(file_path: str, transcription_type: str, force_language=None) -> None:
    ...  # original function body goes here, unchanged