McCloudS / subgen

Autogenerate subtitles using OpenAI Whisper Model via Jellyfin, Plex, Emby, Tautulli, or Bazarr
MIT License

subgen.py Crashing #117

Closed syue87 closed 4 days ago

syue87 commented 1 week ago

subgen.py crashes when processing certain files. I'm aware that you have a list of extensions to exclude from processing; however, the file it crashed on for me is an actual video file.

Here is a simple fix that works well for me: I created a new function that runs gen_subtitles in a separate process. If whisper crashes, it only takes down that process, not the program or the docker container. After a crash, the program keeps working as normal, since each gen_subtitles call runs in a new process.

import multiprocessing

def gen_subtitles(file_path: str, transcription_type: str, force_language=None):
    # Run the real work in a child process so a native crash cannot kill the main program
    process = multiprocessing.Process(target=gen_subtitles_process, args=(file_path, transcription_type, force_language))
    process.start()
    process.join()

# The original gen_subtitles function
def gen_subtitles_process(file_path: str, transcription_type: str, force_language=None) -> None:
    ...  # original body unchanged

McCloudS commented 1 week ago

I'd rather fix the underlying problem than mask it behind a process. If I know what the issue is, I can improve the error handling.

I've been running this personally for over a year and haven't had it crash on a single video file. Not to say that it can’t happen, but we’re approaching fringe cases.

syue87 commented 1 week ago

For me, subgen.py crashed with SIGSEGV 11 at result = model.transcribe_stable(file_path, language=force_language, task=transcription_type, **args). So whisper, or a library used by whisper, is trying to access an invalid memory address. There is no easy way to handle this error other than masking it behind a process. For me, it works for 99.9% of files, but crashing on the other 0.1% is just annoying. If you don't like the "mask it behind a process" approach, you can close this issue, and I'm totally fine with that. I can just patch the file myself.

McCloudS commented 1 week ago

Are you running it in docker? Previous SIGSEGV 11 issues were tied to unintentionally having duplicate libs loaded or, in rare cases, to running out of memory or having bad memory.

Is there anything special about the file? Is it a massive 4K with DD or something?

syue87 commented 1 week ago

Yes, it's running in a docker container on Unraid. I think the problem is the audio codec of the file. The video file is in AVI format, and the audio codec is vo1+. Even VLC doesn't support this codec, so I'm not expecting whisper to support it. I just hope it won't crash the program.

McCloudS commented 1 week ago

Yeah, vorbis files in AVI are notoriously bad and typically unsupported. What does ffprobe show, so I can catch/ignore them?

McCloudS commented 1 week ago

I have a libvorbis file that appears to work, but I would guess vor1,vo1+,vor2,vo2+,vor3,vo3+ won't.

syue87 commented 1 week ago

ffprobe version 6.1-full_build-www.gyan.dev Copyright (c) 2007-2023 the FFmpeg developers
  built with gcc 12.2.0 (Rev10, Built by MSYS2 project)
  configuration: --enable-gpl --enable-version3 --enable-static --pkg-config=pkgconf --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libaribcaption --enable-libdav1d --enable-libdavs2 --enable-libuavs3d --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libaom --enable-libjxl --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-dxva2 --enable-d3d11va --enable-libvpl --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libcodec2 --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
  libavutil      58. 29.100 / 58. 29.100
  libavcodec     60. 31.102 / 60. 31.102
  libavformat    60. 16.100 / 60. 16.100
  libavdevice    60.  3.100 / 60.  3.100
  libavfilter     9. 12.100 /  9. 12.100
  libswscale      7.  5.100 /  7.  5.100
  libswresample   4. 12.100 /  4. 12.100
  libpostproc    57.  3.100 / 57.  3.100
[avi @ 0000029da9249700] Could not find codec parameters for stream 1 (Audio: none (og[0][0] / 0x676F), 48000 Hz, 2 channels, 128 kb/s): unknown codec
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
Input #0, avi, from 'file.avi':
  Duration: 01:31:46.70, start: 0.000000, bitrate: 1303 kb/s
  Stream #0:0: Video: mpeg4 (DX50 / 0x30355844), yuv420p, 640x480 [SAR 1:1 DAR 4:3], 1164 kb/s, 30 fps, 30 tbr, 30 tbn
  Stream #0:1: Audio: none (og[0][0] / 0x676F), 48000 Hz, 2 channels, 128 kb/s
Unsupported codec with id 0 for input stream 1
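For anyone wanting to catch such files from ffprobe's text output, the following is a small sketch, not code from subgen. The helper names audio_codecs and has_usable_audio are hypothetical, and the regular expression assumes the human-readable "Stream #0:1: Audio: ..." line format shown in the output above. It treats a file as unusable when every audio stream reports the codec name none.

```python
import re

# Matches ffprobe's human-readable stream lines, e.g.
# "Stream #0:1: Audio: none (og[0][0] / 0x676F), 48000 Hz, ..."
# and also forms with a language or id suffix such as "Stream #0:1(eng): Audio: aac".
_AUDIO_LINE = re.compile(r"Stream #\d+:\d+[^:]*: Audio: ([^\s,]+)")


def audio_codecs(ffprobe_text: str) -> list[str]:
    """Return the codec name reported for each audio stream line."""
    return _AUDIO_LINE.findall(ffprobe_text)


def has_usable_audio(ffprobe_text: str) -> bool:
    """True if at least one audio stream has a codec other than 'none'."""
    return any(codec != "none" for codec in audio_codecs(ffprobe_text))
```

Run against the two Stream lines from the output above, audio_codecs would report a single audio stream whose codec is none, so has_usable_audio would reject the file.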

McCloudS commented 1 week ago

I updated the check to ignore files with 'none' as the audio. Update=True if you want to give it a shot.

def has_audio(file_path):
    try:
        if has_image_extension(file_path):
            logging.debug(f"{file_path} is an image, skipping processing")
            return False

        with av.open(file_path) as container:
            # Check for an audio stream and ensure it has a valid codec
            for stream in container.streams:
                if stream.type == 'audio':
                    # Check if the codec is supported (not 'none')
                    if stream.codec.name != 'none':
                        return True
                    else:
                        logging.debug(f"Unsupported codec for audio stream in {file_path}")
            return False

    except (av.AVError, UnicodeDecodeError):
        logging.debug(f"Error processing file {file_path}")
        return False

I can't replicate that file though, so I don't have a great way to test it.

syue87 commented 1 week ago

Launching subgen.py
INFO:root:Subgen v2024.9.8.110
INFO:root:Starting Subgen with listening webhooks!
INFO:root:Transcriptions are limited to running 1 at a time
INFO:root:Running 4 threads per transcription
INFO:root:Using cuda to encode
INFO:root:Using faster-whisper
INFO:root:Starting to search folders to see if we need to create subtitles.
Traceback (most recent call last):
  File "/subgen/subgen.py", line 1130, in <module>
    transcribe_existing(transcribe_folders)
  File "/subgen/subgen.py", line 863, in transcribe_existing
    gen_subtitles_queue(path_mapping(file_path), transcribe_or_translate, forceLanguage)
  File "/subgen/subgen.py", line 626, in gen_subtitles_queue
    if not has_audio(file_path):
  File "/subgen/subgen.py", line 822, in has_audio
    if stream.codec.name != 'none':
  File "av/audio/stream.pyx", line 15, in av.audio.stream.AudioStream.__getattr__
AttributeError: 'NoneType' object has no attribute 'codec'
Traceback (most recent call last):
  File "/subgen/launcher.py", line 168, in <module>
    main()
  File "/subgen/launcher.py", line 163, in main
    subprocess.run([f'{python_cmd}', '-u', 'subgen.py'], check=True)
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['python3', '-u', 'subgen.py']' returned non-zero exit status 1.

McCloudS commented 1 week ago

If you’re willing, try it again. Made an update to check one more thing. If this doesn’t work, I think you’re better off with your solution for now. It’s nearly impossible to test this without a file having that problem.

syue87 commented 1 week ago

It still crashes, but it looks like you are just using the wrong attribute name. The stream object does not have a codec attribute; I think you meant codec_context. On your latest version, after replacing codec with codec_context, the program no longer crashes. For your reference, this is what I get when I print dir(stream):

['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattr__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__pyx_vtable__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', 'average_rate', 'base_rate', 'codec_context', 'container', 'decode', 'display_aspect_ratio', 'duration', 'encode', 'frames', 'guessed_rate', 'id', 'index', 'language', 'metadata', 'nb_side_data', 'profile', 'sample_aspect_ratio', 'side_data', 'start_time', 'time_base', 'type']

McCloudS commented 1 week ago

Thanks for the help. I pushed it.

if stream.codec_context and stream.codec_context.name != 'none':
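To make the effect of that one-line check concrete, here is a small sketch that mirrors its logic without needing PyAV installed. The function is_decodable_audio is a hypothetical helper, not part of subgen, and the SimpleNamespace objects are stand-ins for PyAV stream objects, assuming only the type and codec_context attributes that appear in this thread.

```python
from types import SimpleNamespace


def is_decodable_audio(stream) -> bool:
    """True for an audio stream whose codec context exists and is not 'none'."""
    if getattr(stream, "type", None) != "audio":
        return False
    # Guard against a missing codec context before reading its name.
    codec_context = getattr(stream, "codec_context", None)
    return codec_context is not None and codec_context.name != "none"


# Stand-in objects (assumptions, not real PyAV streams):
good_stream = SimpleNamespace(type="audio", codec_context=SimpleNamespace(name="aac"))
unknown_codec = SimpleNamespace(type="audio", codec_context=SimpleNamespace(name="none"))
no_context = SimpleNamespace(type="audio", codec_context=None)
```

With these stand-ins, only good_stream passes the check; the other two are rejected instead of raising an exception.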