McCloudS / subgen

Autogenerate subtitles using OpenAI Whisper Model via Jellyfin, Plex, Emby, Tautulli, or Bazarr
MIT License

CUDA failed with error out of memory #72

Closed zorbaTheRainy closed 3 months ago

zorbaTheRainy commented 4 months ago

Get the error:

INFO:root:Error processing or transcribing /movies/The Hollywood Revue of 1929 (1929)/The.Hollywood.Revue.of.1929.1929.DVDRip.XviD-BBM(iLC).avi: CUDA failed with error out of memory

My compose file includes - "TRANSCRIBE_DEVICE=gpu". I request that if the transcription fails with this error, it fall back to using the CPU.

Otherwise, thanks so much for this program. I started to try Whisper on my own and was not happy with the results. Thank you so much for doing all the hard work for me.

McCloudS commented 4 months ago

Do you have a GPU mapped to your container? I can try to work on checks to fall back to CPU in the near term.
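A minimal sketch of what such a fallback check could look like (this is not subgen's actual code; `load_with_fallback` and its arguments are illustrative names): try the preferred device first, and only if the failure is a CUDA out-of-memory error, retry on the CPU.

```python
# Hypothetical CPU-fallback wrapper: attempt the GPU first, and if the
# loader raises a CUDA out-of-memory RuntimeError, retry on the CPU
# instead of skipping the file entirely.

def load_with_fallback(load_model, preferred="cuda", fallback="cpu"):
    """load_model(device) should return a ready model or raise on failure."""
    try:
        return load_model(preferred), preferred
    except RuntimeError as exc:
        if "out of memory" not in str(exc).lower():
            raise  # only swallow OOM; any other error is a real bug
        return load_model(fallback), fallback
```

With faster-whisper (which the log below shows subgen uses), `load_model` would be something along the lines of `lambda device: WhisperModel("medium", device=device)`, though where exactly this hook belongs depends on subgen's internals.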


zorbaTheRainy commented 4 months ago

Oh yes.
GPU works fine for other movies.

This one is just trouble. EDIT: Actually, another movie just failed (an H.265 MKV), so I assume it is the lack of VRAM in my old GPU. The other movies were just simpler (480p AVIs).

It is only a GTX 1060 3 GB, so it isn't the best GPU, but for other movies it works fine.

It would be nice if you (1) made an ARM64 (CPU) image, and (2) made a GPU image for AMD / Intel N100 devices. But I am guessing the "fall back to CPU on failure" would be easier.

hnorgaar commented 4 months ago

I have the same card as you and have never had any failures with the medium model. Don't go higher than that.

zorbaTheRainy commented 4 months ago

I'll post my compose file:

#docker-compose.yml
version: '2'
services:
  subgen:
    container_name: subgen
    tty: true
    image: mccloud/subgen
    environment:
       - "TRANSCRIBE_DEVICE=gpu"
       - "WHISPER_MODEL=medium"
       - "WHISPER_THREADS=4"
       - "PROCADDEDMEDIA=True"
       - "PROCMEDIAONPLAY=False"
       - "NAMESUBLANG=aa"
       - "SKIPIFINTERNALSUBLANG=eng"
       # - "PLEXTOKEN=plextoken"
       # - "PLEXSERVER=http://plexserver:32400"
       # - "JELLYFINTOKEN=token here"
       # - "JELLYFINSERVER=http://jellyfin:8096"
       - "WEBHOOKPORT=9010"
       - "CONCURRENT_TRANSCRIPTIONS=1"
       - "WORD_LEVEL_HIGHLIGHT=False"
       - "DEBUG=True"
       - "USE_PATH_MAPPING=False"
       - "PATH_MAPPING_FROM=/tv"
       - "PATH_MAPPING_TO=/Volumes/TV"
       - "CLEAR_VRAM_ON_COMPLETE=True"
       - "HF_TRANSFORMERS=False"
       - "HF_BATCH_SIZE=24"
       - "MODEL_PATH=./models"
       - "UPDATE=False"
       - "APPEND=False"
       - "TRANSCRIBE_FOLDERS=/tv|/movies"
       - "MONITOR=True"
    volumes:
       - 'D:\docker\subgen\tv:/tv'
       - 'D:\docker\subgen\movies:/movies'
       - 'D:\docker\subgen\models:/subgen/models'
    ports:
       - "9010:9010"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

and the intro part of the docker log

03/26/2024 07:26:20 AM
==========
== CUDA ==
==========

CUDA Version 12.2.2

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

Environment variable UPDATE is not set or set to False, skipping download.
INFO:root:Subgen v2024.3.19.17
INFO:root:Starting Subgen with listening webhooks!
INFO:root:Transcriptions are limited to running 1 at a time
INFO:root:Running 4 threads per transcription
INFO:root:Using cuda to encode
INFO:root:Using faster-whisper
INFO:root:Starting to search folders to see if we need to create subtitles.
WARNING:libav.matroska,webm:Could not find codec parameters for stream 5 (Subtitle: hdmv_pgs_subtitle (pgssub)): unspecified size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
WARNING:libav.matroska,webm:Could not find codec parameters for stream 6 (Subtitle: hdmv_pgs_subtitle (pgssub)): unspecified size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
WARNING:libav.matroska,webm:Could not find codec parameters for stream 5 (Subtitle: hdmv_pgs_subtitle (pgssub)): unspecified size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
WARNING:libav.matroska,webm:Could not find codec parameters for stream 6 (Subtitle: hdmv_pgs_subtitle (pgssub)): unspecified size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
INFO:root:Added Z.1969.Multi.Complete.Bluray-Oldham.mkv for transcription.
INFO:root:1 files in the queue for transcription
INFO:root:Transcribing file: Z.1969.Multi.Complete.Bluray-Oldham.mkv

BTW, when you translate rather than transcribe, the Docker log does not update the percentages properly. Instead, it waits until the whole file is done and then outputs the 1%...2%...3% information all at once.

McCloudS commented 3 months ago

There is no graceful way to handle this, as we have to know the architecture before loading the model. Your best bet is using a lower model size or manually handling your OOM issues. You could also try distil-medium.en, as it has a slightly smaller memory footprint.
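For reference, trying the smaller model would be a one-line change to the WHISPER_MODEL setting in the compose file posted above (note that distil-medium.en is English-only):

```yaml
environment:
   - "TRANSCRIBE_DEVICE=gpu"
   - "WHISPER_MODEL=distil-medium.en"   # smaller memory footprint than medium
```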

zorbaTheRainy commented 3 months ago

OK, thanks for looking into it.