McCloudS / subgen

Autogenerate subtitles using OpenAI Whisper Model via Jellyfin, Plex, Emby, Tautulli, or Bazarr
MIT License

DEBUG: Model was purged, need to re-create & Feature request #97

Closed ZhengYuan-Public closed 1 month ago

ZhengYuan-Public commented 1 month ago

Probably a silly question: I think the problem is the message Model was purged, need to re-create. I've downloaded the safetensors from Hugging Face and placed them in the models folder of my clone of this repo, but it's not working. What's the proper/recommended way to download the trained models?

Here is my subgen.env:

TRANSCRIBE_DEVICE='gpu'
WHISPER_MODEL='medium'
CONCURRENT_TRANSCRIPTIONS=2
WHISPER_THREADS=4
MODEL_PATH='./models'
PROCADDEDMEDIA=True
PROCMEDIAONPLAY=True
NAMESUBLANG='aa'
SKIPIFINTERNALSUBLANG=''
WORD_LEVEL_HIGHLIGHT=False
PLEXSERVER=http://192.168.1.158:32400
PLEXTOKEN='**********************************'
JELLYFINSERVER=http://192.168.1.158:8096
JELLYFINTOKEN='************************************************'
WEBHOOKPORT=9000
USE_PATH_MAPPING=True
PATH_MAPPING_FROM='/media'
PATH_MAPPING_TO='/share'
TRANSCRIBE_FOLDERS='/share/Videos | /share/Music'
TRANSCRIBE_OR_TRANSLATE='transcribe'
COMPUTE_TYPE='auto'
DEBUG=True
FORCE_DETECTED_LANGUAGE_TO=''
CLEAR_VRAM_ON_COMPLETE=True
UPDATE=False
APPEND=False
MONITOR=True
USE_MODEL_PROMPT=False
CUSTOM_MODEL_PROMPT=''
LRC_FOR_AUDIO_FILES=True
CUSTOM_REGROUP='cm_sl=84_sl=42++++++1'
DETECT_LANGUAGE_LENGTH=300

I pulled the repo, ran pip install -r requirements.txt, and here is the log after running python3 launcher.py:

2024-06-07 12:35:14,081 DEBUG: Plex event detected is: media.play
2024-06-07 12:35:14,088 DEBUG: Path of file: /media/Videos/__PT/PT_Movies/DouBan.2021.11.11.Top.250.BluRay.1080p.x265.10bit.MNHD-FRDS/超能陆战队.Big.Hero.6.2014.BluRay.1080p.x265.10bit.4Audio.MNHD-FRDS/Big.Hero.6.2014.BluRay.1080p.x265.10bit.4Audio.MNHD-FRDS.mkv
2024-06-07 12:35:14,088 DEBUG: Updated path: /share/Videos/__PT/PT_Movies/DouBan.2021.11.11.Top.250.BluRay.1080p.x265.10bit.MNHD-FRDS/超能陆战队.Big.Hero.6.2014.BluRay.1080p.x265.10bit.4Audio.MNHD-FRDS/Big.Hero.6.2014.BluRay.1080p.x265.10bit.4Audio.MNHD-FRDS.mkv
2024-06-07 12:35:14,213 DEBUG: No subtitles in '' language found in the video.
2024-06-07 12:35:14,487 INFO: Added Big.Hero.6.2014.BluRay.1080p.x265.10bit.4Audio.MNHD-FRDS.mkv for transcription.
2024-06-07 12:35:14,487 INFO: Transcribing file: Big.Hero.6.2014.BluRay.1080p.x265.10bit.4Audio.MNHD-FRDS.mkv
2024-06-07 12:35:14,487 DEBUG: Model was purged, need to re-create
2024-06-07 12:35:14,495 INFO: Metadata refresh initiated successfully.
2024-06-07 12:35:14,495 INFO: Metadata for item 1791 refreshed successfully.
INFO:     172.17.0.4:44244 - "POST /plex HTTP/1.1" 200 OK

Meanwhile, I also want to ask:

1. How do I transcribe all videos inside a folder? Right now I think it will only start transcribing when a video is played or added into the watched folder.
2. Is it possible to specify more language(s) and save them separately?

Update: using the docker-compose version

==========
== CUDA ==
==========

CUDA Version 12.2.2

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

subgen.env file not found. Please run prompt_and_save_env_variables() first.
subgen.py exists and UPDATE is set to False, skipping download.
Launching subgen.py
File subgen.env not found. Environment variables not set.
File subgen.env not found. Environment variables not set.
INFO:root:Subgen v2024.5.7.76
INFO:root:Starting Subgen with listening webhooks!
INFO:root:Transcriptions are limited to running 2 at a time
INFO:root:Running 4 threads per transcription
INFO:root:Using cuda to encode
INFO:root:Using faster-whisper
INFO:root:Starting to search folders to see if we need to create subtitles.
INFO:root:Finished searching and queueing files for transcription. Now watching for new files.
INFO:     Started server process [26]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:9000 (Press CTRL+C to quit)
INFO:     172.17.0.1:45868 - "POST /plex HTTP/1.1" 200 OK
INFO:     172.17.0.1:45868 - "POST /plex HTTP/1.1" 200 OK
INFO:     172.17.0.1:54460 - "POST /plex HTTP/1.1" 200 OK
INFO:root:Added 汉武大帝.E01.2004.DVDRip.x264.AC3-CMCT.mkv for transcription.
INFO:root:Transcribing file: 汉武大帝.E01.2004.DVDRip.x264.AC3-CMCT.mkv
INFO:root:Metadata refresh initiated successfully.
INFO:root:Metadata for item 3756 refreshed successfully.
INFO:     172.17.0.1:54460 - "POST /plex HTTP/1.1" 200 OK
WARNING:faster_whisper:An error occured while synchronizing the model Systran/faster-whisper-medium from the Hugging Face Hub:
An error happened while trying to locate the files on the Hub and we cannot find the appropriate snapshot folder for the specified revision on the local disk. Please check your internet connection and try again.
WARNING:faster_whisper:Trying to load the model directly from the local cache, if it exists.
INFO:root:Error processing or transcribing /media/Videos/__PT/PT_TVSeries/汉武大帝.2004.58集全.国语.简繁中字£CMCT小鱼/汉武大帝.E01.2004.DVDRip.x264.AC3-CMCT.mkv: Cannot find an appropriate cached snapshot folder for the specified revision on the local disk and outgoing traffic has been disabled. To enable repo look-ups and downloads online, pass 'local_files_only=False' as input.
McCloudS commented 1 month ago

The model being purged is subgen freeing up your RAM or VRAM between transcriptions; see CLEAR_VRAM_ON_COMPLETE for more info. You do not need to manually download any model or place it anywhere; that is managed automatically based on the model you configure under WHISPER_MODEL.

> how to transcribe all videos inside a folder (right now I think it will only start transcribing when a video is played or added into the watched folder)?

Use TRANSCRIBE_FOLDERS, which runs on script startup, or use the /batch endpoint from http://subgenip:9000/docs to manually choose a folder/files.
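
For example, a hypothetical invocation (the parameter name directory is an assumption; the endpoint's real signature is listed on the /docs page):

$ curl -X POST "http://subgenip:9000/batch?directory=/share/Videos"   # 'directory' is an assumed parameter name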

> Is it possible to specify more language(s) and save them separately?

This might be a feature in future releases. The model is only trained to translate into English or to transcribe into its own language. Right now the only way this is configured is by TRANSCRIBE_OR_TRANSLATE.

Your error is clearly related to the model storage/path. Your best bet might be deleting the ./models folder and starting from scratch.
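
In other words, something like the following (subgen re-downloads the configured WHISPER_MODEL automatically on the next run):

$ rm -rf ./models
$ python3 launcher.py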

ZhengYuan-Public commented 1 month ago

> The model being purged is subgen freeing up your RAM or VRAM between transcriptions; see CLEAR_VRAM_ON_COMPLETE for more info. You do not need to manually download any model or place it anywhere; that is managed automatically based on the model you configure under WHISPER_MODEL.
>
> Use TRANSCRIBE_FOLDERS, which runs on script startup, or use the /batch endpoint from http://subgenip:9000/docs to manually choose a folder/files.
>
> This might be a feature in future releases. The model is only trained to translate into English or to transcribe into its own language. Right now the only way this is configured is by TRANSCRIBE_OR_TRANSLATE.
>
> Your error is clearly related to the model storage/path. Your best bet might be deleting the ./models folder and starting from scratch.

Thanks for your reply.

> You do not need to manually download any model or place it anywhere; that is managed automatically based on the model you configure under WHISPER_MODEL.

That can be a problem in China due to internet issues... it happens a lot, and I might have to download the model manually. I wonder if you could post the result of ls -al on the model folder, so I can see its general structure (naming, etc.).

I've been trying to read the code and find posts on GitHub. So far I've noticed your code uses stable-ts, and an issue page here points out that the model is downloaded from the openai/whisper repo. I've tried to download the model manually and place it in the /models folder, but it seems it still can't be picked up by the code.

McCloudS commented 1 month ago

The models are actually CTranslate2 quantized models at https://huggingface.co/Systran

Again, you shouldn't be downloading them manually; let stable-ts and faster-whisper manage it. They expect a particular folder and file structure.

ZhengYuan-Public commented 1 month ago

After trying hard, I finally managed to download the model, and after solving some additional problems it finally works! Although downloading the model manually is not recommended, as @McCloudS posted, here is what I've found:

The models folder structure

  1. medium model automatically downloaded

    (base) zheng@ubuntu-server-vm:~/Application/subgen/models$ tree -a
    .
    ├── config.json
    ├── .locks
    │   └── models--Systran--faster-whisper-medium
    │       ├── 242aa06a480a7b5509375c645097e87af5136774.lock
    │       ├── 7818adb6de9fa3064d3ff81226fdd675be1f6344.lock
    │       ├── 9b45e1009dcc4ab601eff815b61d80e60ce3fd8c74c1a14f4a282258286b51ae.lock
    │       └── c9074644d9d1205686f16d411564729461324b75.lock
    ├── model.bin
    ├── README.md
    ├── tokenizer.json
    └── vocabulary.txt
  2. Systran/faster-whisper-medium downloaded with huggingface-cli

    (base) zheng@ubuntu-server-vm:~/Application/subgen/models/models--Systran--faster-whisper-medium$ tree
    .
    ├── blobs
    │   ├── 242aa06a480a7b5509375c645097e87af5136774
    │   ├── 7818adb6de9fa3064d3ff81226fdd675be1f6344
    │   ├── 9b45e1009dcc4ab601eff815b61d80e60ce3fd8c74c1a14f4a282258286b51ae
    │   └── c9074644d9d1205686f16d411564729461324b75
    ├── refs
    │   └── main
    └── snapshots
        └── 08e178d48790749d25932bbc082711ddcfdfbc4f
        ├── config.json -> ../../blobs/242aa06a480a7b5509375c645097e87af5136774
        ├── model.bin -> ../../blobs/9b45e1009dcc4ab601eff815b61d80e60ce3fd8c74c1a14f4a282258286b51ae
        ├── tokenizer.json -> ../../blobs/7818adb6de9fa3064d3ff81226fdd675be1f6344
        └── vocabulary.txt -> ../../blobs/c9074644d9d1205686f16d411564729461324b75
  3. The "wrong" method

The automatic method is, to some degree, also "wrong", because the old model gets overwritten every time you switch between models.

Therefore, if you need to download those models manually, install huggingface-cli and download models with:

$ huggingface-cli download --resume-download --local-dir ./models Systran/faster-whisper-medium
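
If a recent huggingface_hub is installed, downloading with --cache-dir instead of --local-dir should (untested here) reproduce the blobs/refs/snapshots cache layout from item 2 directly, sidestepping the symlink issue described below:

$ huggingface-cli download --cache-dir ./models Systran/faster-whisper-medium
# creates ./models/models--Systran--faster-whisper-medium/{blobs,refs,snapshots}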

The files this puts into the snapshots folder are symbolic links with relative targets (see the -> arrows in the tree above), so they break once copied elsewhere. Here is a relatively simple fix: convert the symbolic links inside snapshots/08e178d48790749d25932bbc082711ddcfdfbc4f from relative to absolute paths. Here is a script generated by ChatGPT (NOT TESTED):

#!/bin/bash

# Function to convert a single symbolic link to use an absolute path
convert_symlink() {
    local symlink=$1
    # Check if the file is a symbolic link
    if [ -L "$symlink" ]; then
        # Resolve the symlink to the absolute, canonical path of its target
        local abs_target=$(realpath -m "$symlink")
        # Rename the existing symbolic link by appending "_backup"
        mv "$symlink" "${symlink}_backup"
        # Create a new symbolic link with the absolute path
        ln -s "$abs_target" "$symlink"
        echo "Converted $symlink to use absolute path: $abs_target"
    else
        echo "$symlink is not a symbolic link."
    fi
}

# Check if a directory argument is provided
if [ -z "$1" ]; then
    echo "Usage: $0 <directory>"
    exit 1
fi

# Get the directory to search for symbolic links
directory=$1

# Find all symbolic links in the specified directory and its subdirectories
find "$directory" -type l | while read -r symlink; do
    convert_symlink "$symlink"
done
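
Usage, assuming the script above is saved as convert_symlinks.sh (the filename is arbitrary):

$ chmod +x convert_symlinks.sh
$ ./convert_symlinks.sh ./models/models--Systran--faster-whisper-medium/snapshots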

After converting the symbolic links to absolute paths (and leaving the source folder in place so the links remain valid), you can just copy them to the model folder and they should work...

Finally, done 😿

Some additional tips

Accelerate model downloads in China

I'm using HF-Mirror, which can be enabled with export HF_ENDPOINT=https://hf-mirror.com
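
To make the mirror setting persist across sessions, it can be appended to ~/.bashrc, the same pattern used for LD_LIBRARY_PATH below:

$ echo 'export HF_ENDPOINT=https://hf-mirror.com' >> ~/.bashrc
$ source ~/.bashrc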

Install cuDNN on Ubuntu 24.04 LTS

I encountered the error Could not load library libcudnn_ops_infer.so.8. cuDNN can be installed on Ubuntu 24.04 LTS using the same process as in the cuDNN doc for Ubuntu 22.04 LTS. If you installed it in a virtual environment, you also need to put its lib folder on the library path:

# Find where the lib folder is located
(base) zheng@ubuntu-server-vm:~/Application/subgen$ sudo find / -type f -name "libcudnn_ops_infer*"
/home/zheng/miniconda3/lib/python3.12/site-packages/nvidia/cudnn/lib/libcudnn_ops_infer.so.8

# For bash
$ echo "export LD_LIBRARY_PATH=/home/zheng/miniconda3/lib/python3.12/site-packages/nvidia/cudnn/lib:$LD_LIBRARY_PATH" >> ~/.bashrc
$ source ~/.bashrc

# Verify
$ echo $LD_LIBRARY_PATH
/home/zheng/miniconda3/lib/python3.12/site-packages/nvidia/cudnn/lib:

Reporting some running stats

Logs

(base) zheng@ubuntu-server-vm:~/Application/subgen$ python3 launcher.py 
Environment variables have been loaded from subgen.env
subgen.py exists and UPDATE is set to False, skipping download.
Launching subgen.py
2024-06-08 23:31:09,041 INFO: Subgen v2024.6.4.83
2024-06-08 23:31:09,042 INFO: Starting Subgen with listening webhooks!
2024-06-08 23:31:09,042 INFO: Transcriptions are limited to running 2 at a time
2024-06-08 23:31:09,042 INFO: Running 4 threads per transcription
2024-06-08 23:31:09,042 INFO: Using cuda to encode
2024-06-08 23:31:09,042 INFO: Using faster-whisper
2024-06-08 23:31:09,042 INFO: Starting to search folders to see if we need to create subtitles.
2024-06-08 23:31:09,042 DEBUG: The folders are:
2024-06-08 23:31:09,042 DEBUG: /share/Videos 
2024-06-08 23:31:09,043 DEBUG:  /share/Music
2024-06-08 23:31:09,044 INFO: Finished searching and queueing files for transcription. Now watching for new files.
INFO:     Started server process [64353]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:9000 (Press CTRL+C to quit)
2024-06-08 23:31:23,851 DEBUG: Plex event detected is: media.play
2024-06-08 23:31:23,875 DEBUG: Path of file: /media/Videos/__PT/PT_Movies/DouBan.2021.11.11.Top.250.BluRay.1080p.x265.10bit.MNHD-FRDS/教父.The.Godfather.1972.BluRay.1080p.x265.10bit.2Audio.MNHD-FRDS/The.Godfather.1972.BluRay.1080p.x265.10bit.2Audio.MNHD-FRDS.mkv
2024-06-08 23:31:23,875 DEBUG: Updated path: /share/Videos/__PT/PT_Movies/DouBan.2021.11.11.Top.250.BluRay.1080p.x265.10bit.MNHD-FRDS/教父.The.Godfather.1972.BluRay.1080p.x265.10bit.2Audio.MNHD-FRDS/The.Godfather.1972.BluRay.1080p.x265.10bit.2Audio.MNHD-FRDS.mkv
2024-06-08 23:31:23,903 DEBUG: No subtitles in '' language found in the video.
2024-06-08 23:31:23,908 INFO: Added The.Godfather.1972.BluRay.1080p.x265.10bit.2Audio.MNHD-FRDS.mkv for transcription.
2024-06-08 23:31:23,909 INFO: Transcribing file: The.Godfather.1972.BluRay.1080p.x265.10bit.2Audio.MNHD-FRDS.mkv
2024-06-08 23:31:23,909 DEBUG: Model was purged, need to re-create
2024-06-08 23:31:23,913 INFO: Metadata refresh initiated successfully.
2024-06-08 23:31:23,913 INFO: Metadata for item 1297 refreshed successfully.
INFO:     172.17.0.2:55964 - "POST /plex HTTP/1.1" 200 OK
2024-06-08 23:32:43,854 INFO: Processing audio with duration 02:57:09.184
2024-06-08 23:32:58,684 INFO: Detected language 'cy' with probability 0.41
Detected Language: welsh
2024-06-08 23:40:47,297 DEBUG: No speech threshold is met (0.841299 > 0.600000)
Transcribe:  13%|█████████████▊                                                                                         | 1429.76/10629.18 [08:09<24:14,  6.33sec/s]

nvidia-smi

(base) zheng@ubuntu-server-vm:~/Application/subgen$ watch -n 2 nvidia-smi
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.67                 Driver Version: 550.67         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1070        Off |   00000000:01:00.0 Off |                  N/A |
| 55%   82C    P2            103W /  151W |    1983MiB /   8192MiB |     96%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A     64353      C   python3                                      1978MiB |
+-----------------------------------------------------------------------------------------+