machinewrapped / gpt-subtrans

Open Source project using LLMs to translate SRT subtitles

Docker package? #150

Open Maitresinh opened 6 months ago

Maitresinh commented 6 months ago

Wondering if I could manage to install it on Unraid for Bazarr...

machinewrapped commented 6 months ago

I don't know much about docker, you'll have to give me more details :-)

Maitresinh commented 6 months ago

I'm afraid I can't. I'm just a basic user...

IlmariKu commented 6 months ago

I could consider doing this. No promises, but I'll keep it in mind.

IlmariKu commented 5 months ago

Any wishes on how it would operate? As a permanent service, via an HTTP API, or by just running the Docker image as a task?

Maitresinh commented 5 months ago

Well, for Unraid, a Docker image seems the better way, don't you think? Everyone could use the Community Apps tab to find and install it.

IlmariKu commented 5 months ago

Well, yes, but I don't have any idea what kind of API Bazarr expects from the image

Maitresinh commented 5 months ago

I asked about it here: https://github.com/morpheus65535/bazarr/issues/2450

hnorgaar commented 5 months ago

Bazarr is written in Python, and this is the file for Whisper AI, which I believe will be similar for GPT: whisperai.zip

Maitresinh commented 4 months ago

Just checking if the project is still on track. I'm using Whisper a lot with Bazarr. It's just a pity that the current integrated translator is not as good as Whisper.

Maitresinh commented 3 months ago

OK, so I asked GPT-4TO how to install GPT-Subtrans on an Unraid server and use it inside Bazarr.

Here is its answer. It may be a good base, but I tried it and got an error when creating the container (when entering the GitHub link).

1. Clone the GPT-Subtrans repository. Open a terminal and run:

    git clone https://github.com/machinewrapped/gpt-subtrans.git
    cd gpt-subtrans

2. Create a Dockerfile. In the cloned directory, create a Dockerfile with the following content:

    FROM python:3.8-slim

    WORKDIR /app

    COPY . /app

    RUN pip install --upgrade pip && pip install -r requirements.txt

    EXPOSE 8501

    CMD ["streamlit", "run", "gui_subtrans.py"]

3. Build the Docker image:

    docker build -t gpt-subtrans-gui .

4. Run the container:

    docker run -d -p 8501:8501 --name=gpt-subtrans-gui \
        -v /mnt/user/appdata/gpt-subtrans/config:/app/config \
        -v /mnt/user/appdata/gpt-subtrans/subtitles:/app/subtitles \
        gpt-subtrans-gui

Verify Container Logs

1. Check the container logs for errors. If the container fails, check the logs for specific errors:

    docker logs gpt-subtrans-gui

Write a Post-Processing Script

1. Create a subtitle monitoring script. Create a bash script to monitor new subtitle files and translate them if they are not in French or Spanish. Example script:

    #!/bin/bash

    SUBTITLES_DIR="/mnt/user/appdata/bazarr/subtitles"
    PROCESSED_DIR="/mnt/user/appdata/gpt-subtrans/processed"

    # Print the language code detected for the subtitle file passed as $1.
    # The file path is handed to Python as an argument ("-" reads the script from stdin).
    detect_language() {
    python3 - "$1" <<'END'
    import sys
    from langdetect import detect

    file_path = sys.argv[1]

    def read_subtitle_file(file_path):
        with open(file_path, 'r', encoding='utf-8') as file:
            return file.read()

    try:
        text = read_subtitle_file(file_path)
        lang = detect(text)
        print(lang)
    except Exception:
        print("unknown")
    END
    }

    # Watch for newly created files and handle each .srt file as it appears.
    inotifywait -m -e create "$SUBTITLES_DIR" | while read -r path action file; do
        if [[ "$file" =~ \.srt$ ]]; then
            echo "New subtitle detected: $file"
            lang=$(detect_language "$SUBTITLES_DIR/$file")
            echo "Detected language: $lang"

            if [[ "$lang" != "fr" && "$lang" != "es" ]]; then
                echo "Translating $file to French..."
                python /path/to/gpt-subtrans/gpt_subtrans.py "$SUBTITLES_DIR/$file" --target_language fr
                mv "$SUBTITLES_DIR/$file" "$PROCESSED_DIR/"
            else
                echo "Skipping translation for $file as it is already in $lang"
                mv "$SUBTITLES_DIR/$file" "$PROCESSED_DIR/"
            fi
        fi
    done
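Where inotifywait is not available, the watch-and-filter step of the script above could be approximated by polling the directory from Python. This is only a sketch under stated assumptions: `find_new_subtitles` and the `seen` set are hypothetical names for illustration, not part of GPT-Subtrans or Bazarr.

```python
from pathlib import Path


def find_new_subtitles(directory, seen):
    """Return .srt files in `directory` that are not yet recorded in `seen`.

    `seen` is a set of file names that is updated in place, so repeated
    calls report each subtitle file only once.
    """
    new_files = []
    for path in sorted(Path(directory).iterdir()):
        # Match the extension case-insensitively and skip directories.
        if path.is_file() and path.suffix.lower() == ".srt" and path.name not in seen:
            seen.add(path.name)
            new_files.append(path)
    return new_files
```

Calling this function in a loop with a short sleep between calls would give roughly the same behaviour as the inotifywait pipeline, at the cost of a small delay.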

Integrate with Bazarr

1. Configure Bazarr to use the subtitles directory. Make sure Bazarr is set to save subtitles in the directory monitored by your script.

Example requirements.txt

Ensure your requirements.txt includes all necessary dependencies:

    streamlit
    langdetect
    openai
    # Add other necessary dependencies here

By following these steps, you should be able to install and run GPT-Subtrans with its GUI on Unraid and integrate it with Bazarr using a post-processing script. If you encounter any errors, check the container logs for more information and adjust the script accordingly. If you need further assistance with specific errors, please provide the error messages for more detailed help.

hnorgaar commented 3 months ago

Why not just call "python3 batch_process.py", with an added IF statement pointing to your video folder, from Bazarr post-processing? If all the required stuff is installed, and as Bazarr is written in Python, I guess that will work just fine. Remember to edit that file for your setup.
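As an illustration of the "IF statement" idea, the folder check could look roughly like the Python sketch below. The function name `is_in_folder` and the example paths are hypothetical; nothing here comes from batch_process.py itself.

```python
from pathlib import PurePosixPath


def is_in_folder(video_path, target_folder):
    """Return True if video_path lies inside target_folder at any depth."""
    video = PurePosixPath(video_path)
    target = PurePosixPath(target_folder)
    # Comparing path components avoids false matches such as
    # "/media/tv2" being treated as inside "/media/tv".
    return target in video.parents
```

For example, `is_in_folder("/mnt/user/media/tv/show/ep.mkv", "/mnt/user/media/tv")` is true, while a file under `/mnt/user/media/tv2` is rejected.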

Maitresinh commented 1 week ago

My skills are very limited, so if you could elaborate a bit on that, that would be great.

Maitresinh commented 1 week ago

I've asked GPT about that. Below is the tutorial. Any errors?

Tutorial: Integrate GPT Subtrans with Bazarr Using a Post-Processing Script

Objective:

Configure Bazarr to automatically execute batch_process.py after subtitle downloads, with a condition that checks if the video file is located in a specific folder.

Step 1: Prerequisites

- Bazarr installed and functioning on your system.
- Python 3 installed on the system where Bazarr is running.
- The batch_process.py script from GPT Subtrans ready for use.
- All necessary dependencies for GPT Subtrans installed.
- Read and write access to the file system where Bazarr and the script operate.

Step 2: Prepare the batch_process.py Script

Ensure that:

- The batch_process.py script is functional and executable from the command line.
- The script accepts the path to the subtitle file to be processed as an argument.
- All dependencies for GPT Subtrans are installed (e.g., via pip install gpt-subtrans).

Step 3: Create the Post-Processing Script for Bazarr

We'll create a shell script that Bazarr will call after downloading subtitles.

3.1. Create the bazarr_postprocess.sh Script

Create a file named bazarr_postprocess.sh in a directory accessible by Bazarr, for example, /usr/local/bin/bazarr_postprocess.sh.

Script Content:

    #!/bin/bash

    # Paths passed as arguments by Bazarr
    SUBTITLE_PATH="$1"
    VIDEO_PATH="$2"

    # Path to the target video folder
    TARGET_VIDEO_FOLDER="/path/to/your/video/folder"

    # Check if the video file is in the target folder
    if [[ "$VIDEO_PATH" == "$TARGET_VIDEO_FOLDER"* ]]; then
        echo "The video file is in the target folder. Executing GPT Subtrans."

        # Run 'batch_process.py' on the subtitle file
        python3 /path/to/batch_process.py "$SUBTITLE_PATH"

        # Check if the script executed successfully
        if [[ $? -eq 0 ]]; then
            echo "GPT Subtrans processed the subtitle successfully."
        else
            echo "Error executing GPT Subtrans."
        fi
    else
        echo "The video file is not in the target folder. The script will not be executed."
    fi

Instructions:

- Replace /path/to/your/video/folder with the actual path to your video folder.
- Replace /path/to/batch_process.py with the actual path to your batch_process.py script.
- Ensure that the batch_process.py script is executable and works correctly.

Make the script executable:

    chmod +x /usr/local/bin/bazarr_postprocess.sh

Step 4: Configure Bazarr to Use the Post-Processing Script

Access Bazarr Settings:

- Open Bazarr's web interface.
- Go to "Settings" > "Processing".

Enable Post-Processing Scripts:

- Enable the "Post-Processing Scripts" option.
- In the "Post-Processing Script" field, enter the path to your script:

    /usr/local/bin/bazarr_postprocess.sh

- In the script's "Arguments" field, add the following variables:

    "%SUBTITLE_PATH%" "%VIDEO_PATH%"

  This allows Bazarr to pass the subtitle file path and the video file path to your script.

Save the Settings:

- Click "Save" to apply your changes.

Step 5: Test the Configuration

Manually Test the Script:

Run the script from the command line to ensure it works correctly:

    /usr/local/bin/bazarr_postprocess.sh "/path/to/example.srt" "/path/to/your/video/folder/example.mkv"

Check the messages displayed to ensure GPT Subtrans is executed when the video file is in the target folder.

Test with Bazarr:

- Download a subtitle via Bazarr for a video file located in the target folder.
- Check Bazarr's logs to ensure the post-processing script is executed.

Step 6: Additional Considerations

6.1. Permission Management:

Ensure that the user running Bazarr has the necessary permissions to:

- Execute the bazarr_postprocess.sh script.
- Read and write to the folders where the subtitle and video files are located.
- Access the Python interpreter and required modules.

6.2. Environment Variables:

If GPT Subtrans requires environment variables (e.g., an OpenAI API key), ensure they are available in the environment where the script runs.

You can export these variables at the beginning of the script:

    export OPENAI_API_KEY="your_api_key"

6.3. Logging and Debugging:

To facilitate debugging, you can redirect the script's output to a log file:

    # Add to your script
    LOG_FILE="/path/to/bazarr_postprocess.log"
    exec >> "$LOG_FILE" 2>&1

This will record all script messages in the specified file.

Step 7: Complete Example of the Script with Logging

    #!/bin/bash

    # Paths passed as arguments by Bazarr
    SUBTITLE_PATH="$1"
    VIDEO_PATH="$2"

    # Path to the target video folder
    TARGET_VIDEO_FOLDER="/path/to/your/video/folder"

    # Log file
    LOG_FILE="/path/to/bazarr_postprocess.log"

    # Log the date and time
    echo "-----------------------------------" >> "$LOG_FILE"
    echo "Date: $(date)" >> "$LOG_FILE"

    # Check if the video file is in the target folder
    if [[ "$VIDEO_PATH" == "$TARGET_VIDEO_FOLDER"* ]]; then
        echo "The video file is in the target folder. Executing GPT Subtrans." >> "$LOG_FILE"

        # Run 'batch_process.py' on the subtitle file
        python3 /path/to/batch_process.py "$SUBTITLE_PATH" >> "$LOG_FILE" 2>&1

        # Check if the execution was successful
        if [[ $? -eq 0 ]]; then
            echo "GPT Subtrans processed the subtitle successfully." >> "$LOG_FILE"
        else
            echo "Error executing GPT Subtrans." >> "$LOG_FILE"
        fi
    else
        echo "The video file is not in the target folder. The script will not be executed." >> "$LOG_FILE"
    fi

Step 8: Final Verification

- Test the script again to ensure everything works as expected.
- Check the log file to verify that messages are correctly recorded.

Advantages of This Approach

- Simplicity: No need for complex configurations or major changes in Bazarr.
- Control: Using the if condition allows you to decide when the script is executed.
- Integration: Leverages Bazarr's existing functionalities for seamless integration.
- Customization: You can adapt the script to meet specific needs (e.g., different target folders, multiple conditions).
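Since Bazarr and GPT Subtrans are both Python, the run-and-log part of the tutorial could also be written in Python rather than bash. The following is a minimal sketch, not a tested integration: `run_and_log` is a hypothetical helper, and the command and log path in the usage comment are the same placeholders as in the tutorial.

```python
import subprocess
from datetime import datetime


def run_and_log(command, log_path):
    """Run `command` (a list of arguments), append its output to log_path,
    and return True if it exited with status 0."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write("-----------------------------------\n")
        log.write(f"Date: {datetime.now().isoformat(timespec='seconds')}\n")
        # Flush our own lines before the child process writes to the same file.
        log.flush()
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
        if result.returncode == 0:
            log.write("Command succeeded.\n")
        else:
            log.write(f"Command failed with exit code {result.returncode}.\n")
    return result.returncode == 0


# Hypothetical usage, with placeholder paths:
# run_and_log(["python3", "/path/to/batch_process.py", "/path/to/example.srt"],
#             "/path/to/bazarr_postprocess.log")
```

Passing the command as a list avoids shell quoting problems with subtitle paths that contain spaces.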

hnorgaar commented 1 week ago

To make things easier, you can use "python3 /path/to/batch_process.py {{subtitles}}".

Below are the arguments Bazarr can use

directory: Full path of the episode file parent directory
episode: Full path of the episode file
episode_name: Filename of the episode without parent directory or extension
subtitles: Full path of the subtitles file
subtitles_language: Language of the subtitles file (may include HI or forced)
subtitles_language_code2: 2-letter ISO-639 language code of the subtitles language (may include :hi or :forced)
subtitles_language_code2_dot: 2-letter ISO-639 language code of the subtitles language (same as previous but with dot separator instead of colon)
subtitles_language_code3: 3-letter ISO-639 language code of the subtitles language (may include :hi or :forced)
subtitles_language_code3_dot: 3-letter ISO-639 language code of the subtitles language (same as previous but with dot separator instead of colon)
episode_language: Audio language of the episode file
episode_language_code2: 2-letter ISO-639 language code of the episode audio language
episode_language_code3: 3-letter ISO-639 language code of the episode audio language
score: Score of the subtitle file
subtitle_id: Provider ID of the subtitle file
provider: Provider of the subtitle file
uploader: Uploader of the subtitle file
release_info: Release info for the subtitle file
series_id: Sonarr series ID (Empty if movie)
episode_id: Sonarr episode ID or Radarr movie ID
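Because the language-code arguments may carry a `:hi` or `:forced` suffix (or the dot-separated equivalent), a script that decides whether to translate would probably need to strip that suffix first. The sketch below assumes values shaped like `fr:hi` or `fr.forced`, which is inferred from the descriptions above rather than checked against Bazarr; both function names are hypothetical.

```python
def split_language_code(code):
    """Split a code such as 'fr:hi' or 'fr.forced' into (language, modifier).

    The modifier is '' when the code carries no suffix.
    """
    for separator in (":", "."):
        language, found, modifier = code.partition(separator)
        if found:
            return language, modifier
    return code, ""


def needs_translation(code, target="fr"):
    """Return True unless the subtitle language already matches `target`."""
    language, _ = split_language_code(code)
    return language.lower() != target.lower()
```

With this, `needs_translation("fr:forced")` is false and `needs_translation("en")` is true, so only non-French subtitles would be passed on to batch_process.py.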

Maitresinh commented 1 week ago

I've tried that (giving these arguments to GPT), but so far I could not get it to work.

Here is a detailed step-by-step guide for installing GPT-Subtrans with Bazarr, including configuration for automatic subtitle translation and manual subtitle selection using a GUI or frontend.

Step 1: Setting up the Environment

- Ensure Docker is installed on your unRAID server if it's not already set up.
- Install Bazarr on unRAID: configure Bazarr to manage your media library's subtitles by linking it to your downloaders (Transmission, etc.) and media managers (Sonarr, Radarr).

Step 2: Create the Docker Container for GPT-Subtrans

1. Create a Dockerfile for GPT-Subtrans

Create a directory for GPT-Subtrans:

    mkdir -p /mnt/user/appdata/gpt-subtrans-gui
    cd /mnt/user/appdata/gpt-subtrans-gui

Create a Dockerfile in this directory:

    nano Dockerfile

Add the following content to the Dockerfile:

    FROM python:3.10-slim

    # Install git, fluxbox, xvfb, and other dependencies for GUI
    RUN apt-get update && apt-get install -y git fluxbox x11vnc xvfb

    # Clone the GPT-Subtrans repository
    RUN git clone https://github.com/machinewrapped/gpt-subtrans.git /app/gpt-subtrans

    # Install GPT-Subtrans dependencies
    WORKDIR /app/gpt-subtrans
    RUN bash install.sh

    # Install Flask for frontend (optional)
    RUN pip install flask

    # Install necessary translation SDKs
    RUN pip install openai google.generativeai anthropic

    # Start a VNC server to allow GUI access
    CMD ["bash", "-c", "xvfb-run -s '-screen 0 1024x768x16' python3 gui-subtrans.py"]

Save and exit (CTRL + X, then Y, then Enter).

1. Build the Docker Image

Navigate to the directory where the Dockerfile is located:

    cd /mnt/user/appdata/gpt-subtrans-gui

Build the Docker image from the Dockerfile:

    docker build -t gpt-subtrans-gui:latest .

1. Configure the OpenAI API Key

You need to include your OpenAI API key for GPT-Subtrans to function. You can do this either by using a .env file or as an environment variable in Docker.

Option 1: Use a .env File for the API Key

Create a .env file in the GPT-Subtrans directory:

    nano /mnt/user/appdata/gpt-subtrans-gui/.env

Add your OpenAI API key to the file:

    OPENAI_API_KEY=your_openai_api_key

Save and exit (CTRL + X, then Y, then Enter).
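To make the .env format concrete, here is a rough Python sketch of what reading such a file involves. It is an illustration only: `read_env_file` is a hypothetical helper, and whether GPT-Subtrans picks up a .env file from this particular location is an assumption that has not been verified here.

```python
def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Drop optional surrounding quotes around the value.
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values
```

A caller could then check that `"OPENAI_API_KEY"` is present in the returned dictionary before starting a translation.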

Option 2: Pass the API Key as an Environment Variable If you prefer not to use a .env file, you can pass the API key as a Docker environment variable during container creation.

1. Create and Start the Docker Container

Create a Docker container from the image.

If you used the .env file for the API key:

    docker create \
        --name=gpt-subtrans \
        -v /mnt/user/data/media:/media \
        -p 5900:5900 \
        gpt-subtrans-gui:latest

OR, if you pass the API key as an environment variable:

    docker create \
        --name=gpt-subtrans \
        -v /mnt/user/data/media:/media \
        -e OPENAI_API_KEY="your_openai_api_key" \
        -p 5900:5900 \
        gpt-subtrans-gui:latest

- -p 5900:5900: Exposes the port for accessing the GUI via VNC.
- OPENAI_API_KEY="your_openai_api_key": Replace with your actual OpenAI API key if passing it directly.

Start the container:

    docker start gpt-subtrans

Access the GUI (Optional):

- Use a VNC client (e.g., RealVNC or TightVNC) to connect to your unRAID server at :5900.
- The GUI will allow you to manually select and translate subtitles using GPT-Subtrans.

Step 3: Automate Subtitle Translation with Bazarr

1. Create a Post-Processing Script for Bazarr

Create a post-processing script for Bazarr to automatically invoke GPT-Subtrans after subtitles are downloaded:

    nano /mnt/user/appdata/bazarr/gpt_subtrans_postprocess.sh

Add the following code to the script:

    #!/bin/bash

    # Environment variables passed by Bazarr
    SUBTITLE_PATH="${SUBTITLE_PATH}"
    SUBTITLE_LANGUAGE_CODE3="${SUBTITLE_LANGUAGE_CODE3}"

    echo "Subtitle file path: $SUBTITLE_PATH"
    echo "Subtitle language: $SUBTITLE_LANGUAGE_CODE3"

    if [[ "$SUBTITLE_LANGUAGE_CODE3" == "fre" ]]; then
        echo "Subtitles are already in French. No translation needed."
    else
        echo "Translating subtitles $SUBTITLE_PATH to French."
        docker exec gpt-subtrans python3 /app/gpt-subtrans/batch_process.py "$SUBTITLE_PATH" --target-lang fr
    fi

Make the script executable:

    chmod +x /mnt/user/appdata/bazarr/gpt_subtrans_postprocess.sh

1. Configure Bazarr to Use the Script

- Open Bazarr's web interface.
- Go to Settings > Processing > Post-Processing Scripts.
- Add the path to the post-processing script:

    /mnt/user/appdata/bazarr/gpt_subtrans_postprocess.sh

Bazarr will now trigger GPT-Subtrans to translate the subtitles after downloading them.

Optional: Set Up a Frontend for Subtitle Selection If you want a more visual way to choose which subtitles to translate, you can set up a Flask-based frontend.

1. Create a Flask Frontend for Subtitle Selection

Create an application directory:

    mkdir /mnt/user/appdata/gpt-subtrans-gui/app
    cd /mnt/user/appdata/gpt-subtrans-gui/app

Create a Python file for the Flask app:

    nano app.py

Add the following code:

    from flask import Flask, render_template, request
    import os
    import subprocess

    app = Flask(__name__)

    @app.route('/')
    def index():
        # List all .srt files in the /media directory
        subtitles = []
        for root, dirs, files in os.walk("/media"):
            for file in files:
                if file.endswith(".srt"):
                    subtitles.append(os.path.join(root, file))
        return render_template('index.html', subtitles=subtitles)

    @app.route('/translate', methods=['POST'])
    def translate():
        subtitle_file = request.form['subtitle']
        # Call GPT-Subtrans to translate the subtitles
        subprocess.call(['docker', 'exec', 'gpt-subtrans', 'python3', '/app/gpt-subtrans/batch_process.py', subtitle_file, '--target-lang', 'fr'])
        return f"Translating {subtitle_file} to French"

    if __name__ == '__main__':
        app.run(debug=True, host='0.0.0.0', port=5000)

Create the HTML template in the templates/ directory:

    mkdir templates
    nano templates/index.html

Add the following HTML:

    <!DOCTYPE html>

GPT-Subtrans Frontend

Available Subtitles

Rebuild the Docker image with Flask:

    docker build -t gpt-subtrans-gui:latest .

Start the Flask frontend:

The Flask app will be available at http://:5000, allowing you to select subtitles manually for translation.