This repository contains a Python script that transcribes and translates live audio from a Twitch stream. The script uses OpenAI's Whisper model for transcription and Hugging Face's MarianMT models for translation. It supports dynamic translation for most languages: 20 have been tested so far, but hundreds should work.
In short, if you speak X and want to understand a stream where the streamer speaks Y, this will do that.
If you like this repo, give it a star to help others find it!
## Clone the Repository

```bash
git clone https://github.com/gorgarp/TwitchTranslate.git
cd TwitchTranslate
```
## Install FFmpeg

macOS (Homebrew):

```bash
brew install ffmpeg
```

Debian/Ubuntu:

```bash
sudo apt-get install ffmpeg
```
## Set Up a Virtual Environment

Using a virtual environment keeps the script's dependencies isolated from your system Python environment.

Create a virtual environment:

```bash
python -m venv myenv
```

Activate it (Windows):

```bash
myenv\Scripts\activate
```

Activate it (macOS/Linux):

```bash
source myenv/bin/activate
```

## Install the Required Python Packages

```bash
pip install -r requirements.txt
```
## Set Up Twitch API Token

Replace `YOUR_TWITCH_CLIENT_ID` and `YOUR_TWITCH_CLIENT_SECRET` in the script with your actual Twitch API credentials (`CLIENT_ID` and `CLIENT_SECRET`).

## Configure the Twitch Channel

Replace `YOUR_TWITCH_CHANNEL_NAME` with the name of the Twitch channel you want to transcribe and translate.

## Usage

```bash
python transcribe_translate.py <source_lang> <target_lang>
```

Examples:

```bash
python transcribe_translate.py es en  # Translates Spanish to English
python transcribe_translate.py pl en  # Translates Polish to English
```
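The command-line handling can be sketched roughly as below; `parse_args` is a hypothetical helper for illustration, and the actual script's argument handling may differ.

```python
import sys

def parse_args(argv):
    """Validate and return (source_lang, target_lang) from command-line arguments."""
    if len(argv) != 2:
        raise SystemExit("Usage: python transcribe_translate.py <source_lang> <target_lang>")
    source_lang, target_lang = argv
    return source_lang, target_lang

if __name__ == "__main__":
    source_lang, target_lang = parse_args(sys.argv[1:])
    print(f"Translating {source_lang} -> {target_lang}")
```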
## Confirmed Languages

```
"en", "fr", "de", "es", "it", "nl", "sv", "pl", "pt", "ru", "zh", "ja", "ko",
"ar", "tr", "da", "fi", "no", "cs", "el"
```

Note: These languages have been tested, but the script should work with any language pair available from Helsinki-NLP on Hugging Face.
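As an illustration, a requested pair could be checked against the tested set before warning the user that it is untested; the `CONFIRMED_LANGS` constant and helper below are hypothetical and not taken from the script.

```python
# Languages confirmed to work so far (hypothetical constant; the script may
# not define such a set explicitly).
CONFIRMED_LANGS = {
    "en", "fr", "de", "es", "it", "nl", "sv", "pl", "pt", "ru", "zh", "ja",
    "ko", "ar", "tr", "da", "fi", "no", "cs", "el",
}

def pair_is_confirmed(source_lang: str, target_lang: str) -> bool:
    """Return True when both languages are in the tested set."""
    return source_lang in CONFIRMED_LANGS and target_lang in CONFIRMED_LANGS
```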
## Add Language Pair to Exceptions List

If a language pair is not available under the standard model name `Helsinki-NLP/opus-mt-{source_lang}-{target_lang}`, you need to add an exception. Update the `exceptions` dictionary in the script with the new language pair and the corresponding model name (see Exceptions Handling).

## Features

- Transcription
- Translation
- System messages and error handling
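The exceptions lookup described above might look like this; this is a sketch, and the actual dictionary contents and function name in the script may differ.

```python
# Pairs without a dedicated Helsinki-NLP model map to a substitute model.
# The pt-en entry mirrors the fallback mentioned elsewhere in this README;
# other entries would be added here as needed.
exceptions = {
    ("pt", "en"): "Helsinki-NLP/opus-mt-mul-en",
}

def resolve_model_name(source_lang: str, target_lang: str) -> str:
    """Return the Hugging Face model name for a language pair."""
    default = f"Helsinki-NLP/opus-mt-{source_lang}-{target_lang}"
    return exceptions.get((source_lang, target_lang), default)
```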
## Running on CUDA or CPU

The script can run on either CUDA (GPU) or CPU. Using CUDA significantly improves the performance and speed of both transcription and translation.

Checking CUDA availability:

```python
device = "cuda" if torch.cuda.is_available() else "cpu"
logging.info(f"Using device: {device}")
```
### Installing CUDA (if needed)

Windows (PowerShell):

```powershell
$env:Path += ";C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.5\bin"
nvcc --version
```

Note: The commands above reference CUDA 12.5. If you install a different version, adjust the paths accordingly.
macOS: CUDA is not supported on macOS.
Linux:

```bash
export PATH=/usr/local/cuda-12.5/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.5/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
nvcc --version
```

Note: The commands above reference CUDA 12.5. If you install a different version, adjust the paths accordingly.
### Installing PyTorch with CUDA Support

```bash
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117
```

Note: The command above installs PyTorch built against CUDA 11.7. Ensure the PyTorch build matches your installed CUDA version.
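After installing, you can verify that your PyTorch build can see the GPU with a quick check:

```python
import torch

# True when a CUDA-capable GPU and a compatible driver are present;
# the script falls back to CPU when this is False.
print(torch.cuda.is_available())

# CUDA version this PyTorch build was compiled against (None for CPU-only builds).
print(torch.version.cuda)
```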
## Model Naming Convention

The script uses the naming convention `Helsinki-NLP/opus-mt-{source_lang}-{target_lang}` for loading translation models. For certain pairs without a dedicated model (e.g., `pt-en`), the script uses `Helsinki-NLP/opus-mt-mul-en` as the model name instead.

Note: Browse Helsinki-NLP on Hugging Face to see the available language combinations.
## Contributing

```bash
git checkout -b feature-branch
git commit -m "Add some feature"
git push origin feature-branch
```

Then open a pull request against the main repository.
For any questions or issues, please open an issue in the GitHub repository.
This project is licensed under the MIT License.