Download a YouTube video (or provide your own) and generate bilingual subtitles with Whisper and a translation API. For Chinese documentation, see 中文.
This project is a Python script that downloads a YouTube video (or uses a local video file), transcribes it, translates the transcript into a target language, and generates a video with dual subtitles (original and translated). Transcription is powered by the Whisper model, and translation by one of several translation APIs (M2M100, Google Translate, or GPT-3.5).
GPT-3.5 translation compared to Google Translate
Arguments:
Additionally, when running the script for the first time, it will download the following pre-trained models:
pip install -r requirements.txt
You can provide either a YouTube URL or a local video file for processing. The script will transcribe the video, translate the transcript, and generate dual subtitles in the form of an SRT file.
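The SRT output described above pairs each original line with its translation in a single subtitle block. The sketch below (not the project's actual code; the `build_dual_srt` helper and segment dictionary keys are assumptions for illustration) shows one way such a dual-subtitle SRT file can be assembled:

```python
from datetime import timedelta

def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(timedelta(seconds=seconds).total_seconds() * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def build_dual_srt(segments) -> str:
    """Build SRT text with the original line above the translated line."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n"
            f"{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}\n"
            f"{seg['original']}\n{seg['translated']}\n"
        )
    return "\n".join(blocks)
```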
python main.py --youtube_url [YOUTUBE_URL] --target_language [TARGET_LANGUAGE] --model [WHISPER_MODEL] --translation_method [TRANSLATION_METHOD]
--youtube_url: The URL of the YouTube video.
--local_video: The path to the local video file.
--target_language: The target language for translation (default: 'zh').
--model: Choose one of the Whisper models (default: 'small', choices: ['tiny', 'base', 'small', 'medium', 'large']).
--translation_method: The method to use for translation. (default: 'google', choices: ['m2m100', 'google', 'whisper', 'gpt', 'no_translate']).
--no_transcribe: Skip the transcription step. Assumes an SRT file with the same name as the video file already exists.
Note: You must provide either --youtube_url or --local_video, but not both.
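The "either/or, but not both" rule above is what a mutually exclusive argument group enforces. Here is a minimal sketch of how the CLI described above could be wired up with argparse (the `build_parser` function name is an assumption; flag names and defaults mirror the documented arguments):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Sketch of the documented CLI; --youtube_url and --local_video are exclusive."""
    parser = argparse.ArgumentParser(description="Generate dual subtitles for a video.")
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--youtube_url", help="URL of the YouTube video")
    source.add_argument("--local_video", help="Path to a local video file")
    parser.add_argument("--target_language", default="zh")
    parser.add_argument("--model", default="small",
                        choices=["tiny", "base", "small", "medium", "large"])
    parser.add_argument("--translation_method", default="google",
                        choices=["m2m100", "google", "whisper", "gpt", "no_translate"])
    parser.add_argument("--no_transcribe", action="store_true",
                        help="Skip transcription; expects an existing SRT file")
    return parser
```

With `required=True` on the group, argparse itself rejects invocations that pass both flags or neither.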
To download a YouTube video, transcribe it, and generate subtitles in the target language using the Google Translate API:
python main.py --youtube_url [YOUTUBE_URL] --target_language 'zh' --model 'small' --translation_method 'google'
To process a local video file, transcribe it, and generate subtitles in the target language using GPT-3.5-16k (you will need to provide an OpenAI API key):
python main.py --local_video [VIDEO_FILE_PATH] --target_language 'zh' --model 'medium' --translation_method 'gpt'
The script will generate the following output files in the same directory as the input video:
This script translates subtitles using OpenAI's GPT-3.5 language model. It requires an OpenAI API key to function. In most cases, GPT-based translation produces much better results than Google Translate, especially for context-specific translations or idiomatic expressions. This script provides an alternative method for translating subtitles when traditional translation services like Google Translate do not produce satisfactory results.
OPENAI_API_KEY=your_api_key_here
Replace your_api_key_here with the API key you obtained from OpenAI.
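A small sketch of how the key could be picked up at runtime, checking the environment first and falling back to a .env file in the working directory (the `load_openai_key` helper is an assumption for illustration, not the project's actual loader):

```python
import os

def load_openai_key(env_path: str = ".env") -> str:
    """Return OPENAI_API_KEY from the environment, falling back to a .env file."""
    key = os.environ.get("OPENAI_API_KEY")
    if key:
        return key
    try:
        with open(env_path, encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if line.startswith("OPENAI_API_KEY="):
                    return line.split("=", 1)[1]
    except FileNotFoundError:
        pass
    raise RuntimeError("OPENAI_API_KEY not found in environment or .env file")
```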
python translate_gpt.py --input_file INPUT_FILE_PATH [--batch_size BATCH_SIZE] [--target_language TARGET_LANGUAGE] [--source_language SOURCE_LANGUAGE] [--video_info VIDEO_INFO] [--model MODEL_NAME] [--no_mapping] [--load_tmp_file]
You can check the response.log file in the folder containing the input video file for live updates, similar to the experience with ChatGPT.
Note:
Video Information: The --video_info argument accepts details in any language. It can be used to inform the GPT model about the video's content, improving the translation of context-specific terms, such as proper nouns within a game. For instance, if translating a video related to gaming, you might instruct GPT to use precise translations for in-game terminology.
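One plausible way --video_info can feed into the model is by folding it into the translation prompt. The sketch below is an assumption for illustration (the `build_translation_prompt` helper and its exact wording are not the project's actual prompt):

```python
def build_translation_prompt(lines, target_language="zh", video_info=None):
    """Compose system/user prompt strings; video_info adds terminology context."""
    context = f" Video context: {video_info}" if video_info else ""
    system = (
        f"Translate the following subtitle lines into {target_language}. "
        f"Preserve line order and keep proper nouns consistent.{context}"
    )
    # Number each line so the model's output can be re-aligned with the input.
    user = "\n".join(f"{i + 1}. {line}" for i, line in enumerate(lines))
    return system, user
```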
Translation Mapping: This functionality maintains consistency for frequently used terms by storing source-target translation pairs. When enabled, it prevents variations in translating terms like proper nouns and technical jargon across the video. Disable this with the --no_mapping flag if preferred.
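The stored source-target pairs can be applied as simple substitutions over each subtitle line. A minimal sketch of the idea (the `apply_term_mapping` function is hypothetical; the actual implementation may differ):

```python
def apply_term_mapping(text: str, mapping: dict) -> str:
    """Replace each remembered source term with its fixed translation.

    Longer terms are substituted first so that a short term that is a
    substring of a longer one does not clobber it.
    """
    for src in sorted(mapping, key=len, reverse=True):
        text = text.replace(src, mapping[src])
    return text
```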
Resuming Translations: Use the --load_tmp_file flag to continue a translation task from where it was previously interrupted. The script saves progress in tmp_subtitles.json, allowing for a seamless resumption without redoing prior work.
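Checkpointing of this kind usually amounts to serializing the completed batches to JSON after each one and reloading that file on startup. A hedged sketch (the `save_progress`/`load_progress` names are assumptions; only the tmp_subtitles.json filename comes from the documentation above):

```python
import json
import os

def save_progress(path: str, translated: list) -> None:
    """Write the subtitles translated so far, so an interrupted run can resume."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(translated, fh, ensure_ascii=False)

def load_progress(path: str) -> list:
    """Return previously translated subtitles, or an empty list if none exist."""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    return []
```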
Language Support: While the script excels with English-to-Simplified Chinese translations, it can accommodate other language pairs. Enhance the accuracy for additional languages by adding tailored few-shot examples to few_shot_examples.json. Note that the GPT models' performance may vary with multilingual inputs, and prompt adjustments in translate_gpt.py might be necessary.
Contributions are more than welcome!
You can also try out this script using a Google Colab notebook. Click the link below to access the example:
Follow the instructions in the notebook to download the necessary packages and models, and to run the script on your desired YouTube video or local video file.