devmaxxing / videocr-PaddleOCR

Extract hardcoded subtitles from videos using machine learning
MIT License

videocr

Extract hardcoded (burned-in) subtitles from videos in Python using the PaddleOCR engine. A Colab notebook for installing and running this library is included for convenience.

# example.py

from videocr import save_subtitles_to_file

if __name__ == '__main__':
    save_subtitles_to_file(
        'example_cropped.mp4', 'example.srt', lang='ch',
        time_start='7:10', time_end='7:34',
        sim_threshold=80, conf_threshold=75, use_fullframe=True,
        brightness_threshold=210, similar_image_threshold=1000, frames_to_skip=1)

$ python3 example.py

example.srt:

0
00:07:10,000 --> 00:07:10,083
商城......现在没什么东西

1
00:07:10,416 --> 00:07:12,000
这边是战斗辅助系统

2
00:07:13,083 --> 00:07:14,500
要进去才能了解了

3
00:07:15,083 --> 00:07:15,916
没问题了吧

4
00:07:16,333 --> 00:07:17,166
我们准备登录

5
00:07:18,416 --> 00:07:21,083
啊对了, 登录没有服务器的选择么

6
00:07:21,333 --> 00:07:25,000
没有本游戏所有玩家, 都在个服务器内

7
00:07:25,833 --> 00:07:28,833
刺激了, 这么多玩家居然都不分流的么

8
00:07:29,500 --> 00:07:31,083
那......现在登录吗?

9
00:07:31,166 --> 00:07:32,416
好,登录吧!

Install prerequisites

Python 3.7 - 3.10

paddlepaddle or paddlepaddle-gpu. See https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/en/install/pip/linux-pip_en.html for installation instructions.

Installation

pip install git+https://github.com/oliverfei/videocr-PaddleOCR.git

Alternatively for development:

  1. Clone this repo
  2. From the root directory of this repository, run: python -m pip install .

Performance

The OCR process can be very slow on CPU. Running with paddlepaddle-gpu is recommended if you have a CUDA GPU.
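One way to choose the use_gpu argument at runtime is to ask Paddle whether the installed build has CUDA support. The sketch below is not part of videocr: detect_use_gpu is a hypothetical helper, and it falls back to False when paddle cannot be imported. It only reports how the package was built and does not verify that a GPU is actually present.

```python
def detect_use_gpu() -> bool:
    """Return True if the installed paddle build was compiled with CUDA support."""
    try:
        import paddle  # provided by either paddlepaddle or paddlepaddle-gpu
    except ImportError:
        return False  # paddle is not installed at all
    return bool(paddle.device.is_compiled_with_cuda())


if __name__ == '__main__':
    print('use_gpu =', detect_use_gpu())
```

The result could then be passed as use_gpu=detect_use_gpu() to get_subtitles or save_subtitles_to_file.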

Tips

To reduce the time spent on OCR for each frame, use the crop_x, crop_y, crop_width, and crop_height parameters to restrict OCR to the area of the video where the subtitles appear. When cropping, leave some buffer space above and below the text to ensure accurate readings.
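As a rough illustration of the cropping tip, the sketch below computes crop values for a strip at the bottom of the frame, with some extra padding above the text. It assumes that crop_x and crop_y give the top-left corner of the crop box in pixels; bottom_strip_crop, strip_fraction, and padding are hypothetical names that are not part of videocr.

```python
def bottom_strip_crop(frame_width, frame_height, strip_fraction=0.25, padding=10):
    """Build crop_* keyword arguments covering the bottom strip of a frame.

    Assumes crop_x/crop_y are the top-left corner of the crop box in pixels.
    """
    if not 0 < strip_fraction <= 1:
        raise ValueError('strip_fraction must be in (0, 1]')
    strip_height = int(frame_height * strip_fraction)
    # Extend the box upward by `padding` pixels, without leaving the frame.
    crop_y = max(0, frame_height - strip_height - padding)
    return {
        'crop_x': 0,
        'crop_y': crop_y,
        'crop_width': frame_width,
        'crop_height': frame_height - crop_y,
    }


crop = bottom_strip_crop(1920, 1080)
print(crop)
```

The resulting dictionary could be unpacked into a call such as save_subtitles_to_file('video.mp4', 'video.srt', **crop). Adjust strip_fraction and padding to match where the subtitles sit in your video.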

Quick Configuration Cheatsheet

| | More Speed | More Accuracy | Notes |
| --- | --- | --- | --- |
| Input Video Quality | Use lower quality | Use higher quality | Performance impact of using higher resolution video can be reduced with cropping |
| frames_to_skip | Higher number | Lower number | |
| brightness_threshold | Higher threshold | N/A | A brightness threshold can help speed up the OCR process by filtering out dark frames. In certain circumstances such as when subtitles are white and against a bright background, it may also help with accuracy. |
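To get a feel for the frames_to_skip trade-off, here is a rough estimate. It assumes that frames_to_skip=N means N frames are skipped after each processed frame, so every (N+1)-th frame goes through OCR. The function names and the 24 fps figure are illustrative assumptions. Under them, frames_to_skip=1 gives a timestamp step of about 83 ms, which is consistent with the timestamps in example.srt.

```python
import math


def frames_processed(total_frames, frames_to_skip):
    """Number of frames sent to OCR if every (frames_to_skip + 1)-th frame is used."""
    return math.ceil(total_frames / (frames_to_skip + 1))


def timestamp_step_ms(fps, frames_to_skip):
    """Smallest gap between two subtitle timestamps, in milliseconds."""
    return 1000 * (frames_to_skip + 1) / fps


# A 24-second clip (7:10 to 7:34) at an assumed 24 fps.
total = 24 * 24
for skip in (0, 1, 2, 5):
    print(skip, frames_processed(total, skip), round(timestamp_step_ms(24, skip), 1))
```

A higher frames_to_skip reduces OCR work roughly in proportion, at the cost of coarser subtitle start and end times and a greater chance of missing short-lived subtitles.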

API

  1. Return subtitle string in SRT format

    get_subtitles(
        video_path: str, lang='ch', time_start='0:00', time_end='',
        conf_threshold=75, sim_threshold=80, use_fullframe=False,
        det_model_dir=None, rec_model_dir=None, use_gpu=False,
        brightness_threshold=None, similar_image_threshold=100, similar_pixel_threshold=25, frames_to_skip=1,
        crop_x=None, crop_y=None, crop_width=None, crop_height=None)
  2. Write subtitles in SRT format to file_path

    save_subtitles_to_file(
        video_path: str, file_path='subtitle.srt', lang='ch', time_start='0:00', time_end='', 
        conf_threshold=75, sim_threshold=80, use_fullframe=False,
        det_model_dir=None, rec_model_dir=None, use_gpu=False,
        brightness_threshold=None, similar_image_threshold=100, similar_pixel_threshold=25, frames_to_skip=1,
        crop_x=None, crop_y=None, crop_width=None, crop_height=None)
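Since get_subtitles returns the subtitles as an SRT string, you may want to post-process the result in Python. The following is a minimal sketch of an SRT parser using only the standard library. parse_time and parse_srt are hypothetical helpers that are not part of videocr, and the sample text is taken from the example output above.

```python
import re
from datetime import timedelta

SRT_TIME = re.compile(r'(\d+):(\d{2}):(\d{2}),(\d{3})')


def parse_time(stamp):
    """Convert an SRT timestamp such as '00:07:10,416' to a timedelta."""
    match = SRT_TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f'not an SRT timestamp: {stamp!r}')
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds, milliseconds=millis)


def parse_srt(srt_text):
    """Split SRT text into (index, start, end, text) tuples."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r'\n\s*\n', srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip malformed or empty blocks
        start, end = (parse_time(part) for part in lines[1].split('-->'))
        cues.append((int(lines[0]), start, end, '\n'.join(lines[2:])))
    return cues


sample = """0
00:07:10,000 --> 00:07:10,083
商城......现在没什么东西

1
00:07:10,416 --> 00:07:12,000
这边是战斗辅助系统
"""

for index, start, end, text in parse_srt(sample):
    print(index, (end - start).total_seconds(), text)
```

With videocr installed, the string returned by get_subtitles('video.mp4') could be passed to parse_srt in place of sample.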

Parameters

TODO