K-RT-Dev / VGT

Program to translate Japanese text through image recognition and GPT 3.5
MIT License

Introduction

A program with a graphical interface that takes screenshots and translates the Japanese text found in them into another language. The system uses Manga-OCR to detect Japanese characters in the images and the OpenAI API to translate the text with GPT models.
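
The two-stage flow described above (OCR first, then translation) can be sketched roughly as follows. This is only an illustration, not the project's actual code: the function name translate_screenshot and the stand-ins fake_ocr and fake_translate are hypothetical. In the real program the two callables would wrap Manga-OCR and the OpenAI API.

```python
from typing import Callable, Dict


def translate_screenshot(
    image_path: str,
    ocr: Callable[[str], str],
    translate: Callable[[str], str],
) -> Dict[str, str]:
    """Extract Japanese text from a screenshot, then translate it."""
    japanese_text = ocr(image_path).strip()
    if not japanese_text:
        # Nothing was recognised, so skip the translation request entirely.
        return {"original": "", "translation": ""}
    return {"original": japanese_text, "translation": translate(japanese_text)}


# Hypothetical stand-ins so the sketch runs offline.
def fake_ocr(image_path: str) -> str:
    return "こんにちは"


def fake_translate(text: str) -> str:
    return f"[translated] {text}"


result = translate_screenshot("capture.png", fake_ocr, fake_translate)
print(result)
```

Passing the OCR and translation steps in as callables keeps the flow testable without downloading the OCR model or holding an API key.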


The image capture shortcut and the base prompt used for translation with GPT are both configurable. The program starts a FastAPI-based server that processes the images and communicates with OpenAI. The graphical interface is built with a combination of ElectronJS and ReactJS.
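
As a rough illustration of how a configurable shortcut and base prompt could be stored and applied, consider the sketch below. Everything in it is an assumption made for the example: the Settings class, its default values, the {text} placeholder and the JSON storage format are hypothetical and are not taken from the project.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Settings:
    # Hypothetical defaults; the real program may use different values.
    capture_shortcut: str = "Ctrl+Shift+T"
    base_prompt: str = "Translate this Japanese text to English: {text}"


def build_prompt(settings: Settings, japanese_text: str) -> str:
    """Insert the OCR result into the user-editable base prompt."""
    # str.replace is used instead of str.format so that stray braces
    # in a user-edited prompt do not raise an error.
    return settings.base_prompt.replace("{text}", japanese_text)


def to_json(settings: Settings) -> str:
    """Serialise the settings, keeping Japanese characters readable."""
    return json.dumps(asdict(settings), ensure_ascii=False)


def from_json(raw: str) -> Settings:
    """Restore settings previously written by to_json."""
    return Settings(**json.loads(raw))


custom = Settings(base_prompt="Translate to Spanish: {text}")
print(build_prompt(custom, "猫"))
```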

A compiled .exe version (built with PyInstaller and electron-packager) is provided for quick installation. The installer is approximately 220 MB. After the program is installed, it automatically downloads the model used by Manga-OCR for character detection. This model is about 450 MB.

Download executable:

Currently, the program only works on Windows.

Install and execute from code:

  1. Download the repo.
  2. Install the server (backend) dependencies using: poetry shell and then poetry install
  3. Install the graphical interface dependencies (frontend) with: npm install
  4. Start the project in development mode with: poetry shell and then npm run electron-dev

Build executables

This process is not yet fully automated. First, we need to compile the server using PyInstaller. After that, we compile ElectronJS using electron-packager. Finally, we combine the results of these two processes in a final folder that can be compressed into an installer.

  1. Run PyInstaller inside the backend directory: cd backend then poetry shell and then pyinstaller mangaOcrApi.spec. This should generate a folder named "dist" with the compiled server.
  2. Build the ReactJS bundle with: npm run electron-build and then compile the ElectronJS application with: npm run package. This should generate a folder named "release-builds" with the compiled frontend.
  3. Go to "release-builds/VGT-win32-x64" and create a directory named "backend". Copy the folder located at "backend/dist" inside this directory. The "release-builds" folder should look like this:
    --release-builds
    ----backend
    ------mangaOcrApi
  4. We can compress this folder into an installer using WinRAR.
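
Since the process is not fully automated, the manual copy in step 3 could be scripted. The sketch below is one possible approach, not part of the project: the function name assemble_release is hypothetical, and it assumes that the items inside "backend/dist" (such as the "mangaOcrApi" folder) are what belongs inside the new "backend" directory.

```python
import shutil
from pathlib import Path


def assemble_release(project_root: Path) -> Path:
    """Copy the compiled server next to the packaged ElectronJS app."""
    dist_dir = project_root / "backend" / "dist"
    app_dir = project_root / "release-builds" / "VGT-win32-x64"
    if not dist_dir.is_dir():
        raise FileNotFoundError(f"Run PyInstaller first: {dist_dir} is missing")
    if not app_dir.is_dir():
        raise FileNotFoundError(f"Run electron-packager first: {app_dir} is missing")

    target = app_dir / "backend"
    target.mkdir(exist_ok=True)
    # Assumption: everything inside dist/ is copied into backend/.
    for item in dist_dir.iterdir():
        destination = target / item.name
        if item.is_dir():
            shutil.copytree(item, destination, dirs_exist_ok=True)
        else:
            shutil.copy2(item, destination)
    return target
```

It would be called from the repository root after steps 1 and 2, for example assemble_release(Path(".")). The dirs_exist_ok argument requires Python 3.8 or newer.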

Limitations:

The system is divided into 2 parts:

  1. A graphical interface based on ElectronJS and ReactJS (JavaScript). This layer handles interaction with the user and captures images from the screen.

  2. A local server based on FastAPI (Python). This performs image analysis for character extraction (OCR) and communication with online services such as the GPT API.

When ElectronJS is started, it takes care of running the server as a child process.
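
The parent/child arrangement can be illustrated with a short sketch. Note that the actual program does this from ElectronJS (JavaScript); the Python version below only shows the general pattern of a parent starting a server process and stopping it on exit. The helper names start_server and stop_server and the stand-in command are hypothetical.

```python
import subprocess
import sys
from typing import List, Optional


def start_server(command: List[str]) -> subprocess.Popen:
    """Launch the server as a child of the current process."""
    return subprocess.Popen(command)


def stop_server(process: subprocess.Popen, timeout: float = 5.0) -> Optional[int]:
    """Ask the child to exit, and force it if it does not comply in time."""
    if process.poll() is None:
        process.terminate()
        try:
            process.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            process.kill()
            process.wait()
    return process.returncode


# Stand-in for the real server command: a child that just sleeps.
child = start_server([sys.executable, "-c", "import time; time.sleep(60)"])
still_running = child.poll() is None
exit_code = stop_server(child)
```

Stopping the child explicitly when the parent shuts down avoids leaving an orphaned server process holding its port.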

To-do list:

Future experiments:

Based on boilerplate:
Media:
