OpenPecha / Glyph-Cropping-AI-Automation

MIT License

OCR0043 Glyph Cropping Task AI Model #1

Open Norbu-Jamling opened 3 months ago

Norbu-Jamling commented 3 months ago

Description: This card describes in detail the U-Net model we built for the glyph cropping task.

Background: Our OCR team is currently experimenting with scripture font creation. For this we run Google OCR (OCR0021) on images of Tibetan publications such as Derge, Peking, etc. For each target character we get an image that contains the glyph along with extra letters on the side. We give annotators these images to:

  1. crop out the target glyph
  2. draw the baseline.

Image

Since this job follows a specific image segmentation pattern, AI models can perform well on it even with very little data.

Diagram: The diagram below shows the process in its simplest form. It covers all the stages involved: dataset creation, training, and inference.
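To make the segmentation framing concrete, here is a minimal sketch of how a predicted mask could be turned into a crop box. It assumes the model outputs a per-pixel mask for the target glyph (the thread does not state the output format), and `mask_to_bbox` and its `threshold` parameter are hypothetical names used only for illustration.

```python
def mask_to_bbox(mask, threshold=0.5):
    """Return (left, top, right, bottom) around pixels above `threshold`.

    `mask` is a list of rows of floats in [0, 1] (assumed model output).
    Right and bottom are exclusive, matching the box convention of
    Pillow's Image.crop. Returns None if no pixel passes the threshold.
    """
    rows = [y for y, row in enumerate(mask) if any(v > threshold for v in row)]
    if not rows:
        return None
    width = len(mask[0])
    cols = [x for x in range(width) if any(row[x] > threshold for row in mask)]
    return (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)


# Toy 5x6 mask with three "glyph" pixels.
mask = [[0.0] * 6 for _ in range(5)]
mask[1][2] = 1.0
mask[2][3] = 0.7
mask[3][4] = 0.9
print(mask_to_bbox(mask))  # (2, 1, 5, 4)
```

The returned box could then be passed to a crop call on the original image; how the baseline is derived is not covered by this sketch.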

Image

Model: We use a simple conditional U-Net model for this task. The model has 31 million parameters.

Case Study: For demonstration, we trained this model on the Derge dataset. We have a total of 23,000 Derge glyph images that have been cropped by annotators. Training: 3k images only, 10 epochs (10 minutes).

We achieved satisfactory results on visual inspection of the model's predictions on the test data.

Example outputs:
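As a rough sanity check on the 31 million figure, the sketch below counts the weights of a standard U-Net layout. Everything about the layout is an assumption, since the thread does not give the architecture: channel widths of 64 to 1024, two 3x3 convolutions with bias per stage, 2x2 transposed convolutions for upsampling, no normalization layers, and the condition image concatenated to the input as a second channel.

```python
def conv_params(c_in, c_out, kernel):
    """Weights plus biases of one (transposed) convolution layer."""
    return kernel * kernel * c_in * c_out + c_out


def double_conv_params(c_in, c_out):
    """Two stacked 3x3 convolutions, as in the classic U-Net block."""
    return conv_params(c_in, c_out, 3) + conv_params(c_out, c_out, 3)


def unet_params(in_channels=2, out_channels=1,
                widths=(64, 128, 256, 512, 1024)):
    """Parameter count for an assumed U-Net whose widths double per level."""
    # Encoder: input block, then one double conv per downsampling level.
    total = double_conv_params(in_channels, widths[0])
    for c_in, c_out in zip(widths, widths[1:]):
        total += double_conv_params(c_in, c_out)
    # Decoder: 2x2 transposed conv halves the channels, then the skip
    # connection is concatenated (c_out + c_out = c_in) before a double conv.
    rev = widths[::-1]
    for c_in, c_out in zip(rev, rev[1:]):
        total += conv_params(c_in, c_out, 2)
        total += double_conv_params(c_in, c_out)
    # Final 1x1 convolution to the output mask.
    total += conv_params(widths[0], out_channels, 1)
    return total


print(f"{unet_params():,}")  # 31,031,169
```

Under these assumptions the count lands at about 31 million, which is consistent with the number quoted above, though the actual model may differ in its details.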

Image Image

Potential: From now on, glyph cropping can easily be automated with this accurate and fast AI model, saving the company's resources.

New pipeline will look like:

Norbu-Jamling commented 3 months ago

The conditional images are rendered in the Tibetan Monlam font using Raqm with Pillow, so that glyph stacking is done correctly.
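A minimal sketch of how such a condition image might be rendered is shown below. The function name `render_condition_image`, the `font_path` argument, the canvas size, and the font size are all assumptions for illustration; the thread does not give the actual rendering code. It relies on Pillow being built with libraqm, and on a Pillow version recent enough to provide `ImageFont.Layout`.

```python
from PIL import Image, ImageDraw, ImageFont, features


def render_condition_image(text, font_path, canvas=(128, 128), font_size=64):
    """Render `text` centered in black on a white grayscale canvas.

    Raqm layout is requested explicitly so that Tibetan stacked glyphs
    are shaped correctly; basic layout would not apply the stacking rules.
    """
    if not features.check("raqm"):
        raise RuntimeError("Pillow was built without libraqm support")
    font = ImageFont.truetype(
        font_path, font_size, layout_engine=ImageFont.Layout.RAQM
    )
    img = Image.new("L", canvas, color=255)
    draw = ImageDraw.Draw(img)
    # Measure the shaped text, then offset so its bounding box is centered.
    left, top, right, bottom = draw.textbbox((0, 0), text, font=font)
    x = (canvas[0] - (right - left)) // 2 - left
    y = (canvas[1] - (bottom - top)) // 2 - top
    draw.text((x, y), text, font=font, fill=0)
    return img


# Hypothetical usage; the font path is a placeholder, not a real location:
# img = render_condition_image("སྒྲ", "path/to/monlam.ttf")
```

Checking `features.check("raqm")` up front makes a missing libraqm fail loudly instead of silently producing unstacked glyphs.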

kaldan007 commented 3 months ago

Will be cleaning up the code; it then needs to be reviewed by @TenzinGayche.