OpenPecha / Glyph-Baseline-Marking-AI-Model

MIT License

OCR0044 Glyph Baseline Marking AI Model #1

Open Norbu-Jamling opened 1 month ago

Norbu-Jamling commented 1 month ago

Description: This card describes in detail the U-Net model we built to mark glyph baselines.

Background: Our OCR team is currently experimenting with scripture font creation. For this we run Google OCR (OCR0021) on images of Tibetan publications such as Derge, Peking, etc. For each target character we get an image of that character along with extra letters at the sides. We give these images to annotators, who:

  1. crop out the target glyph
  2. draw the baseline.

Image

Since this job follows a specific pattern (drawing a box on the baseline), AI models can perform well on it even with very little data.

Diagram: The diagram below explains the process in its simplest form. It shows all the steps involved: dataset creation, training, and inference.

Image Image

Model: We use a simple conditional U-Net model for this task. The model has 31 million parameters.

Case Study: For demonstration we trained this model on the Derge dataset. We have a total of 23,000 Derge glyph images whose baselines have been marked by annotators. Training on only a 3k-image subset for just 50 epochs (10 minutes),

we achieve almost 100% accuracy, judged by visual inspection of the model's predictions on the test data.
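The accuracy above was judged visually. As a possible complement, here is a minimal sketch of how predictions could be scored numerically, assuming both the annotated and the predicted baselines are available as axis-aligned boxes `(x0, y0, x1, y1)`. The function names, the box format, and the 0.8 threshold are illustrative assumptions, not part of the project.

```python
def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(ix1 - ix0, 0) * max(iy1 - iy0, 0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def accuracy_at_iou(pairs, threshold=0.8):
    """Fraction of (annotated, predicted) pairs whose IoU reaches `threshold`.

    The threshold is an arbitrary illustrative choice.
    """
    hits = sum(box_iou(gt, pred) >= threshold for gt, pred in pairs)
    return hits / len(pairs)


# Hypothetical example boxes, not real Derge data.
pairs = [
    ((10, 40, 60, 44), (10, 40, 60, 44)),  # prediction matches the annotation
    ((10, 40, 60, 44), (10, 42, 60, 46)),  # prediction shifted down by half its height
]
score = accuracy_at_iou(pairs)
```

A thin baseline box is sensitive to small vertical shifts, so a score like this may look stricter than visual inspection suggests.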

Example outputs:

Image Image

Potential: Glyph baseline marking can now be largely automated with this fast and highly accurate AI model, saving the company's resources.

New pipeline will look like:

Norbu-Jamling commented 1 month ago

For the Derge dataset, the baseline is stored in a JSON file as the vertices of the rectangle that represents it. When data is in this form, a few extra steps are needed, since the model works only with images, not with vertex information:

  1. Use the vertices to draw a red rectangle over the glyph image.
  2. Train or run the AI model entirely on these images.
  3. At inference, the output is also an image, with a red region representing the baseline.
  4. To get the baseline back as vertices in our JSON, run a script that approximates the red region with a rectangle and returns its vertex coordinates.
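
Steps 1 and 4 can be sketched as a round trip between vertices and a red-marked image. This is only an illustration under stated assumptions: the image is a plain grid of `(R, G, B)` tuples, the baseline is four `[x, y]` vertices of an axis-aligned rectangle, and the names `draw_baseline`, `is_red` and `extract_baseline` are hypothetical. The project's actual JSON schema and scripts are not shown in this thread, and a real pipeline would likely use an image library rather than nested lists.

```python
# Assumption: image = list of rows, each row a list of (R, G, B) tuples.
RED = (255, 0, 0)


def draw_baseline(image, vertices):
    """Step 1: paint the rectangle spanned by `vertices` red on a copy of `image`."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    out = [row[:] for row in image]  # copy rows so the input stays untouched
    for y in range(max(min(ys), 0), min(max(ys), len(out) - 1) + 1):
        for x in range(max(min(xs), 0), min(max(xs), len(out[0]) - 1) + 1):
            out[y][x] = RED
    return out


def is_red(pixel, threshold=128):
    """Count a pixel as red when R is high and G, B are low (tolerates soft edges)."""
    r, g, b = pixel
    return r >= threshold and g < threshold and b < threshold


def extract_baseline(image):
    """Step 4: approximate the red region by its bounding rectangle."""
    points = [(x, y) for y, row in enumerate(image)
              for x, pixel in enumerate(row) if is_red(pixel)]
    if not points:
        return None  # no red region found in the output image
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    # Vertices in clockwise order, starting from the top-left corner.
    return [[x0, y0], [x1, y0], [x1, y1], [x0, y1]]


# Round trip on a blank white image, 12 pixels wide and 8 pixels high.
blank = [[(255, 255, 255)] * 12 for _ in range(8)]
annotated = [[2, 5], [9, 5], [9, 6], [2, 6]]
marked = draw_baseline(blank, annotated)
recovered = extract_baseline(marked)
```

A bounding rectangle is the simplest approximation. If the predicted red region is noisy or tilted, a more robust fit (for example, keeping only the largest connected red region first) may be needed.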