OCR0043 Glyph Cropping Task AI Model

Norbu-Jamling commented 3 months ago

Description: This card describes in detail the U-Net model we built for the glyph cropping task.

Background: Our OCR team is currently experimenting with scripture font creation. For this we run Google OCR (OCR0021) on images of Tibetan publications such as Derge, Peking, etc. For each target character we get an image of it along with extra letters on the sides. We give these images to annotators to:

crop out the target glyph
draw the baseline.

Since this job follows a specific image segmentation pattern, AI models can perform well on it even with very little data.
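To make the segmentation framing concrete, here is a minimal sketch of how an annotator's crop could be turned into a binary mask for training, and how a predicted mask could be turned back into a crop box. The box convention (left, top, right, bottom, with right and bottom exclusive) and the function names are assumptions for illustration, not taken from the project code.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive (assumed convention)


def box_to_mask(width: int, height: int, box: Box) -> List[List[int]]:
    """Build a height x width mask with 1 inside the crop box and 0 elsewhere."""
    left, top, right, bottom = box
    return [
        [1 if (left <= x < right and top <= y < bottom) else 0 for x in range(width)]
        for y in range(height)
    ]


def mask_to_box(mask: List[List[int]]) -> Optional[Box]:
    """Return the tight bounding box around all non-zero pixels, or None if the mask is empty."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not rows or not cols:
        return None
    return (min(cols), min(rows), max(cols) + 1, max(rows) + 1)
```

For a rectangular crop the two functions are inverses of each other, so a model that predicts the mask well also recovers the annotator's crop box.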

Diagram: The diagram below explains the process in its simplest form. It shows all the stages involved: dataset creation, training, and inference.

Model: We use a simple conditional U-Net model for this task. The model has 31 million parameters.
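The card does not list the layer layout, so the following is only a sanity check under assumptions: the classic U-Net channel widths (64 up to 1024), 3x3 convolutions with bias, 2x2 transposed convolutions for upsampling, no normalization layers, and the condition image stacked with the input crop as a second channel. Under those assumptions the count lands at roughly 31 million, which is consistent with the figure quoted above.

```python
def conv_params(c_in: int, c_out: int, kernel: int) -> int:
    """Weights plus biases of one square-kernel convolution layer."""
    return kernel * kernel * c_in * c_out + c_out


def double_conv(c_in: int, c_out: int) -> int:
    """Two stacked 3x3 convolutions, as in each U-Net stage."""
    return conv_params(c_in, c_out, 3) + conv_params(c_out, c_out, 3)


def unet_param_count(in_channels: int, out_channels: int,
                     widths=(64, 128, 256, 512, 1024)) -> int:
    """Parameter count of a plain U-Net (assumed layout, not the project's exact model)."""
    # Encoder: first stage, then one double conv per downsampling step.
    total = double_conv(in_channels, widths[0])
    for c_in, c_out in zip(widths, widths[1:]):
        total += double_conv(c_in, c_out)
    # Decoder: transposed conv halves the channels; concatenating the skip
    # connection restores c_in channels (widths double at every stage).
    for c_in, c_out in zip(widths[::-1], widths[-2::-1]):
        total += conv_params(c_in, c_out, 2)
        total += double_conv(c_in, c_out)
    # Final 1x1 convolution to the output mask channels.
    total += conv_params(widths[0], out_channels, 1)
    return total


if __name__ == "__main__":
    # in_channels=2 assumes glyph crop + condition image; out_channels=1 assumes one mask.
    print(f"{unet_param_count(2, 1):,} parameters")
```

The first layer contributes only a few hundred parameters per input channel, so the total stays near 31 million regardless of how the condition is fed in.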

Case Study: For demonstration we trained this model on the Derge dataset. We have a total of 23,000 Derge glyph images that have been cropped by annotators. Training on: a 3k-image dataset for only 10 epochs (10 minutes).

We achieve satisfactory results on visual inspection of the model's predictions on the test data.
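The results here were judged by eye only. If a number is wanted alongside visual inspection, one common option (an assumption on our part, not something measured for this card) is the intersection over union between the predicted crop box and the annotator's crop box. A minimal sketch, using (left, top, right, bottom) boxes:

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), assumed convention


def box_area(box: Box) -> int:
    """Area of a box; degenerate boxes count as zero."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def box_iou(predicted: Box, annotated: Box) -> float:
    """Intersection over union of two boxes, in the range 0.0 to 1.0."""
    inter_w = max(0, min(predicted[2], annotated[2]) - max(predicted[0], annotated[0]))
    inter_h = max(0, min(predicted[3], annotated[3]) - max(predicted[1], annotated[1]))
    intersection = inter_w * inter_h
    union = box_area(predicted) + box_area(annotated) - intersection
    return intersection / union if union else 0.0
```

A value of 1.0 means the predicted crop matches the annotation exactly; averaging it over the test set would give a single score to track between training runs.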

Example outputs:

Potential: From now on, glyph cropping can be largely automated with this fast and accurate AI model, which saves the company's resources.

The new pipeline will look like:

Collect 1-2k samples of the annotators' work
Train the model on this data
Use this model to crop glyphs for the rest of the dataset
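The three steps above can be sketched as a small driver function. Everything here is hypothetical scaffolding: `annotate`, `train`, and the returned model are placeholder callables standing in for the annotation tool, the U-Net training script, and inference, and the default of 1500 annotated images is just a value inside the 1-2k range.

```python
import random
from typing import Callable, Dict, Hashable, Iterable, Tuple

Box = Tuple[int, int, int, int]
Model = Callable[[Hashable], Box]


def run_pipeline(
    image_ids: Iterable[Hashable],
    annotate: Callable[[Hashable], Box],        # placeholder for human annotation
    train: Callable[[Dict[Hashable, Box]], Model],  # placeholder for model training
    n_annotated: int = 1500,
    seed: int = 0,
) -> Tuple[Dict[Hashable, Box], Dict[Hashable, Box]]:
    """Annotate a small random subset, train on it, and crop the rest automatically."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split reproducible
    annotated_ids, remaining_ids = ids[:n_annotated], ids[n_annotated:]

    # Step 1: collect annotators' crops for the small subset.
    labels = {image_id: annotate(image_id) for image_id in annotated_ids}
    # Step 2: train the model on that subset.
    model = train(labels)
    # Step 3: let the model crop everything that was not annotated.
    predictions = {image_id: model(image_id) for image_id in remaining_ids}
    return labels, predictions
```

Keeping the manual and automatic crops in separate dictionaries makes it easy to spot-check the model's output against fresh annotations later.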

Norbu-Jamling commented 3 months ago

The conditional images are rendered in the Tibetan Monlam font using Pillow with raqm, so that glyph stacking is done correctly.
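A minimal sketch of how such a conditional image could be rendered, assuming a Pillow build with raqm support (Pillow 9.1 or newer for `ImageFont.Layout.RAQM`). The font path, canvas size, and font size are placeholders, and the function names are hypothetical rather than the ones in the repository.

```python
from typing import Tuple


def center_offset(canvas_size: Tuple[int, int],
                  bbox: Tuple[int, int, int, int]) -> Tuple[int, int]:
    """Drawing position that centers a text bbox (measured at origin 0,0) on the canvas."""
    canvas_w, canvas_h = canvas_size
    left, top, right, bottom = bbox
    x = (canvas_w - (right - left)) // 2 - left
    y = (canvas_h - (bottom - top)) // 2 - top
    return x, y


def render_conditional_image(text: str, font_path: str,
                             canvas_size: Tuple[int, int] = (128, 128),
                             font_size: int = 64):
    """Render one glyph stack as black text on a white grayscale canvas."""
    # Imported here so the pure helper above can be used without Pillow installed.
    from PIL import Image, ImageDraw, ImageFont, features

    if not features.check("raqm"):
        # Without raqm, Pillow falls back to basic layout and Tibetan stacks render wrongly.
        raise RuntimeError("This Pillow build has no raqm support")

    font = ImageFont.truetype(font_path, font_size,
                              layout_engine=ImageFont.Layout.RAQM)
    image = Image.new("L", canvas_size, color=255)
    draw = ImageDraw.Draw(image)
    bbox = draw.textbbox((0, 0), text, font=font)
    draw.text(center_offset(canvas_size, bbox), text, font=font, fill=0)
    return image
```

Usage would look like `render_conditional_image("སྒྲ", "path/to/monlam.ttf")`, where the path is a placeholder for wherever the Monlam font file lives.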

kaldan007 commented 3 months ago

I will be cleaning up the code, and it needs to be reviewed by @TenzinGayche.

OpenPecha / Glyph-Cropping-AI-Automation
