Belval / TextRecognitionDataGenerator

A synthetic data generator for text recognition
MIT License
3.15k stars 943 forks

Add Farsi Numbers #322

Open pourmand1376 opened 9 months ago

pourmand1376 commented 9 months ago

Currently, Farsi (Persian) digits are not recognized by OCR.

Here is an attempt to make them recognizable.

The data is generated with this code:

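import random

# Persian (Farsi) digits 0-9, code points U+06F0 to U+06F9
persian_digits = '۰۱۲۳۴۵۶۷۸۹'

nums = []
for _ in range(1000):
    # Each sample is a string of 1 to 10 randomly chosen Persian digits
    digit_count = random.randint(1, 10)
    num = ''.join(random.choice(persian_digits) for _ in range(digit_count))
    nums.append(num)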
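
For anyone who wants to reproduce this, here is a minimal sketch of the same idea wrapped in functions, with an optional seed so the output is repeatable and a helper that saves the strings one per line as UTF-8. The names `make_persian_numbers` and `write_lines`, and the seed parameter, are assumptions added for illustration; they are not part of this PR or of the generator's API.

```python
import random
from pathlib import Path

# Persian (Farsi) digits 0-9, code points U+06F0 to U+06F9
PERSIAN_DIGITS = "۰۱۲۳۴۵۶۷۸۹"


def make_persian_numbers(count=1000, max_len=10, seed=None):
    """Return `count` strings of 1..max_len random Persian digits.

    Hypothetical helper: `seed` is an added convenience for reproducibility.
    """
    rng = random.Random(seed)
    numbers = []
    for _ in range(count):
        length = rng.randint(1, max_len)  # inclusive on both ends
        numbers.append("".join(rng.choice(PERSIAN_DIGITS) for _ in range(length)))
    return numbers


def write_lines(lines, path):
    """Write one string per line, encoded as UTF-8 (hypothetical helper)."""
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Calling `write_lines(make_persian_numbers(seed=0), "persian_numbers.txt")` would produce a text file with one number per line; the file name is only an example.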