Belval / TextRecognitionDataGenerator

A synthetic data generator for text recognition
MIT License
3.24k stars 966 forks

Handwritten Text Too Small/Skewed #235

Open abdksyed opened 2 years ago

abdksyed commented 2 years ago

Hi,

I am trying to create handwritten text.

But the generated images are far too small. The same command without -hw produces printed-text images that look fine.

# For Handwritten
trdg --input_file my_dict.txt --count 3 --name_format 2 -hw

# For Printed
trdg --input_file my_dict.txt --count 3 --name_format 2

Printed Text:

image image image

Handwritten text for the same data:

image image image

I tried to increase the size by passing --format 64 and --width 128 for handwritten text, but then the text comes out curly:

image image image

dhea1323 commented 2 years ago

I have the same issue. Any updates? @Belval

riteshKumarUMass commented 1 year ago

The issue is in how the image width is derived in data_generator.py. I commented out the new width derivation and replaced it with the width of the image we get after applying distortion. This resolved the issue for me. Here is the code snippet:

        ##################################
        # Resize image to desired format #
        ##################################

        # Horizontal text
        if orientation == 0:
            # new_width = int(
            #     distorted_img.size[0]
            #     * (float(size - vertical_margin) / float(distorted_img.size[1]))
            # )
            new_width = distorted_img.size[0]
            resized_img = distorted_img.resize(
                (new_width, size - vertical_margin), Image.Resampling.LANCZOS
            )
            resized_mask = distorted_mask.resize(
                (new_width, size - vertical_margin), Image.Resampling.NEAREST
            )
            background_width = width if width > 0 else new_width + horizontal_margin
            background_height = size
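
To make the difference between the two width derivations concrete, here is a small arithmetic sketch. It is not code from the repository: `derive_width`, its parameters, and the sample numbers are hypothetical and only mirror the formula in the snippet above. It assumes a distorted image of 512x256 pixels, a target size of 40, and a vertical margin of 8.

```python
def derive_width(src_width, src_height, size, vertical_margin,
                 keep_source_width=False):
    """Return the width used when resizing a horizontal text image.

    Hypothetical helper that mirrors the arithmetic in the snippet above.
    """
    target_height = size - vertical_margin
    if keep_source_width:
        # Workaround from this thread: reuse the distorted image's width.
        return src_width
    # Commented-out derivation: scale the width by the same factor as the
    # height, which preserves the aspect ratio.
    return int(src_width * (float(target_height) / float(src_height)))


# Made-up example values: a 512x256 distorted image, size 40, margin 8.
scaled = derive_width(512, 256, size=40, vertical_margin=8)
kept = derive_width(512, 256, size=40, vertical_margin=8,
                    keep_source_width=True)
print(scaled, kept)
```

With these numbers the aspect-preserving formula shrinks the width by the same factor as the height, while the workaround leaves the width unchanged. Keeping the source width while the height changes no longer preserves the aspect ratio, so it may be worth checking the output visually for your fonts and sizes.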

cramraj8 commented 1 year ago

@riteshKumarUMass, thank you. This really helps!