Problem with truncation of text generated on images.

Hello,

I am having trouble generating images correctly from a word list (dictionary file) and a folder of fonts in TTF and OTF format. As the examples show, the text is cut off in every generated image: roughly the lower half of the text is missing, although the cut is higher in some images and lower in others. The problem occurs when I use trdg from the command line:

`trdg -c 100 --input_file /training/polish_diacritics_only.txt --image_dir /training/custom_backgrounds/ --font_dir /training/fonts -t 3 -l en --output_dir /training/data_synthetic --format 64 --random_skew --blur 1 --random_blur --background 3 --skew_angle 5 --margins 20 --fit`

and also when I use a Python script to generate the images:

```python
import os
import json
from trdg.generators import GeneratorFromStrings

custom_fonts_dir = '/training/fonts'
backgrounds_dir = '/training/custom_backgrounds'
output_dir = '/training/out_data/'

# Collect every TTF and OTF font in the fonts folder.
custom_fonts = [
    os.path.join(custom_fonts_dir, file)
    for file in os.listdir(custom_fonts_dir)
    if file.endswith('.ttf') or file.endswith('.otf')
]

generator = GeneratorFromStrings(
    strings=["żółw", "ćma", "źdźbło", "ślimak"],
    count=1000,
    fonts=custom_fonts,
    language="pl",
    size=48,
    skewing_angle=5,
    random_skew=True,
    blur=1,
    random_blur=True,
    background_type=3,
    image_dir=backgrounds_dir,
    text_color="#000000",
    space_width=1.0,
    character_spacing=2,
    stroke_width=2,
)

# Save each generated image as <label>_<index>.png in the output folder.
for i, (image, label) in enumerate(generator):
    if image is not None:
        image.save(f"{output_dir}/{label}_{i}.png")
```
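One way to check whether particular fonts are responsible would be to run the generator with a single font at a time and save each font's output to its own folder. The following is only a minimal sketch of the bookkeeping for that: `list_fonts` and `per_font_output_dirs` are hypothetical helper names, not part of trdg, and the extension matching is made case-insensitive as an assumption about how the font files may be named.

```python
from pathlib import Path

# Font extensions used in this report; matched case-insensitively (an assumption).
FONT_EXTENSIONS = {".ttf", ".otf"}


def list_fonts(font_dir):
    """Return the sorted paths of all .ttf/.otf files directly inside font_dir."""
    return sorted(
        str(path)
        for path in Path(font_dir).iterdir()
        if path.is_file() and path.suffix.lower() in FONT_EXTENSIONS
    )


def per_font_output_dirs(font_paths, output_root):
    """Map each font path to an output sub-folder named after the font file."""
    return {font: str(Path(output_root) / Path(font).stem) for font in font_paths}
```

Each font could then be passed on its own as `fonts=[font]` to `GeneratorFromStrings`, with the images saved into the matching sub-folder, so that any fonts producing cut-off text stand out.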

I've tried different approaches (plain image generation without extra options such as background, blur or text stroke, as well as different parameter values), and the generated images were wrong every time. The problem appears both for words containing Polish diacritics and for words without them. What could be causing the problem?

[Attached example images: abadańczyka_21, abadańczyki_24, abadańską_33]
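Since trdg draws the text with Pillow, the installed versions of both packages may be relevant to the problem. A minimal sketch for reporting them, using only the standard library; the distribution names `trdg` and `Pillow` are assumed to match how the packages were installed, and `installed_versions` is a hypothetical helper name.

```python
import platform
from importlib import metadata


def installed_versions(packages=("trdg", "Pillow")):
    """Return {package name: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Print the interpreter and package versions to include in the report.
    print("Python", platform.python_version())
    for name, version in installed_versions().items():
        print(name, version or "not installed")
```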

I would be grateful for any suggestions.

Thanks,
Michal

Belval / TextRecognitionDataGenerator, issue #348