khawar-islam opened this issue 2 years ago
Hi,
This engine often generates poor-quality images. There are three problems with this engine. Here are some ways to solve them.
Currently, this engine converts images to grayscale and checks the color difference at the border between the text and the background. If the color difference does not exceed a given threshold, the image is skipped.
If you adjust the threshold, it can generate better images.
Edit the loDiff and upDiff parameter values in the floodFill function from 16, 16 to 0, 16 or 0, 32 in this code.
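To make the idea concrete, here is a minimal sketch (not taken from the repository; the function name, seed point, and default values are my own illustrative assumptions) of a flood-fill-based background check where loDiff/upDiff can be tightened as described above:

```python
import cv2
import numpy as np

def flood_fill_background(gray, seed=(0, 0), lo_diff=0, up_diff=16):
    """Flood-fill the grayscale render from an assumed background seed pixel.

    loDiff/upDiff control how similar a neighbouring pixel must be to the
    already-filled region to get filled as well; tightening them (e.g. from
    16, 16 to 0, 16 or 0, 32) changes how strictly the text region is
    separated from the background.
    """
    h, w = gray.shape
    # floodFill requires a mask 2 pixels larger than the image.
    mask = np.zeros((h + 2, w + 2), dtype=np.uint8)
    cv2.floodFill(gray.copy(), mask, seed, 255, loDiff=lo_diff, upDiff=up_diff)
    # Pixels left unfilled are the ones that differ from the background.
    return mask[1:-1, 1:-1] == 0
```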
The first solution has a limitation: currently, the engine does not consider the color difference between the text and the text effect (ex2 "그릇"). You can solve this by adding logic that checks the color difference between the text and the text effect.
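One possible way to implement that check (a rough sketch of my own, not code from this repository; the RGBA layer inputs and the threshold of 16 are assumptions) is to compare the mean grayscale intensity of the text layer with that of the effect layer:

```python
import numpy as np

def check_text_effect_contrast(text_layer, effect_layer, threshold=16):
    """Return True if the text and its effect (border/shadow/extrusion)
    differ enough in mean grayscale intensity to stay distinguishable.

    text_layer / effect_layer: RGBA uint8 arrays; threshold is an assumed value.
    """
    def mean_gray(layer):
        visible = layer[..., 3] > 0
        gray = layer[..., :3].mean(axis=-1)
        return float(gray[visible].mean()) if visible.any() else 0.0

    return abs(mean_gray(text_layer) - mean_gray(effect_layer)) >= threshold
```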
Currently, this engine does not take post-processing into account (ex1 "가?", ex3 "긍정적"). The text can become invisible after post-processing such as blur, even if the color difference between text and background is large. If you add color-difference checking logic after post-processing, it can generate better images. Add the following code in this code.
```python
if not _check_visibility(image, fg_image[..., 3]):
    raise RuntimeError("Text is not visible")
```
I have introduced some solutions, but they have not been tested; the results may be worse than expected.
Thanks.
@moonbings I was facing a similar issue. Solution 1 you provided has helped to some extent. I also tried solution 3, but to no avail. About 1 in 100 images is distorted, and this will affect the training data for my OCR application. Can you suggest a workaround? Edit: Here is an example: https://user-images.githubusercontent.com/39117677/222642021-e3d40153-ce93-4455-9422-881d91c478ea.jpg
The word is: દીધું!
@khawar-islam were you able to solve this issue?
@aadit2697 No, I am still facing the same issue, and of course it affects training.
@khawar-islam you want to play around with the style and post-process sections in config.yaml. These numbers worked for me. Hope this helps! Do let us know if this works out for you :)
```yaml
style:
  prob: 0.25
  args:
    weights: [1, 2, 2]
    args:
      - size: [1, 2]
        alpha: [1, 1]
        grayscale: 0
      # text shadow
      - distance: [1, 2]
        angle: [0, 0]
        alpha: [0.3, 0.7]
        grayscale: 0
      # text extrusion
      - length: [1, 2]
        angle: [0, 360]
        alpha: [1, 1]
        grayscale: 0

postprocess:
  args:
    - prob: 0.0
      args:
        scale: [4, 8]
        per_channel: 0
    # gaussian blur
    - prob: 0.0
      args:
        sigma: [0, 2]
    # resample
    - prob: 0.0
      args:
        size: [0.4, 0.4]
    # median blur
    - prob: 0.0
      args:
        k: [1, 1]
```
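If I read the config correctly, each prob under postprocess is the probability of applying that effect, so setting them to 0.0 turns the noise, blur, resample, and median-blur steps off entirely, while style: prob: 0.25 applies a border, shadow, or extrusion to roughly a quarter of the generated images.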
@aadit2697 thank you!
What do you think about the two parameters min_length: 6 and max_length: 25? I already modified them, but sometimes images are still distorted. To generate a cleaner dataset, would reducing max_length be a good idea? What is your opinion?
> @aadit2697 thank you!
> What do you think about the two parameters min_length: 6 and max_length: 25? I already modified them, but sometimes images are still distorted. To generate a cleaner dataset, would reducing max_length be a good idea? What is your opinion?
You could try it out. Personally, for my use case, I used the default values for max and min length. But if your words are shorter than 25 characters, you could give it a shot.
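For anyone else hunting for these parameters: as far as I can tell, min_length and max_length sit under the corpus section of config.yaml. A rough sketch of where they go (the corpus path and weights here are placeholders, not values from this thread):

```yaml
corpus:
  args:
    - paths: [resources/corpus/mjsynth.txt]  # placeholder corpus file
      weights: [1]
      min_length: 6   # shortest word to sample
      max_length: 25  # longest word to sample
```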
Hello @moonbings, during synthetic dataset generation some images are very distorted and I have no idea how to fix it. I played with some parameters, but it didn't work for me. Any solution?
Sample images are not clear.