khawar-islam opened this issue 2 years ago
Hi,
This engine often generates poor-quality images. There are three problems with this engine. Here are some ways to solve them.
Currently, this engine converts images to grayscale and checks the color difference at the border between the text and the background. If the color difference does not exceed a given threshold, the image is skipped.
If you adjust the threshold, it can generate better images.
Edit the loDiff and upDiff parameter values in the floodFill function from 16, 16 to 0, 16 or 0, 32 in this code.
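To make the idea concrete, here is a minimal sketch (not taken from the repository; the function name, seed point, and default values are my own illustrative assumptions) of a flood-fill-based background check where loDiff/upDiff can be tightened as described above:

```python
import cv2
import numpy as np

def flood_fill_background(gray, seed=(0, 0), lo_diff=0, up_diff=16):
    """Flood-fill the grayscale render from an assumed background seed pixel.

    loDiff/upDiff control how similar a neighbouring pixel must be to the
    already-filled region to get filled as well; tightening them (e.g. from
    16, 16 to 0, 16 or 0, 32) changes how strictly the text region is
    separated from the background.
    """
    h, w = gray.shape
    # floodFill requires a mask 2 pixels larger than the image.
    mask = np.zeros((h + 2, w + 2), dtype=np.uint8)
    cv2.floodFill(gray.copy(), mask, seed, 255, loDiff=lo_diff, upDiff=up_diff)
    # Pixels left unfilled are the ones that differ from the background.
    return mask[1:-1, 1:-1] == 0
```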
The first solution has a limitation: currently, the engine does not consider the color difference between the text and the text effect (ex2 "그릇"). You can solve this by adding logic that checks the color difference between the text and the text effect.
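One possible way to implement that check (a rough sketch of my own, not code from this repository; the RGBA layer inputs and the threshold of 16 are assumptions) is to compare the mean grayscale intensity of the text layer with that of the effect layer:

```python
import numpy as np

def check_text_effect_contrast(text_layer, effect_layer, threshold=16):
    """Return True if the text and its effect (border/shadow/extrusion)
    differ enough in mean grayscale intensity to stay distinguishable.

    text_layer / effect_layer: RGBA uint8 arrays; threshold is an assumed value.
    """
    def mean_gray(layer):
        visible = layer[..., 3] > 0
        gray = layer[..., :3].mean(axis=-1)
        return float(gray[visible].mean()) if visible.any() else 0.0

    return abs(mean_gray(text_layer) - mean_gray(effect_layer)) >= threshold
```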
Currently, this engine does not take post-processing into account (ex1 "가?", ex3 "긍정적"). The text can become invisible after post-processing such as blur, even if the color difference between text and background is large. If you add color-difference checking logic after post-processing, it can generate better images. Add the following code in this code.
```python
if not _check_visibility(image, fg_image[..., 3]):
    raise RuntimeError("Text is not visible")
```
I have introduced some solutions, but they have not been tested; the results may be worse than expected.
Thanks.
@moonbings I was facing a similar issue. Solution 1 you provided has helped to some extent. I also tried solution 3, but to no avail. About 1 in 100 images is distorted, and this will affect the training data for my OCR application. Can you suggest a workaround? Edit: Here is an example: https://user-images.githubusercontent.com/39117677/222642021-e3d40153-ce93-4455-9422-881d91c478ea.jpg
The word is: દીધું!
@khawar-islam were you able to solve this issue?
@aadit2697 No, I am still facing the same issue, and of course it affects training.
@khawar-islam you want to play around with the style and post-process sections in config.yaml. These numbers worked for me. Hope this helps! Do let us know if this works out for you :)
```yaml
style:
  prob: 0.25
  args:
    weights: [1, 2, 2]
    args:
      - size: [1, 2]
        alpha: [1, 1]
        grayscale: 0
      # text shadow
      - distance: [1, 2]
        angle: [0, 0]
        alpha: [0.3, 0.7]
        grayscale: 0
      # text extrusion
      - length: [1, 2]
        angle: [0, 360]
        alpha: [1, 1]
        grayscale: 0

postprocess:
  args:
    - prob: 0.0
      args:
        scale: [4, 8]
        per_channel: 0
    # gaussian blur
    - prob: 0.0
      args:
        sigma: [0, 2]
    # resample
    - prob: 0.0
      args:
        size: [0.4, 0.4]
    # median blur
    - prob: 0.0
      args:
        k: [1, 1]
```
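If I read the config correctly, each prob under postprocess is the probability of applying that effect, so setting them to 0.0 turns the noise, blur, resample, and median-blur steps off entirely, while style: prob: 0.25 applies a border, shadow, or extrusion to roughly a quarter of the generated images.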
@aadit2697 thank you!
What do you think about the two parameters min_length: 6 and max_length: 25? I already modified them, but sometimes images are still distorted. To generate a cleaner dataset, would reducing max_length be a good idea? What is your opinion?
> @aadit2697 thank you!
> What do you think about the two parameters min_length: 6 and max_length: 25? I already modified them, but sometimes images are still distorted. To generate a cleaner dataset, would reducing max_length be a good idea? What is your opinion?
You could try it out. Personally, for my use case, I used the default values for max and min length. But if your words are shorter than 25 characters, you could give it a shot.
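For anyone else hunting for these parameters: as far as I can tell, min_length and max_length sit under the corpus section of config.yaml. A rough sketch of where they go (the corpus path and weights here are placeholders, not values from this thread):

```yaml
corpus:
  args:
    - paths: [resources/corpus/mjsynth.txt]  # placeholder corpus file
      weights: [1]
      min_length: 6   # shortest word to sample
      max_length: 25  # longest word to sample
```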
Hello @moonbings, during synthetic dataset generation some images are very distorted and I have no idea how to fix it. I played with some parameters, but it didn't work for me. Any solution?
Sample images are not clear.