nissansz closed this issue 1 year ago
6090675 function calls (6043707 primitive calls) in 72.690 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
2892831 34.393 0.000 34.393 0.000 {floodFill}
104 10.388 0.100 46.299 0.445 template_cnjpkr_str.py:295(_check_visibility)
2284 2.987 0.001 2.987 0.001 {method 'getsize' of 'FontFamily' objects}
17726 2.218 0.000 2.218 0.000 {method 'astype' of 'numpy.ndarray' objects}
115572/77434 1.928 0.000 5.748 0.000 {built-in method numpy.core._multiarray_umath.implement_array_function}
150664 1.924 0.000 2.687 0.000 {built-in method numpy.array}
181 1.322 0.007 1.322 0.007 {method 'round' of 'numpy.ndarray' objects}
4434 0.745 0.000 0.745 0.000 {built-in method PIL._imaging.alpha_composite}
6 0.618 0.103 0.789 0.132 blending_functions.py:518(hard_light)
4118 0.557 0.000 4.356 0.001 image_util.py:32(paste_image)
286 0.534 0.002 0.534 0.002 {method 'decode' of 'ImagingDecoder' objects}
1910 0.500 0.000 0.500 0.000 {built-in method PIL._imagingft.getfont}
13686 0.490 0.000 0.491 0.000 {built-in method builtins.max}
4475 0.484 0.000 0.484 0.000 {built-in method builtins.min}
9 0.456 0.051 0.634 0.070 blending_functions.py:139(soft_light)
9 0.451 0.050 0.587 0.065 blending_functions.py:839(overlay)
1329 0.449 0.000 0.449 0.000 {method 'render' of 'FontFamily' objects}
955 0.419 0.000 0.419 0.000 {built-in method PIL._imagingft.getfamily}
141 0.413 0.003 0.413 0.003 {method 'repeat' of 'numpy.ndarray' objects}
54 0.385 0.007 1.911 0.035 arithmetic.py:264(_add_elementwise_to_uint8)
1974 0.378 0.000 0.378 0.000 {warpPerspective}
95 0.358 0.004 0.360 0.004 blending_functions.py:61(_compose_alpha)
16109 0.333 0.000 0.333 0.000 {method 'encode' of 'ImagingEncoder' objects}
54 0.324 0.006 0.324 0.006 {method 'normal' of 'numpy.random._generator.Generator' objects}
41720 0.321 0.000 0.321 0.000 {method 'reduce' of 'numpy.ufunc' objects}
4529 0.320 0.000 8.439 0.002 image_util.py:126(blend_image)
108 0.306 0.003 0.601 0.006 image_util.py:262(to_rgb)
10 0.259 0.026 0.405 0.040 blending_functions.py:786(divide)
95 0.218 0.002 0.218 0.002 ufunclike.py:127(isposinf)
43 0.214 0.005 1.001 0.023 gray.py:38(apply)
151 0.210 0.001 0.211 0.001 {method 'permutation' of 'numpy.random.mtrand.RandomState' objects}
16614 0.210 0.000 0.210 0.000 _methods.py:106(_clip_dep_invoke_with_casting)
9 0.208 0.023 0.359 0.040 blending_functions.py:306(dodge)
16 0.207 0.013 0.207 0.013 {method 'convert' of 'ImagingCore' objects}
11 0.206 0.019 0.418 0.038 blending_functions.py:412(darken_only)
5417 0.206 0.000 0.206 0.000 {method 'join' of 'bytes' objects}
955 0.184 0.000 0.184 0.000 {remap}
6 0.182 0.030 0.300 0.050 blending_functions.py:253(screen)
95 0.173 0.002 0.173 0.002 ufunclike.py:199(isneginf)
1952 0.166 0.000 0.166 0.000 {GaussianBlur}
11 0.161 0.015 0.362 0.033 blending_functions.py:359(addition)
8 0.156 0.020 0.339 0.042 blending_functions.py:465(multiply)
7 0.149 0.021 0.300 0.043 blending_functions.py:575(difference)
9 0.148 0.016 0.302 0.034 blending_functions.py:200(lighten_only)
207 0.136 0.001 8.625 0.042 switch.py:38(apply)
23 0.125 0.005 4.905 0.213 template_cnjpkr_str.py:228(
synthtiger has only ever had commits from one GitHub user, who hasn't made any GitHub contributions in 8 months, and hasn't responded to any synthtiger issues since January.
So I'm not convinced that this library is maintained - meaning that even if we were to figure out how to improve speed for it, it seems entirely possible that such a suggestion would never be merged.
I don't know whether this is a Pillow or a cv2 function: 2892831 34.393 0.000 34.393 0.000 {floodFill}
When I turn off the visibility check, the run time is roughly halved, but that may produce images in which the text is not visible. Is there a fast function for checking text visibility? 104 10.388 0.100 46.299 0.445 template_cnjpkr_str.py:295(_check_visibility)
The fact that it has a capital F would lead me to conclude that floodFill is cv2 - https://docs.opencv.org/3.4/d7/d1b/group__imgproc__misc.html#gaf1f55a048f8a45bc3383586e80b1f0d0
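The curly braces in the profile are a further hint. pstats prints functions implemented in C as `{name}` and Python-level functions as `file:line(name)`. Pillow's `ImageDraw.floodfill` is, as far as I know, written in Python and spelled in lowercase, so it would appear in the second form. A minimal stdlib-only sketch of that labelling follows; `python_level_helper` is a made-up name used only for illustration.

```python
import cProfile
import pstats


def python_level_helper(values):
    # Pure-Python function: pstats labels it "file:line(python_level_helper)".
    # sorted() is a C built-in, so pstats wraps its label in curly braces.
    return sorted(values)


profiler = cProfile.Profile()
profiler.enable()
python_level_helper([3, 1, 2])
profiler.disable()

stats = pstats.Stats(profiler)
# stats.stats is keyed by (filename, lineno, funcname) tuples;
# func_std_string turns each key into the label shown in the report.
labels = [pstats.func_std_string(func) for func in stats.stats]
for label in labels:
    print(label)
```

Entries such as `{floodFill}`, `{warpPerspective}` and `{remap}` in the profile above follow the braces convention, which is consistent with them being compiled cv2 functions.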
Would you be able to post a copy of your code?
To connect the two issues, https://github.com/clovaai/synthtiger/issues/62 has been created as a synthtiger issue to ask their community about a speed increase.
The template_cnjpkr_str code is as follows.
"""
SynthTIGER
Copyright (c) 2021-present NAVER Corp.
MIT license
"""
import os
import cv2
import numpy as np
from PIL import Image
from synthtiger import components, layers, templates, utils

BLEND_MODES = [
    "normal",
    "multiply",
    "screen",
    "overlay",
    "hard_light",
    "soft_light",
    "dodge",
    "divide",
    "addition",
    "difference",
    "darken_only",
    "lighten_only",
]


class SynthTiger(templates.Template):
    def __init__(self, config=None):
        if config is None:
            config = {}

        self.coord_output = config.get("coord_output", True)
        self.mask_output = config.get("mask_output", True)
        self.glyph_coord_output = config.get("glyph_coord_output", True)
        self.glyph_mask_output = config.get("glyph_mask_output", True)
        self.vertical = config.get("vertical", False)
        self.quality = config.get("quality", [95, 95])
        # self.visibility_check = config.get("visibility_check", False)
        self.visibility_check = False
        self.midground = config.get("midground", 0)
        self.midground_offset = components.Translate(
            **config.get("midground_offset", {})
        )
        self.foreground_mask_pad = config.get("foreground_mask_pad", 0)
        self.corpus = components.Selector(
            [
                components.LengthAugmentableCorpus(),
                components.CharAugmentableCorpus(),
            ],
            **config.get("corpus", {}),
        )
        self.font = components.BaseFont(**config.get("font", {}))
        self.texture = components.Switch(
            components.BaseTexture(), **config.get("texture", {})
        )
        self.colormap2 = components.GrayMap(**config.get("colormap2", {}))
        self.colormap3 = components.GrayMap(**config.get("colormap3", {}))
        self.color = components.Gray(**config.get("color", {}))
        self.shape = components.Switch(
            components.Selector(
                [components.ElasticDistortion(), components.ElasticDistortion()]
            ),
            **config.get("shape", {}),
        )
        self.layout = components.Selector(
            [components.FlowLayout(), components.CurveLayout()],
            **config.get("layout", {}),
        )
        self.style = components.Switch(
            components.Selector(
                [
                    components.TextBorder(),
                    components.TextShadow(),
                    components.TextExtrusion(),
                ]
            ),
            **config.get("style", {}),
        )
        self.transform = components.Switch(
            components.Selector(
                [
                    components.Perspective(),
                    components.Perspective(),
                    components.Trapezoidate(),
                    components.Trapezoidate(),
                    components.Skew(),
                    components.Skew(),
                    components.Rotate(),
                ]
            ),
            **config.get("transform", {}),
        )
        self.fit = components.Fit()
        self.pad = components.Switch(components.Pad(), **config.get("pad", {}))
        self.postprocess = components.Iterator(
            [
                components.Switch(components.AdditiveGaussianNoise()),
                components.Switch(components.GaussianBlur()),
                components.Switch(components.Resample()),
                components.Switch(components.MedianBlur()),
            ],
            **config.get("postprocess", {}),
        )

    def generate(self, line):
        quality = np.random.randint(self.quality[0], self.quality[1] + 1)
        midground = np.random.rand() < self.midground
        fg_color, fg_style, mg_color, mg_style, bg_color = self._generate_color()
        fg_image, label, bboxes, glyph_fg_image, glyph_bboxes = self._generate_text(
            fg_color, fg_style, line
        )
        bg_image = self._generate_background(fg_image.shape[:2][::-1], bg_color)

        if midground:
            mg_image, _, _, _, _ = self._generate_text(mg_color, mg_style, line)
            mg_image = self._erase_image(mg_image, fg_image)
            bg_image = _blend_images(mg_image, bg_image, self.visibility_check)

        image = _blend_images(fg_image, bg_image, self.visibility_check)
        image, fg_image, glyph_fg_image = self._postprocess_images(
            [image, fg_image, glyph_fg_image]
        )

        data = {
            "image": image,
            "label": label,
            "quality": quality,
            "mask": fg_image[..., 3],
            "bboxes": bboxes,
            "glyph_mask": glyph_fg_image[..., 3],
            "glyph_bboxes": glyph_bboxes,
        }
        return data

    def init_save(self, root):
        os.makedirs(root, exist_ok=True)
        gt_path = os.path.join(root, "gt.txt")
        coords_path = os.path.join(root, "coords.txt")
        glyph_coords_path = os.path.join(root, "glyph_coords.txt")

        self.gt_file = open(gt_path, "w", encoding="utf-8")
        if self.coord_output:
            self.coords_file = open(coords_path, "w", encoding="utf-8")
        if self.glyph_coord_output:
            self.glyph_coords_file = open(glyph_coords_path, "w", encoding="utf-8")

    def save(self, root, data, idx):
        image = data["image"]
        label = data["label"]
        quality = data["quality"]
        mask = data["mask"]
        bboxes = data["bboxes"]
        glyph_mask = data["glyph_mask"]
        glyph_bboxes = data["glyph_bboxes"]

        image = Image.fromarray(image[..., :3].astype(np.uint8))
        mask = Image.fromarray(mask.astype(np.uint8))
        glyph_mask = Image.fromarray(glyph_mask.astype(np.uint8))

        coords = [[x, y, x + w, y + h] for x, y, w, h in bboxes]
        coords = "\t".join([",".join(map(str, map(int, coord))) for coord in coords])
        glyph_coords = [[x, y, x + w, y + h] for x, y, w, h in glyph_bboxes]
        glyph_coords = "\t".join(
            [",".join(map(str, map(int, coord))) for coord in glyph_coords]
        )

        shard = str(idx // 10000)
        image_key = os.path.join("images", shard, f"{idx}.jpg")
        mask_key = os.path.join("masks", shard, f"{idx}.png")
        glyph_mask_key = os.path.join("glyph_masks", shard, f"{idx}.png")
        image_path = os.path.join(root, image_key)
        mask_path = os.path.join(root, mask_key)
        glyph_mask_path = os.path.join(root, glyph_mask_key)

        os.makedirs(os.path.dirname(image_path), exist_ok=True)
        image.save(image_path, quality=quality)
        if self.mask_output:
            os.makedirs(os.path.dirname(mask_path), exist_ok=True)
            mask.save(mask_path)
        if self.glyph_mask_output:
            os.makedirs(os.path.dirname(glyph_mask_path), exist_ok=True)
            glyph_mask.save(glyph_mask_path)

        self.gt_file.write(f"{image_key}\t{label}\n")
        if self.coord_output:
            self.coords_file.write(f"{image_key}\t{coords}\n")
        if self.glyph_coord_output:
            self.glyph_coords_file.write(f"{image_key}\t{glyph_coords}\n")

    def end_save(self, root):
        self.gt_file.close()
        if self.coord_output:
            self.coords_file.close()
        if self.glyph_coord_output:
            self.glyph_coords_file.close()

    def _generate_color(self):
        mg_color = self.color.sample()
        fg_style = self.style.sample()
        mg_style = self.style.sample()

        if fg_style["state"]:
            fg_color, bg_color, style_color = self.colormap3.sample()
            fg_style["meta"]["meta"]["rgb"] = style_color["rgb"]
        else:
            fg_color, bg_color = self.colormap2.sample()

        return fg_color, fg_style, mg_color, mg_style, bg_color

    # def _generate_text(self, color, style):
    def _generate_text(self, color, style, line):
        # label = self.corpus.data(self.corpus.sample())
        # label = r'我们大家二ホンてする사이드바 we are. 、'
        label = line

        # for script using diacritic, ligature and RTL
        chars = utils.split_text(label, reorder=True)

        text = "".join(chars)
        font = self.font.sample({"text": text, "vertical": self.vertical})

        char_layers = [layers.TextLayer(char, **font) for char in chars]
        self.shape.apply(char_layers)
        self.layout.apply(char_layers, {"meta": {"vertical": self.vertical}})
        char_glyph_layers = [char_layer.copy() for char_layer in char_layers]

        text_layer = layers.Group(char_layers).merge()
        text_glyph_layer = text_layer.copy()

        transform = self.transform.sample()
        self.color.apply([text_layer, text_glyph_layer], color)
        self.texture.apply([text_layer, text_glyph_layer])
        self.style.apply([text_layer, *char_layers], style)
        self.transform.apply(
            [text_layer, text_glyph_layer, *char_layers, *char_glyph_layers], transform
        )
        self.fit.apply([text_layer, text_glyph_layer, *char_layers, *char_glyph_layers])
        self.pad.apply([text_layer])

        for char_layer in char_layers:
            char_layer.topleft -= text_layer.topleft
        for char_glyph_layer in char_glyph_layers:
            char_glyph_layer.topleft -= text_layer.topleft

        out = text_layer.output()
        bboxes = [char_layer.bbox for char_layer in char_layers]

        glyph_out = text_glyph_layer.output(bbox=text_layer.bbox)
        glyph_bboxes = [char_glyph_layer.bbox for char_glyph_layer in char_glyph_layers]

        return out, label, bboxes, glyph_out, glyph_bboxes

    def _generate_background(self, size, color):
        layer = layers.RectLayer(size)
        self.color.apply([layer], color)
        self.texture.apply([layer])
        out = layer.output()
        return out

    def _erase_image(self, image, mask):
        mask = _create_poly_mask(mask, self.foreground_mask_pad)
        mask_layer = layers.Layer(mask)
        image_layer = layers.Layer(image)
        image_layer.bbox = mask_layer.bbox
        self.midground_offset.apply([image_layer])
        out = image_layer.erase(mask_layer).output(bbox=mask_layer.bbox)
        return out

    def _postprocess_images(self, images):
        image_layers = [layers.Layer(image) for image in images]
        self.postprocess.apply(image_layers)
        outs = [image_layer.output() for image_layer in image_layers]
        return outs


def _blend_images(src, dst, visibility_check=False):
    blend_modes = np.random.permutation(BLEND_MODES)

    for blend_mode in blend_modes:
        out = utils.blend_image(src, dst, mode=blend_mode)
        if not visibility_check or _check_visibility(out, src[..., 3]):
            break
    else:
        raise RuntimeError("Text is not visible")

    return out


def _check_visibility(image, mask):
    gray = utils.to_gray(image[..., :3]).astype(np.uint8)
    mask = mask.astype(np.uint8)
    height, width = mask.shape

    peak = (mask > 127).astype(np.uint8)

    kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (3, 3))
    bound = (mask > 0).astype(np.uint8)
    bound = cv2.dilate(bound, kernel, iterations=1)

    visit = bound.copy()
    visit ^= 1
    visit = np.pad(visit, 1, constant_values=1)

    border = bound.copy()
    border[mask > 0] = 0

    flag = 4 | cv2.FLOODFILL_FIXED_RANGE | cv2.FLOODFILL_MASK_ONLY

    for y in range(height):
        for x in range(width):
            if peak[y][x]:
                cv2.floodFill(gray, visit, (x, y), 1, 16, 16, flag)

    visit = visit[1:-1, 1:-1]
    count = np.sum(visit & border)
    total = np.sum(border)
    return total > 0 and count <= total * 0.1


def _create_poly_mask(image, pad=0):
    height, width = image.shape[:2]
    alpha = image[..., 3].astype(np.uint8)
    mask = np.zeros((height, width), dtype=np.float32)

    cts, _ = cv2.findContours(alpha, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cts = sorted(cts, key=lambda ct: sum(cv2.boundingRect(ct)[:2]))

    if len(cts) == 1:
        hull = cv2.convexHull(cts[0])
        cv2.fillConvexPoly(mask, hull, 255)

    for idx in range(len(cts) - 1):
        pts = np.concatenate((cts[idx], cts[idx + 1]), axis=0)
        hull = cv2.convexHull(pts)
        cv2.fillConvexPoly(mask, hull, 255)

    mask = utils.dilate_image(mask, pad)
    out = utils.create_image((width, height))
    out[..., 3] = mask
    return out
The code you've pasted is a slightly modified version of https://github.com/clovaai/synthtiger/blob/master/examples/synthtiger/template.py
In that code, Pillow is only used in save(). The rest of the time, it is synthtiger, cv2 and a bit of NumPy.
I think this is primarily a question for synthtiger. If you think floodFill is the main problem, then you can ask the cv2 community instead.
Thank you.
The profile shows 2892831 floodFill calls, which seems far too many. I'm not sure why synthtiger needs so many.
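The call count follows from the nested loop in _check_visibility in the code pasted above: it seeds one cv2.floodFill at every pixel where the text mask exceeds 127, so the number of calls grows with the text area rather than with the number of images. A quick back-of-the-envelope sketch using only the figures from the profile at the top of this thread (the variable names are mine):

```python
# Figures copied from the cProfile output posted in this thread.
total_seconds = 72.690
floodfill_calls, floodfill_tottime = 2_892_831, 34.393
check_calls, check_cumtime = 104, 46.299

# Average number of floodFill seeds per _check_visibility call.
calls_per_check = floodfill_calls / check_calls

# Share of the whole run spent inside floodFill itself, and inside
# _check_visibility including everything it calls.
floodfill_share = floodfill_tottime / total_seconds
check_share = check_cumtime / total_seconds

# Average cost of a single floodFill call, in microseconds.
micros_per_floodfill = floodfill_tottime / floodfill_calls * 1e6

print(f"floodFill calls per visibility check: {calls_per_check:,.0f}")
print(f"floodFill share of run time: {floodfill_share:.1%}")
print(f"_check_visibility share of run time: {check_share:.1%}")
print(f"average floodFill call: {micros_per_floodfill:.1f} microseconds")
```

That works out to roughly 28 thousand seeds per check at about 12 microseconds each, so each call is cheap and the cost comes from the sheer number of calls.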
I tested single-line, single-image generation with 20 lines. The result is the same as before: pure Pillow takes about 2 s, synthtiger about 40 s.
I wonder how to improve synthtiger's speed without relying on a higher worker count.