python-pillow / Pillow

Python Imaging Library (Fork)
https://python-pillow.org
Other
12.09k stars 2.21k forks source link

Rendering multi line text in bounding box, text bbox coordinates issue #8092

Closed HERIUN closed 3 months ago

HERIUN commented 3 months ago

What did you do?

Rendering (multi line) text in left(horizontal) center(vertical) of bounding box.

What did you expect to happen?

Textbox located at left(horizontal), center(vertical)

What actually happened?

Textbox height is wrong. I don't understand line_spacing = self._multiline_spacing(font, spacing, stroke_width) in ImageDraw.multiline_textbbox()

In def multiline_textbbox, self.multiline_spacingfunction

def _multiline_spacing(self, font, spacing, stroke_width):
        return (
            self.textbbox((0, 0), "A", font, stroke_width=stroke_width)[3]
            + stroke_width
            + spacing
        )

self.textbbox((0, 0), "A", font, stroke_width=stroke_width) is (x1,y1,x2,y2), y1 is not started from zero(In my case, 2), but using (y2+stroke_width+spacing) is not general line_spacing I know.(even "A" is not the highest character!! ex."[", "}" ). line spacing can be different line by line. I suggest

def _multiline_spacing(self, font, spacing, stroke_width):
        return (
            self.textbbox((0, 0), "[", font, stroke_width=stroke_width)[3]
            - self.textbbox((0, 0), "[", font, stroke_width=stroke_width)[1]
            + spacing
        )

What are your OS, Python and Pillow versions?

--------------------------------------------------------------------
Pillow 10.3.0
Python 3.9.5 (default, Nov 23 2021, 15:27:38)
       [GCC 9.3.0]
--------------------------------------------------------------------
Python executable is /volume1/ml_image_tr_server/bin/python3
Environment Python files loaded from /volume1/ml_image_tr_server
System Python files loaded from /usr
--------------------------------------------------------------------
Python Pillow modules loaded from /volume1/ml_image_tr_server/lib/python3.9/site-packages/PIL
Binary Pillow modules loaded from /volume1/ml_image_tr_server/lib/python3.9/site-packages/PIL
--------------------------------------------------------------------
--- PIL CORE support ok, compiled for 10.3.0
*** TKINTER support not installed
--- FREETYPE2 support ok, loaded 2.13.2
--- LITTLECMS2 support ok, loaded 2.16
--- WEBP support ok, loaded 1.3.2
--- WEBP Transparency support ok
--- WEBPMUX support ok
--- WEBP Animation support ok
--- JPEG support ok, compiled for libjpeg-turbo 3.0.2
--- OPENJPEG (JPEG2000) support ok, loaded 2.5.2
--- ZLIB (PNG/ZIP) support ok, loaded 1.2.11
--- LIBTIFF support ok, loaded 4.6.0
--- RAQM (Bidirectional Text) support ok, loaded 0.10.1, fribidi 1.0.8, harfbuzz 8.4.0
*** LIBIMAGEQUANT (Quantization method) support not installed
--- XCB (X protocol) support ok
--------------------------------------------------------------------

I ignore line spacing, and calculate line height line by line instead of max_line_h(maybe pillow does), and using anchor="lt" (not "la") and adding first y adding stroke_width

from PIL import Image, ImageDraw, ImageFont
import numpy as np

bbox_x1, bbox_y1, bbox_x2, bbox_y2 = 0,0,300,100
bbox_h = bbox_y2 - bbox_y1
text = "Tag[name]\n is \nenglish."
font_size = 20
lang = "english"
writing_direction = "ltr"
text_alignment = "left"
text_border_width=3
canvas_hw=(500, 1000)
text_color=(255, 255, 255)
text_border_color=(125, 125, 125)
font_path = "/volume1/ml_image_tr_server/fonts/Pretendard-Regular.otf" ## your font_path

canvas = Image.new(mode="RGB", size=(canvas_hw[1], canvas_hw[0]))
font = ImageFont.truetype(font=font_path, size=max(1, round(font_size)))
draw = ImageDraw.Draw(canvas)

tx1,ty1,tx2,ty2 = draw.textbbox(
            xy=(0, 0), 
            text=text,
            font=font,
            anchor=None,
            language=lang,
            stroke_width=text_border_width,
            font_size=font_size
        )

lines = text.split('\n')
l_coords = [font.getbbox(
            text = line,
            language=lang,
            stroke_width=text_border_width,
            direction = writing_direction
        ) for line in lines]
total_height = sum([t[3] - t[1] for t in l_coords])

print(f"multi_line tbox y = {ty2-ty1}, total_line_height={total_height}")

start_y = bbox_y1+(bbox_h-total_height)/2 + text_border_width
xy = (bbox_x1+text_border_width, start_y)
anchor = 'lt'

lbox_coords = [] # line box coords
for line_idx, line_text in enumerate(text.split("\n")):
    draw.text(
        xy = xy,
        text = line_text,
        fill = text_color,
        font = font,
        anchor = anchor,
        align = text_alignment,
        direction = writing_direction,
        language=lang,
        stroke_width=text_border_width,
        stroke_fill = text_border_color
    )
    lbox_x1, lbox_y1, lbox_x2, lbox_y2 = draw.textbbox(
        xy = xy,
        text = line_text,
        font = font,
        anchor = anchor,
        align=text_alignment,
        direction = writing_direction,
        language=lang,
        stroke_width=text_border_width,
        font_size=font_size
    )
    # calculate next line xy
    xy = (xy[0], xy[1]+(lbox_y2-lbox_y1))

    # draw line bbox
    draw.rectangle((lbox_x1, lbox_y1, lbox_x2, lbox_y2), outline=(255, 255, 0))
    lbox_coords.append([lbox_x1, lbox_y1, lbox_x2, lbox_y2])

lbox_coords = np.array(lbox_coords)
tbox_x1, tbox_y1 = lbox_coords[:, :2].min(axis=0)
tbox_x2, tbox_y2 = lbox_coords[:, 2:4].max(axis=0)
tbox = (tbox_x1, tbox_y1, tbox_x2, tbox_y2)
bbox = (0,0,300,100)

def move_box_a_to_center_of_box_b(A, B):
    # A와 B의 좌표 (l, t, r, b)
    lA, tA, rA, bA = A
    lB, tB, rB, bB = B

    # 박스 A의 너비와 높이
    width_A = rA - lA
    height_A = bA - tA

    # 박스 B의 중심 좌표
    center_x_B = (lB + rB) / 2
    center_y_B = (tB + bB) / 2

    # 박스 A의 새로운 좌표 (중심을 B의 중심으로 이동)
    new_lA = center_x_B - width_A / 2
    new_tA = center_y_B - height_A / 2
    new_rA = center_x_B + width_A / 2
    new_bA = center_y_B + height_A / 2

    # 새로운 A 박스의 좌표 반환
    return (new_lA, new_tA, new_rA, new_bA)
ctbox_x1, ctbox_y1, ctbox_x2, ctbox_y2 = move_box_a_to_center_of_box_b(tbox,bbox)

# draw text box and text bbox(bbox center align)
draw.rectangle((ctbox_x1, ctbox_y1, ctbox_x2, ctbox_y2), outline=(255, 0, 255))
draw.rectangle((tbox_x1, tbox_y1, tbox_x2, tbox_y2), outline=(255, 0, 0))
draw.rectangle(bbox, outline=(0,0,255))

canvas.save("result.png")

image

radarhere commented 3 months ago

(even "A" is not the highest character!! ex."[", "}" ). line spacing can be different line by line

I think the unfortunate reality is that each font can set the tallest character to be whatever it wants. Here's a font where 'A' is taller.

Screenshot 2024-06-01 at 6 34 51 PM

It's also feasible that a font doesn't have the '[' character. 'A' was chosen in #1574

Textbox height is wrong

If I understand correctly, you're not saying that the co-ordinates returned by draw.textbbox() for multiline text don't match the co-ordinates of multiline text drawn by draw.text(). Rather, draw.textbbox() and draw.text() are both incorrect for multiline text because you're saying the line height is incorrect?

I suggest

def _multiline_spacing(self, font, spacing, stroke_width):
        return (
            self.textbbox((0, 0), "[", font, stroke_width=stroke_width)[3]
            - self.textbbox((0, 0), "[", font, stroke_width=stroke_width)[1]
            + spacing
        )

So your suggestion is to remove the vertical offset that the font chooses to apply to the text. I'm not sold on the idea that we should be ignoring the font author's idea about how much vertical gap should be above characters. If you'd like to do so in your code, that is fine.

nulano commented 3 months ago

So your suggestion is to remove the vertical offset that the font chooses to apply to the text. I'm not sold on the idea that we should be ignoring the font author's idea about how much vertical gap should be above characters. If you'd like to do so in your code, that is fine.

We already ignore the font author's idea about line spacing, see #1646 / https://github.com/python-pillow/Pillow/issues/6469#issuecomment-1203036583.

My suggestion from https://github.com/python-pillow/Pillow/issues/6469#issuecomment-1203096616 is:

def _multiline_spacing(self, font, spacing, stroke_width):
    return font.font.height

I just don't know how we could make this change without breaking backwards compatibility for a lot of people.

HERIUN commented 3 months ago

@radarhere

I understand "A" can be taller than "[".

If I understand correctly, you're not saying that the co-ordinates returned by draw.textbbox() for multiline text don't match the co-ordinates of multiline text drawn by draw.text(). Rather, draw.textbbox() and draw.text() are both incorrect for multiline text because you're saying the line height is incorrect?

Sorry for confusing. I think draw.textbbox() and draw.text() are both incorrect for multiline text. because def _multiline_spacing() add stroke_width 3 times (rather 2). ie.(self.textbbox(2) + stroke_width(1))

def _multiline_spacing(self, font, spacing, stroke_width):
        return (
            self.textbbox((0, 0), "A", font, stroke_width=stroke_width)[3]
            + stroke_width
            + spacing
        )
radarhere commented 3 months ago

+ stroke_width was added #7080 for the sake of backwards compatibility. See https://github.com/python-pillow/Pillow/pull/7059#discussion_r1159008846 for further info.

HERIUN commented 3 months ago

@nulano

I don't understand "backwards compatibility"

I suggest 2 method. (but only font.font.height is always tallest character's height)

def _multiline_spacing(self, font, spacing, stroke_width):
    return font.font.height + 2*stroke_width + spacing

I think spacing must be out of def _multiline_spacing, rather added for line in lines loop

def _multiline_spacing(self, font, stroke_width):
    return font.font.height + 2*stroke_width

Why i don't want use self.textbbox((0, 0), "A", font, stroke_width=stroke_width)[3], self.textbbox((0, 0), "A", font, stroke_width=stroke_width)[0] can be minus value width stroke_width...

def _multiline_spacing(self, font, spacing, stroke_width):
    return font.font.height + 2*stroke_width
radarhere commented 3 months ago

I don't understand "backwards compatibility"

Pillow would like to continue working as expected when users upgrade to newer versions. If text changes position, this would break expectations.

Alternatively, we could

Neither of these are very elegant though.

nulano commented 3 months ago

Sorry for confusing. I think draw.textbbox() and draw.text() are both incorrect for multiline text. because def _multiline_spacing() add stroke_width 3 times (rather 2). ie.(self.textbbox(2) + stroke_width(1))

No, textbbox returns font.getbbox with an offset ((0,0) in this case): https://github.com/python-pillow/Pillow/blob/95a69ec6989deda8deb55b6fe842205327de9ca6/src/PIL/ImageDraw.py#L753-L756 And font.getbbox only adds stroke_width once: https://github.com/python-pillow/Pillow/blob/95a69ec6989deda8deb55b6fe842205327de9ca6/src/PIL/ImageFont.py#L402-L404 Removing all except the last return value, we are left with:

top = offset[1] - stroke_width 
height = size[1] + 2 * stroke_width 
return ..., top + height 

Inline the variables:

return ..., (offset[1] - stroke_width) + (size[1] + 2 * stroke_width)

Rearrange:

return ..., (offset[1] + size[1]) + (2 * stroke_width - stroke_width)

Simplify:

return ..., (offset[1] + size[1] + stroke_width)

but only font.font.height is always tallest character's height

No, font.font.height is the font designer's intended line spacing, it has nothing to do with the height of the glyphs.


keep the current behaviour as the default, but add a setting to change the text position

Yeah, that is what I've been leaning towards, using the version from https://github.com/python-pillow/Pillow/issues/8092#issuecomment-2143366644 (but perhaps as a font object member function, since we'll need a different API if we ever want to finish #6926), but I suspect the API might get quite ugly for this.

HERIUN commented 3 months ago

Appreciate for explaination. I understand "backwards compatibility", "font.font.height is not height of the glyphs", "font.getbbox() add stroke_width once"

But, if stroke_width > offset[1]. top coords of font.getbbox() can be minus value. so, draw.textbbox() left,top can be minus value. and finally def _multiline_spacing() is not what we

def _multiline_spacing(self, font, spacing, stroke_width):
        return (
            self.textbbox((0, 0), "A", font, stroke_width=stroke_width)[3]
            + stroke_width
            + spacing
        )

== offset[1] + size[1] + 2*stroke_width + spacing

It doesn't mean line_spacing to me,

I expected, def _multiline_spacing() output is

def _multiline_spacing(self, font, stroke_width, spacing):
    size, offset = font.font.getsize('A')
    return (
        max(offset[1], stroke_width) # upside 
        + size[1]
        + stroke_width # downside
        + spacing

or consider font.font.height

def _multiline_spacing(self, font, stroke_width, spacing):
    size, offset = font.font.getsize('A')
    return (
        max(font.font.height,
          max(offset[1], stroke_width) # upside 
          + size[1]
          + stroke_width # downside
          + spacing
        )
    )
nulano commented 3 months ago

Have a look at this diagram: https://pillow.readthedocs.io/en/stable/handbook/text-anchors.html#quick-reference

We currently calculate the line spacing as the sum of:

It is not a good idea to use the max function as it is not predictable here, instead I think you meant the following:

def _multiline_spacing(self, font, stroke_width, spacing):
    size, offset = font.font.getsize('A')
    return (
        offset[1] + stroke_width # top
        + size[1] + stroke_width # bottom
        + spacing

This would be similar to what we currently use, but instead of measuring the letter 'A' from the ascender line, it would be measuring it from the top line (specific to the letter 'A'), so even less predictable than currently.

The value font.font.height is equal to the sum of:

I believe this would be better than what we use since:

However, as stated above, we can't easily make this change without breaking backwards compatibility.

radarhere commented 3 months ago

Shall I close this and let the problem be followed in #1646 then?