Backup font for missing characters when drawing text

Markxy commented 4 years ago

Description of the request

I want to use a specific font but also draw characters which are not available in that font. Replace them with characters from another "generic" font, like Arial-Bold for example. Similarly to how web browsers work when they are missing characters from fonts - the OS fills that void from a default font.

from PIL import Image, ImageFont, ImageDraw

temp_canvas = Image.new("RGBA", (1200, 300), (255, 255, 255, 255))
draw_canvas = ImageDraw.Draw(temp_canvas, "RGBA")

font = ImageFont.truetype(r"C:\fonts\BarlowSemiCondensed-Bold.ttf", size=150)
text_string = "hello ಠಠ world"

draw_canvas.text((100, 100), text_string, fill="#000000", font=font)

temp_canvas.show()

The output is:

Proposed solution

Add an argument to ImageDraw.Draw().text(), such as backup_font, which would be used when the first font doesn't contain a character in text_string, like ಠ in this example.

nulano commented 4 years ago

I don't think FreeType (the library Pillow uses to load fonts) is able to combine multiple typefaces into one. This change would be simple with basic layout (Pillow can try loading each glyph from a list of fonts until one succeeds), but Raqm (used for complex scripts) seems to require passing a single typeface to be used for the whole string. So for complex layout this would require a change in Raqm, either upstream or by including a vendored change in Pillow.

The latter could help with some of the distribution issues that have appeared in the past (the dynamic loading would be moved to FriBiDi; users would only need to install LGPL-licensed FriBiDi to enable complex text, not Raqm, which is sometimes available only in an outdated and unsupported version in some Linux distributions). It could also potentially help #3066. Edit: #3066 is now fixed by including a vendored build of Raqm in binary wheels.

As for the API to use here, I would propose ImageFont.truetype_family(font1, font2, font3, ...) to create a compound font, which would get special handling in the C code.

LateusBetelgeuse commented 2 years ago

This can be especially helpful with emojis. Many emoji fonts contain only emojis, and the ones that do contain regular glyphs have styles that can break the image/poster look one wants to achieve. However, this can be tricky because all the color emoji fonts I've tested so far require a size of exactly 109, which would require super-sampling and then smooth resizing.

voussoir commented 2 years ago

Hi, I'm interested in this problem too.

At the moment I'm using a workaround based on this stackoverflow answer.

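import logging

from fontTools.ttLib import TTFont

log = logging.getLogger(__name__)

def has_glyph(font, glyph):
    # True if any cmap table in the font maps this character to a glyph.
    for table in font['cmap'].tables:
        if ord(glyph) in table.cmap:
            return True
    log.debug('%s does not have %s', font, glyph)
    return False

def determine_font(text):
    # stringtools is a helper module that is not shown in this snippet.
    text = stringtools.remove_control_characters(text)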
    font_options = [
        'C:\\Windows\\Fonts\\NotoSansKR-Bold.otf',
        'C:\\Windows\\Fonts\\NotoSansSC-Bold.otf',
        'C:\\Windows\\Fonts\\NotoSansJP-Bold.otf',
    ]
    for font_name in font_options:
        font = TTFont(font_name)
        if all(has_glyph(font, c) for c in text):
            return font_name
    raise Exception(f'No suitable font for {text}.')

However, this still doesn't work when none of the fonts contains all of the glyphs. I learned that Noto Arabic doesn't contain the ASCII letters!

I don't know anything about how web browsers or file explorers handle font stacking, but it would be great if we could get some of that behavior by default in PIL. Anyway just thought I'd share that snippet.
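One way around the "no single font covers the whole string" limitation is to choose a font per character instead of per string. The following is only a minimal sketch under stated assumptions: the font names and code point ranges are made up for illustration, and `pick_font` and `fonts_per_character` are hypothetical helpers, not part of Pillow or fontTools. With real files, the coverage sets could probably be built from fontTools with `set(TTFont(path).getBestCmap())`.

```python
from typing import Dict, List, Optional, Set, Tuple


def pick_font(char: str, coverage: Dict[str, Set[int]]) -> Optional[str]:
    """Return the first font (in dict order) that covers char, else None."""
    code_point = ord(char)
    for font_name, code_points in coverage.items():
        if code_point in code_points:
            return font_name
    return None


def fonts_per_character(
    text: str, coverage: Dict[str, Set[int]]
) -> List[Tuple[str, Optional[str]]]:
    """Pair every character with the font that should draw it."""
    return [(char, pick_font(char, coverage)) for char in text]


# Hypothetical coverage data (assumption): a Latin font and a Kannada font.
coverage = {
    "Latin.ttf": set(range(0x20, 0x7F)),
    "Kannada.ttf": set(range(0x0C80, 0x0D00)),
}

result = fonts_per_character("a \u0ca0", coverage)
for char, font_name in result:
    print(repr(char), font_name)
```

Characters that no font covers come back paired with `None`, so the caller can decide whether to draw the primary font's missing glyph or skip them.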

khaledmsm commented 2 years ago

After searching the internet, I've got a workaround by merging font files into a single font file.

From reading merge_fonts.py, I think the core code for merging font files in Python is the following (you may have to install fontTools).

from fontTools import ttLib, merge
from nototools.substitute_linemetrics import read_line_metrics, set_line_metrics

def make_font(font_list, output_to):
    # Merge the fonts, then copy the line metrics of the first font onto the result.
    merger = merge.Merger()
    font = merger.merge(font_list)
    metrics = read_line_metrics(ttLib.TTFont(font_list[0]))
    set_line_metrics(font, metrics)
    font.save(output_to)
    font.close()

Do you have the actual code? This one isn't clear.

bai-yi-bai commented 1 year ago

I apologize for replying to a closed issue, but I also struggled with finding a merged monospace notosans font file. Pillow's documentation could be improved by providing some hints on how to troubleshoot font issues... I may write an article on this, but I am not an expert on how fonts work, how they are stored, or how Pillow uses them, nor do I expect to invest the time now that I solved my issue.

Many of the notosans fonts contain only the minimum number of glyphs needed to support a specific language. For example, NotoSansThai-Regular.ttf doesn't contain any ASCII characters, such as punctuation marks. This results in Pillow adding the 'missing character' glyph to an image. I thought this had something to do with not having libraqm installed correctly until I checked the character map utility (based on the linked issue) and discovered the glyphs weren't there.

In addition, the built-in notosans merge_fonts.py/merge_noto.py scripts seem to be broken in their current state, resulting in an error being raised (see the bottom of this post).

Here are the steps I was able to use to successfully merge ~20 fonts together into one file:

1. Manually clone the nototools project: https://github.com/googlefonts/nototools.git This is a large repo, with 1,634 .ttf files.
2. Set up your Python environment to run nototools (venv/requirements.txt).
3. Move the undesired .ttf files out of the root directory. I started from scratch by moving all .ttf files to /backup_moved_fonts_from_root
4. Move the fonts you want to merge into the /root folder.
5. Create a script merge_noto_diy.py with this content:

import os

from fontTools import ttLib, merge
from nototools.substitute_linemetrics import read_line_metrics, set_line_metrics

# Collect every .ttf file in the current working directory.
font_list = []
for a_file in os.listdir(os.getcwd()):
    if a_file.endswith('.ttf'):
        font_list.append(a_file)
        print(a_file)

def make_font(font_list, output_to):
    # Merge the fonts, then copy the line metrics of the first font onto the result.
    merger = merge.Merger()
    font = merger.merge(font_list)
    metrics = read_line_metrics(ttLib.TTFont(font_list[0]))
    set_line_metrics(font, metrics)
    font.save(output_to)
    font.close()

make_font(font_list=font_list, output_to='NotoSansCombined-Regular.ttf')

6. Run the script.

I suggest adding fonts one-by-one to make sure the process completes successfully. I mention this because I had trouble with NotoSansGurmukhi-Regular.ttf and NotoSansThaana-Regular.ttf. Python returned this error:

AssertionError: Expected all items to be equal: [1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 2048, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000]

I will have to scour the notofont files to see if I can find suitable replacements for these two fonts.
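The numbers in that assertion look like the fonts' unitsPerEm values (an assumption based on the typical 1000 and 2048 values; the error message itself doesn't say so). If that is the cause, a quick pre-check can point at the odd fonts before attempting a merge. This is a sketch with made-up file names; `find_units_per_em_outliers` is a hypothetical helper, and with real files each value could be read with fontTools as `TTFont(path)["head"].unitsPerEm`.

```python
from collections import Counter
from typing import Dict, List


def find_units_per_em_outliers(units_per_em: Dict[str, int]) -> List[str]:
    """Return the fonts whose unitsPerEm differs from the most common value."""
    if not units_per_em:
        return []
    most_common_value, _ = Counter(units_per_em.values()).most_common(1)[0]
    return [
        name
        for name, value in units_per_em.items()
        if value != most_common_value
    ]


# Hypothetical values (assumption); real ones come from each font's head table.
units = {
    "FontA-Regular.ttf": 1000,
    "FontB-Regular.ttf": 1000,
    "FontC-Regular.ttf": 2048,
}

outliers = find_units_per_em_outliers(units)
print(outliers)
```

Fonts reported here would need to be rescaled or replaced before they can be merged with the rest.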

The only other tip I would like to share is that I was also able to combine the JetBrainsMono-Regular.ttf font with 19 notofonts files to produce a decent monospace font with large coverage (I will eventually post my project to GitHub demonstrating why I needed fonts with 6,000+ glyphs; check my profile), but I could not combine it with the Chinese/Japanese/Korean font sarasa-fixed-cl-regular.ttf.

Yay295 commented 1 year ago

There's actually an issue on the Noto Fonts GitHub about this: https://github.com/notofonts/noto-fonts/issues/167

tl;dr

The technical limit is in the font format. That's one reason there isn't a single font with everything, but another reason is that it would be a very big file which would be slow to work with. And most people only need a couple of these fonts so it makes more sense to keep them separate from a design/production standpoint as well as a delivery/use standpoint.

The font format technical limit (65,535 glyphs in one file) is probably why you can't combine sarasa-fixed-cl-regular.ttf with anything.
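As a rough illustration of that limit, the glyph counts of the fonts to be merged can be added up before trying. This is only an estimate under assumptions: the counts below are invented, `merged_glyph_budget` is a hypothetical helper, and the sum is treated as an upper bound on what the merged file would need. With real files, a count could be obtained from fontTools with `len(TTFont(path).getGlyphOrder())`.

```python
from typing import Dict, Tuple

MAX_GLYPHS = 65535  # glyph limit of a single font file, as quoted above


def merged_glyph_budget(glyph_counts: Dict[str, int]) -> Tuple[int, bool]:
    """Return (total glyphs, whether the total fits in one font file)."""
    total = sum(glyph_counts.values())
    return total, total <= MAX_GLYPHS


# Hypothetical glyph counts (assumption), not measured from the real fonts.
counts = {
    "SmallLatinFont.ttf": 1000,
    "LargeCJKFont.ttf": 65000,
}

total, fits = merged_glyph_budget(counts)
print(total, fits)
```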

So being able to use fallback fonts definitely seems like the better option.

bai-yi-bai commented 1 year ago

Thank you for linking me to that issue, I saw it before and knew there was a limitation on the number of glyphs in a font file. I was able to use pyftsubset and glyphhanger to reduce the size of the font files I need.

Back to the main topic, I agree having a backup font solution in Pillow would be preferable. Being able to provide a list of fonts in order of preference would be even better [highest, lowest].

Are there any proposals on how to build this functionality? I don't see any follow-up on this comment: https://github.com/python-pillow/Pillow/issues/4808#issuecomment-1013966105

Looking around in the Pillow source code, I cannot determine how the glyph/ideograph not found character U+25A1 is rendered to be the 'fallback' when a glyph doesn't exist in a given font file. For example, when ImageDraw.py textbbox tries to generate a bitmap, does it use the default font built into Pillow from ImageFont load_default to generate this glyph? This contains a font encoded in base64. I guess my question is at what point in the process could this 'fallback to U+25A1' code be expanded to perform a search (try/except) statement on each provided font file? Or alternatively, could multiple instances of the class FreeTypeFont be combined together to provide greater coverage?

nulano commented 1 year ago

Pillow does not currently handle fallback at all, it only uses FreeType's default behaviour: If a font is missing support for some Unicode code point, it is rendered using the font's "missing glyph", which is usually a rectangle or question mark.

To add fallback font support in Pillow, two things are required:

1. Detect when FreeType returns a "missing glyph" and figure out which font can be used instead.
2. Somehow decide which parts of the input string should use which font (~~AFAIK web browsers split text at word boundaries~~ Edit: at least Chromium and Firefox seem to be splitting by clusters).

This is not too difficult for basic layout, but basic layout is not very good for non-English text, where fallback fonts are most useful. However, detecting which characters are not supported with Raqm layout is more tricky because complex text layout can reorder or even completely replace the input characters.
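To make the idea of splitting by clusters concrete, here is a deliberately simplified sketch. It is not the full Unicode segmentation algorithm and not what Pillow, Raqm, or any browser actually does; `split_clusters` and its helpers are hypothetical names. It only keeps combining marks, variation selectors, emoji skin-tone modifiers, zero-width-joiner sequences, and regional-indicator pairs attached to their base character.

```python
import unicodedata
from typing import List

ZWJ = "\u200d"  # zero width joiner


def is_regional_indicator(char: str) -> bool:
    """Regional indicator symbols; two of them together form a flag."""
    return 0x1F1E6 <= ord(char) <= 0x1F1FF


def extends_cluster(char: str) -> bool:
    """True for characters that attach to the preceding character."""
    code_point = ord(char)
    return (
        unicodedata.category(char) in ("Mn", "Mc", "Me")  # combining marks
        or 0xFE00 <= code_point <= 0xFE0F  # variation selectors
        or 0x1F3FB <= code_point <= 0x1F3FF  # emoji skin-tone modifiers
        or char == ZWJ
    )


def split_clusters(text: str) -> List[str]:
    """Split text into approximate clusters (simplified, see lead-in)."""
    clusters: List[str] = []
    for char in text:
        if clusters:
            previous = clusters[-1]
            joins_previous = (
                extends_cluster(char)
                or previous.endswith(ZWJ)
                or (
                    is_regional_indicator(char)
                    and len(previous) == 1
                    and is_regional_indicator(previous)
                )
            )
            if joins_previous:
                clusters[-1] = previous + char
                continue
        clusters.append(char)
    return clusters


# "e" + combining acute accent, then the two code points of a flag.
print(split_clusters("e\u0301 \U0001F1EF\U0001F1F5"))
```

A fallback implementation would then pick one font per cluster rather than per code point, so that a base character and its marks are never drawn from different fonts.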

I am not aware of anyone currently working on this, feel free to implement it and open a pull request.

owocado commented 1 year ago

Thanks for the insightful response. Though until Pillow supports this feature in future, I am temporarily using https://github.com/nathanielfernandes/imagetext-py which adds fallback fonts and wraps around Pillow, if this helps anyone in finding a temporary solution. :thumbsup:

nulano commented 1 year ago

I needed this functionality myself yesterday so I've created a proof-of-concept implementation in #6926 (I've included a few sample pictures there).

Edit: For the OP, the result is:

from PIL import Image, ImageFont, ImageDraw

temp_canvas = Image.new("RGBA", (1200, 300), (255, 255, 255, 255))
draw_canvas = ImageDraw.Draw(temp_canvas, "RGBA")

font = ImageFont.truetype(r"C:\Users\Nulano\AppData\Local\Microsoft\Windows\Fonts\BarlowSemiCondensed-Bold.ttf", size=150)
backup_font = ImageFont.truetype("Nirmala.ttf", size=150)
font_family = ImageFont.FreeTypeFontFamily(font, backup_font)

text_string = "hello ಠಠ world"

draw_canvas.text((100, 100), text_string, fill="#000000", font=font_family)

temp_canvas.show()
temp_canvas.save("E:\\4808.png")

nissansz commented 1 year ago

@nulano How to install this module?

AttributeError: module 'PIL.ImageFont' has no attribute 'FreeTypeFontFamily'

radarhere commented 1 year ago

An answer to the above question of how to install the proof-of-concept can be found at https://github.com/python-pillow/Pillow/pull/6926#issuecomment-1637151405

pengzhendong commented 1 year ago

@nulano How to install this module?

AttributeError: module 'PIL.ImageFont' has no attribute 'FreeTypeFontFamily'

The PR is not merged yet.

TheWalkingSea commented 7 months ago

I solved this using masks

from PIL import Image, ImageFont, ImageDraw

def getEmojiMask(font: ImageFont, emoji: str, size: tuple[int, int]) -> Image:
    """ Makes an image with an emoji using AppleColorEmoji.ttf, this can then be pasted onto the image to show emojis

    Parameter:
    (ImageFont)font: The font with the emojis (AppleColorEmoji.ttf); Passed in so font is only loaded once
    (str)emoji: The unicoded emoji
    (tuple[int, int])size: The size of the mask

    Returns:
    (Image): A transparent image with the emoji

    """

    mask = Image.new("RGBA", (160, 160), color=(255, 255, 255, 0))
    draw = ImageDraw.Draw(mask)
    draw.text((0, 0), emoji, font=font, embedded_color=True)
    mask = mask.resize(size)

    return mask

def getDimensions(draw: ImageDraw, text: str, font: ImageFont) -> tuple[int, int]:
    """ Gets the size of text using the font

    Parameters:
    (ImageDraw): The draw object of the image
    (str)text: The text you are getting the size of
    (ImageFont)font: The font being used in drawing the text

    Returns:
    (tuple[int, int]): The width and height of the text

    """
    left, top, right, bottom = draw.multiline_textbbox((0, 0), text, font=font)
    return (right-left), (bottom-top)

def addEmojis(img: Image, text: str, box: tuple[int, int], font: ImageFont, emojiFont: ImageFont) -> None:
    """ Adds emojis to the text

    Parameters:
    (Image)img: The image to paste the emojis onto
    (str)text: The text that was drawn on the image, including the emojis
    (tuple[int, int])box: The (x,y) pair where the textbox is placed
    (ImageFont)font: The font of the text
    (ImageFont)emojiFont: The emoji's font

    """
    draw = ImageDraw.Draw(img)
    box_x, box_y = box
    # Now add any emojis that weren't embedded correctly
    text_lines = text.split("\n")
    for i, line in enumerate(text_lines):
        for j, char in enumerate(line):
            if not char.isascii():

                # Get the height of the text ABOVE the emoji
                aboveText = "\n".join(text_lines[:i])
                _, aboveTextHeight = getDimensions(draw, aboveText, font)

                # The height that we paste at is aboveTextHeight + box_y + (Some error)
                y = aboveTextHeight + box_y + 5

                # Get the length of the text on the line up to the emoji
                beforeLength, _ = getDimensions(draw, line[:j], font)

                # The x position is beforeLength + box_x
                x = box_x + beforeLength

                # Create the mask; You might want to adjust the size parameter
                emojiMask = getEmojiMask(emojiFont, char, (30, 30))

                # Paste the mask onto the image
                img.paste(emojiMask, (int(x), int(y)), emojiMask)

The code above pastes the emojis onto the image; you can copy and paste it.

To use it:


img = Image.new("RGB", (200, 200), (255, 255, 255))

font = ImageFont.truetype("./fonts/Poppins-Regular.ttf", 25)

# Ref: https://github.com/samuelngs/apple-emoji-linux/releases
emojiFont = ImageFont.truetype(r"fonts\AppleColorEmoji.ttf", 137)

draw = ImageDraw.Draw(img)
draw.text((0, 0), "Hello \U0001f4a4", fill=(0, 0, 0), font=font)

addEmojis(img,  "Hello \U0001f4a4", (0, 0), font, emojiFont)
img.show()

TrueMyst commented 7 months ago

Hey @Markxy @nulano

I've worked on something similar that fixes this issue. You can check it out here. It's efficient and works really well out of the box.

I don't really like imagetext-py, since it cannot handle other languages that well. You can correct me if I'm wrong. The one I made can easily be used with Pillow.

Due to the lack of features, I made this tool for my project.

Though my tool contains bugs, I do intend to fix them as soon as possible.

Let me know your feedback! Cheers ❤️

TrueMyst commented 7 months ago

@aclark4life @Markxy @nulano

It seems like I've found an easier way to fix this issue. Right now I'm using a language model which really complicates it. I'll let you know if it works!

Cheers ❤️

TrueMyst commented 7 months ago

@aclark4life @nulano @Markxy Update time!

Hey everyone, I just want to let you know that I did end up finding a good solution that doesn't use a language model.

This time, I'm using fontTools.

Here is how it works: in writing.py, the load_fonts function loads the font files specified by their paths into memory, storing them as font objects in a dictionary.

Next, the has_glyph function checks if a given font contains a glyph for a specified character.

Then, the merge_chunks function optimizes font lookup by merging consecutive characters with the same font into clusters, with the help of has_glyph function. Finally, the draw_text_v2 function utilizes these fonts to draw text on an image.
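The chunk-merging step described above can be sketched roughly as follows. This is an illustration of the general idea only, not the actual code from that project: `group_runs_by_font` is a hypothetical helper, the font names and code point ranges are invented, and coverage is modelled as plain sets of code points instead of loaded font objects. The coverage dictionary is assumed to be non-empty.

```python
from itertools import groupby
from typing import Dict, List, Set, Tuple


def group_runs_by_font(
    text: str, coverage: Dict[str, Set[int]]
) -> List[Tuple[str, str]]:
    """Merge consecutive characters drawn with the same font into one run."""

    def font_for(char: str) -> str:
        for font_name, code_points in coverage.items():
            if ord(char) in code_points:
                return font_name
        # No font covers the character: fall back to the first (primary)
        # font, which would draw its own "missing glyph".
        return next(iter(coverage))

    return [
        (font_name, "".join(chars))
        for font_name, chars in groupby(text, key=font_for)
    ]


# Hypothetical coverage data (assumption): basic Latin and hiragana.
coverage = {
    "Latin.ttf": set(range(0x20, 0x7F)),
    "Japanese.ttf": set(range(0x3040, 0x30A0)),
}

runs = group_runs_by_font("Hi \u304a\u3084!", coverage)
for font_name, run in runs:
    print(font_name, repr(run))
```

Each run can then be drawn with a single text call, advancing the x position by the measured width of the previous run.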

I've updated the name of the functions, so that they don't conflict with Pillow's one.

If we talk about the time it takes to render text on the image then here you go.

Current Solution:

Previous Solution (If you use the entire language model):

They both give out the same result.

The code looks fairly simple, and is heavily inspired by @nulano's proof of concept.

from PIL import Image, ImageDraw
from fontfallback import writing

text_0 = """
My time - Bo en
おやすみ おやすみ
Close your, eyes and you'll leave this dream
おやすみ おやすみ
I know that it's hard to do
"""

text_1 = """
English Text: That's amazing
Arabic Text: هذا مذهل
Korean Text: 그 놀라운
Chinese Simplified: 太棒了
Japanese: すごいですね
"""

fonts = writing.load_fonts(
    "./fonts/Oswald/Oswald-Regular.ttf",
    "./fonts/NotoSansJP/NotoSansJP-Regular.ttf",
    "./fonts/NotoSansKR/NotoSansKR-Regular.ttf",
    "./fonts/NotoSansSC/NotoSansSC-Regular.ttf",
    "./fonts/NotoSansArabic/NotoSansArabic-Regular.ttf",
)

image = Image.new("RGB", (500, 350), color=(255, 255, 255))
draw = ImageDraw.Draw(image)

writing.draw_multiline_text_v2(draw, (40, 10), text_0, (0, 0, 0), fonts, 20)
writing.draw_multiline_text_v2(draw, (40, 150), text_1, (0, 0, 0), fonts, 20)

image.show()

I tried to optimize it as much as I can, but if you have any good suggestions, let me know. I hope you're happy with the results; you can check it out here: PillowFontFallBack

Cheers ❤️

nissansz commented 7 months ago

which version pillow to use for above script?

TrueMyst commented 7 months ago

which version pillow to use for above script?

Latest Release :))

nissansz commented 7 months ago

dev. version pillow? or any version?

TrueMyst commented 7 months ago

dev. version pillow? or any version?

the one on pypi, that'll work :))

TrueMyst commented 7 months ago

@aclark4life @nissansz Does it work properly?

nissansz commented 7 months ago

I still use pillow dev 10.4

TrueMyst commented 7 months ago

I still use pillow dev 10.4

It doesn't matter, the script can be run separately

nulano commented 7 months ago

@TrueMyst I haven't tried it, but I'm not sure if your approach will work with composed glyphs. For example, country flag emoji are composed of two unicode code points which render as a single glyph. Also, if you try to make it compatible with the current ImageDraw API (by implementing an object with a getmask2 method), I expect you'll run into the same issue that ultimately made me stop working on it - limitations in the current line spacing calculation caused by the current API.

TrueMyst commented 6 months ago

@nulano

I've somewhat fixed things. My code definitely supports multiple languages. I think the most ideal solution would be to get an image for the emoji from the internet based on the Unicode using emojipedia. Some emoji fonts don't have great support for emoji. Not the best, I would say, but if you're interested, we can get it up and running. A little bit of help and optimization could make it work.

TheWalkingSea commented 6 months ago

@nulano Try this mask-based solution; it supports emojis with modifier Unicode characters.

TrueMyst commented 6 months ago

@TheWalkingSea there are so many problems with this code, especially with the types. You mind me fixing them?

TheWalkingSea commented 6 months ago

It works perfectly for me and the types are correct. Let me know if you have any specific issues and I'll make sure to update it.
