findfont picks wrong font

JuliaGraphics / FreeTypeAbstraction.jl

A Julian abstraction layer over FreeType.jl

Other

25 stars 21 forks source link

findfont picks wrong font #83

Open tecosaur opened 9 months ago

tecosaur commented 9 months ago

I was recently thinking of trying something other than a cache to speed up findfont (it takes ~1.5s on my machine), but in the process noticed that it seems to be picking the wrong font??

julia> FreeTypeAbstraction.findfont("ibm plex sans bold italic")
FTFont (family = AlegreyaSans-BoldItalic, style = Bold)

julia> ibm_psbi = FreeTypeAbstraction.try_load("/home/tec/.local/share/fonts/IBMPlexSans-BoldItalic.otf")
FTFont (family = IBM Plex Sans, style = Bold Italic)

julia> FreeTypeAbstraction.match_font(ibm_psbi, ["ibm", "plex", "sans", "bold", "italic"])
(11, 10, false, -24)

julia> alegreya_tex_sbi = FreeTypeAbstraction.try_load("/usr/share/fonts/texlive-alegreya/AlegreyaSans-BoldItalic.pfb")
FTFont (family = AlegreyaSans-BoldItalic, style = Bold)

julia> FreeTypeAbstraction.match_font(alegreya_tex_sbi, ["ibm", "plex", "sans", "bold", "italic"])
(14, 0, false, -27)

jkrumbiegel commented 9 months ago

Seems like the function I wrote there has some bad edge cases. The problem here is that the family name of AlegreyaSans-BoldItalic contains "sans" "bold" and "italic" which is a better match than just matching "ibm plex sans" in the IBM font. The algorithm prefers the font with the best family name match first, but that's probably not a good idea when I look at these names. A fix could be to add the first two numbers together and just make a joint family plus style score. Then IBM would win with 21 vs 14 points.

What I wanted at the time was that you wouldn't have to exactly spell out "AlegreyaSans-BoldItalic" because that's usually annoying to look up for every font, these exact names are not obvious. But this matching on name parts has its own issues as seen here.

tecosaur commented 9 months ago

That's exactly the same thought as I had (combining the score, and using the family as the first tiebreak).

The non-cache speed-up I had in mind is that:

Most of the time, I think we can expect an exact match (i.e. every part of the "search term list" is matched to the font)
When an exact match occurs, it's probably fine to bail out at that point
We can sort the list of font files by how closely their filename matches the search term to try to check likely-exact matches first

Might you have any thoughts on this?

jkrumbiegel commented 9 months ago

When an exact match occurs, it's probably fine to bail out at that point

That's why there's a name length penalty. This is so that some font bold is preferred vs. some font bold italic even though both have an "exact match" for some font bold.

tecosaur commented 9 months ago

I think the idea of pre-sorting + bailing out should be fine as long as we can assume that font files (even just within the same family's set of files) have a semi-consistent naming scheme.

What I'd like to test is then sorting by matching components then length, also not bailing out until the total match score decreases. I think that should behave pretty well in practice, and supply a dramatic speed-up (Makie's caching is helpful, but even the first time you hit a custom font, a multi-second delay seems a bit much).

tecosaur commented 9 months ago

Hmmm, looking at the current scheme I think I see one or two other aspects where the matching process falls short. I'll see how my efforts go, I might have a PR for discussion up shortly.