mikke89 / RmlUi

RmlUi - The HTML/CSS User Interface library evolved
https://mikke89.github.io/RmlUiDoc/
MIT License

Composite font families #500

Open ShawnCZek opened 1 year ago

ShawnCZek commented 1 year ago

This issue was initially supposed to address the font fallback algorithm. Instead, it has developed into (possibly) implementing composite font families. Read the discussion below to get a hold of the entire problem.


Introduction

I am creating a user interface that will be used by international communities. And to provide them with the best support, the system must be able to work with multiple fonts of the same family to cover all character sets (as no advanced font face covers them all in one file).

Consideration

The current solution is to load the regular font face as the main font while loading all the other variants as a fallback. While this is working well, fallback fonts ignore the weight and the style of the used font:

A fallback font covering different character sets but ignoring the original font style and weight

In CSS, supporting multiple character sets within one font family can be done in two ways:

I believe both solutions are complex in one way or another. However, since font-family is supposed to support multiple font families, anyway, at least according to the CSS specification, I believe it makes sense to (firstly) go with this solution for RCSS.

Implementation

There are a few things to consider for this implementation:

With all things considered, this might also be a breaking change for the API.


I am willing to take care of this and implement it. But I would like to first discuss the raised points and problems above.

I have also gone through old GitHub issues and the Gitter chat but could not find any mention of this.

mikke89 commented 1 year ago

Hey, and thanks for the well-researched issue.

I'm certainly willing to consider support for multiple font-families. However, before we consider that, what do you think about dealing with the issue you describe more directly, that is, matching the fallback font to the current font's weight and style first? I don't see any reason we couldn't be smarter about this selection, and I believe it would be a lot simpler and less invasive in terms of API changes.
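A minimal sketch of what "matching the fallback font to the current font's weight and style" could look like. The types and the `PickFallback` function are hypothetical and are not part of RmlUi; the scoring rule (style first, then nearest weight) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical types for illustration; these are not RmlUi's actual classes.
enum class Style { Normal, Italic };

struct FallbackFace {
    std::string file;
    Style style;
    int weight; // CSS-like numeric weight, e.g. 400 = regular, 700 = bold
};

// Returns the index of the fallback face closest to the requested style and
// weight, or -1 if the list is empty. A style mismatch is penalized more
// heavily than any possible weight difference, so style is matched first.
int PickFallback(const std::vector<FallbackFace>& faces, Style style, int weight)
{
    assert(weight >= 1 && weight <= 1000); // valid CSS weight range

    int best = -1;
    int best_score = 0;
    for (int i = 0; i < static_cast<int>(faces.size()); ++i) {
        const int style_penalty = (faces[i].style == style) ? 0 : 10000;
        const int score = style_penalty + std::abs(faces[i].weight - weight);
        if (best == -1 || score < best_score) {
            best = i;
            best_score = score;
        }
    }
    return best;
}
```

With regular, bold, and italic fallback faces loaded, a request for bold text would then resolve to the bold fallback instead of always landing on the regular one.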

ShawnCZek commented 1 year ago

Hey @mikke89,

That is actually what I considered first! When I discovered the issue with fallback fonts ignoring the weight and style, I wanted to quickly add support for those so it covers the described use case. However, there are a few issues with this, I believe:

We could possibly work around the second issue with some magic (caching?). But I believe the first problem is blocking this.

I can understand that breaking changes to the API are always problematic, especially in a case like this. We could, therefore, try to aim for the first solution and create something like unicode-range without the at-rule (for now). So, for example, the LoadFontFace function would have a parameter for defining such a value. Or it could be determined via FreeType by looping through all characters. On the other hand, requiring the user to specify the range is a bit rough, and looping through all characters does not sound ideal, either. Mainly because I do not specialize in typography and am not sure what and how many ranges of such supported characters one font may generate.
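To give a rough idea of how many ranges a font might produce: FreeType can enumerate a face's character codes in increasing order (`FT_Get_First_Char` / `FT_Get_Next_Char`), and that list can be collapsed into inclusive ranges. The sketch below covers only the collapsing step and takes the code points as a plain vector so that it stays self-contained; `CodepointRange` and `CoalesceRanges` are illustrative names, not existing API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using CodepointRange = std::pair<std::uint32_t, std::uint32_t>; // inclusive [first, last]

// Collapses a sorted, duplicate-free list of code points into inclusive ranges.
std::vector<CodepointRange> CoalesceRanges(const std::vector<std::uint32_t>& sorted_codepoints)
{
    assert(std::is_sorted(sorted_codepoints.begin(), sorted_codepoints.end()));

    std::vector<CodepointRange> ranges;
    for (std::uint32_t cp : sorted_codepoints) {
        if (!ranges.empty() && cp == ranges.back().second + 1)
            ranges.back().second = cp;  // extends the current run
        else
            ranges.push_back({cp, cp}); // starts a new run
    }
    return ranges;
}
```

The number of resulting ranges depends entirely on how fragmented the font's coverage is, so this would have to be measured on real fonts before deciding between automatic detection and a user-specified range.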

With all this in mind, I still feel like expanding font-family is the best. Fallback fonts should probably serve only as an "automatic" sans-serif replacement and not a solution for having support for all character sets.

However, if you have a different idea in mind, possibly solving the two mentioned problems above, I will happily take a look at expanding fallback fonts. After all, as you mentioned, it would be the least painful solution.

mikke89 commented 1 year ago

I understand, you are making a good case for expanding font-family. I think that sounds quite reasonable, and I'd be happy to see some prototyping for this feature.

As for API considerations: actually, I wonder if we could get away without making any API changes at all. Perhaps we could simply submit the comma-separated list of font families to FontEngineInterface::GetFontFaceHandle. From the library's point of view, I think it makes sense to still consider a list of font-families as a single "font handle". Or is there any reason the library needs to act on the list itself? For parsing's sake, it might make sense to submit a StringList, feel free to make that change, but a single font handle would still be preferable.
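As a rough illustration of the parsing step, a comma-separated font-family value could be split into a list of names before being handed to the font engine. `SplitFontFamilies` is a hypothetical helper written with standard library types (standing in for a StringList); it is simplified and does not handle commas inside quoted names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Removes leading and trailing whitespace.
static std::string Trim(const std::string& s)
{
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos)
        return "";
    const std::size_t last = s.find_last_not_of(ws);
    assert(last != std::string::npos && last >= first);
    return s.substr(first, last - first + 1);
}

// Splits a value such as `"Noto Sans", 'Noto Sans JP', monospace` into
// trimmed names with one layer of matching quotes removed. Empty entries
// are skipped.
std::vector<std::string> SplitFontFamilies(const std::string& value)
{
    std::vector<std::string> families;
    std::size_t start = 0;
    while (true) {
        const std::size_t comma = value.find(',', start);
        const std::size_t length = (comma == std::string::npos) ? std::string::npos : comma - start;
        std::string name = Trim(value.substr(start, length));
        if (name.size() >= 2 && (name.front() == '"' || name.front() == '\'') && name.back() == name.front())
            name = name.substr(1, name.size() - 2);
        if (!name.empty())
            families.push_back(name);
        if (comma == std::string::npos)
            break;
        start = comma + 1;
    }
    return families;
}
```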

To me it sounds like unicode-range would be more effective as an enhancement to multiple font families, rather than a replacement. I also agree that it makes sense to tie it directly to a given font. It could be an optional argument as part of LoadFontFace. If you find this useful in your case, feel free to make that addition.

Just a small detour: In CSS, I mainly think of multiple font-families as fallbacks for when the font families are not available at all. This case doesn't really apply to us. A missing font face is considered an error, since the user should know exactly which ones are available. However, here, multiple font-families are more about fallbacks for missing glyphs - which I see CSS also uses this property for. I am mainly writing this to reflect on some differences between the requirements for our library versus CSS, and to ensure that we still consider missing font faces as an error.


I started writing the above first, while the following thoughts started developing. Maybe this is a direction we want to consider first.

In some sense, I find that usually we want to keep the same set of font families. E.g. mono fonts would always be matched with Noto Sans Mono as fallback. So in a sense, I expect we would always repeat the same set of font families, being defined by the first font in the list. This does lend itself to the idea of "custom fonts" (I guess analogous to @font-face in CSS) instead of multiple font families.

I don't see a good reason for doing this in RCSS; this might as well (and more simply) be immutably constructed with the font loading scheme. E.g.

struct FontMatch {
  String name;
  Vector<UnicodeRange> character_ranges;
  Pair<FontWeight, FontWeight> font_weight_range;
};

Rml::CreateCustomFontFace(String name, Vector<FontMatch> font_list_in_priority_order);

It's a bit different to CSS, since it here defines a list from a single font name. Maybe we want to explore this idea further?

ShawnCZek commented 1 year ago

In a way, one handle could certainly represent multiple font families. And it is indeed true that there would not be any breaking changes to the API because of it. Though, to me, FontFaceHandle sounds like a singular font face, and providing an API that accepts a list, which can be represented as an array span, sounds more straightforward. Anyway, I definitely agree with your point here, and I also favor only internal changes over breaking the API.

About font-family and its background, I think it makes sense to have it in RmlUi as well. It is only a matter of time until there is an interface to load system fonts, as one can do through the src property and the local() value at the @font-face rule. Furthermore, it works well for fallback fonts, especially if unicode-range becomes a thing.

Anyway, after further reflection and your detailed comment, including an example of such an interface, I believe character ranges and "virtual font faces" make more sense. My mind was simply too stuck at the @font-face rule, which is unnecessary for now. Although, if we go forward with these changes, such an at-rule can be easily introduced in the future.

However, there are still a few things to consider:

mikke89 commented 1 year ago

Alright, sounds good. First, I just want to emphasize that my example was mainly just to get the idea across, and should be considered very rough. The naming and exact types and parameters especially need substantial refinement.

Then, in a sense, I am essentially trying to somewhat emulate the @font-face rule, with the FontMatch type (surely needs a better name). You will find the font-weight range and unicode range in the CSS rule as well. These were just some initial examples; we could add only those types that make sense to us now, and possibly expand on it later. I suggest we use @font-face for inspiration, and how fonts and glyphs are selected based on that (not strictly so, but where it makes sense to us). Later on, if we find a need for it, we could possibly parse this in @font-face rules.

Since the function takes a list here, the custom font face is in a sense equivalent to a list of fonts (or @font-face) in the font-family property in CSS.

My take on your specific questions:

  1. An error and early return sounds very reasonable to me.
  2. I agree, an empty Unicode range list should essentially mean "no constraint", i.e. all characters in the font are available. In fact, all constraints should be optional, in the same way as in @font-face.
  3. I'll refer to the CSS @font-face specs here. But please only include those that you see a need for right away. We can include more later too as we see fit.
  4. Hm, indeed, these are some design questions that I think one just needs to start prototyping out to get a sense of. I think this, and the font/glyph matching algorithms are the ones that might be some work. Possibly also need some refactoring. I hope we can avoid any substantial performance regressions for current usage.

By the way, I think it would also be reasonable to take a single FontMatch type instead of a list, and rather enable multiple fonts in font-family. That way we stick closer to the CSS meaning of @font-face and might make it easier if we want to parse that later. Maybe that familiarity is more important to us. I'll leave that for you to decide, perhaps some implementation details make the decision more clear. I am happy either way. Also, feel free to rename FontFaceHandle if you figure out a more appropriate name for it.

ShawnCZek commented 1 year ago

I like to compare things to CSS because that way, users of this library (including me) do not have to rely on custom syntax or a bit different behavior. Of course, as you have mentioned, such comparisons should be made only where it makes sense.

With that being said, I consider the concept you provided perfectly valid, including the list of FontMatch. Sure, the structure name is not perfect. But the @font-face at-rule also allows defining multiple entries with the same name:

@font-face {
    font-family: "Custom Font";
    src: url("NotoSans-Regular.ttf");
    unicode-range: U+54;
}

@font-face {
    font-family: "Custom Font";
    src: local("Comic Sans MS");
    unicode-range: U+73;
}

body {
    font-family: "Custom Font", sans-serif;
}

If you display "Test" in HTML, T is rendered as Noto Sans and s as Comic Sans MS.

With the concept you have provided, this would be translated into the following:

// To keep things simple, I assume that each character range is a (first, last) pair.

Vector< FontMatch > matches( 2 );

matches[ 0 ].name = "Noto Sans";
matches[ 0 ].character_ranges = { { 0x54, 0x54 } }; // one range covering only 'T'

matches[ 1 ].name = "Comic Sans MS";
matches[ 1 ].character_ranges = { { 0x73, 0x73 } }; // one range covering only 's'

Rml::CreateCustomFontFace( "Custom Font", matches );

Therefore, I believe that both the custom font face and multiple values in the font family have their place. The latter is likely easier to use as the user does not have to write any C++ code. On the other hand, if we provide such "virtual"/custom font faces, it opens the door to a future implementation of the at-rule, which would also broaden our coverage of the CSS standard.

In conclusion, I will try to implement creating a custom font face first, based on the interface you have provided, and will see how things go. I first have to get more familiar with the code base, though. But then I will get back to discussing the code design.

ShawnCZek commented 11 months ago

Sorry for the delay. I have not forgotten about this. However, an IME integration is a higher priority for me right now.

I will get back to this issue right after it.

mikke89 commented 11 months ago

I understand, thanks for keeping us updated.

CreatorInDeep commented 1 month ago

Hi,

I have prepared the feature for version 5.1, and here is the link to the PR

I am currently preparing the changes to be merged into the master branch. Once I am done, I will provide another link to the PR for the merge into master.

mikke89 commented 1 month ago

Wow, that's great. I'm very happy to hear that this is being worked on :)

I'll take a closer look at it once the new PR is open.

ShawnCZek commented 1 month ago

I actually got back to this a while ago (after finishing the IME pull request); however, I have sadly come to a dead end.

The pull request from @CreatorInDeep is a great effort. Unfortunately, the performance has significantly worsened. This is one of the problems with finding a (perfect) solution; the entire issue is complex.

First, I believe there must be many more changes to the library than just altering the glyph lookup or handling fallbacks. These operations are run on every single character; therefore, increasing their runtime is not acceptable. Instead, we should develop some kind of caching for a string, generating ranges of the given text with a selected font face. This proves challenging with a fully interfaced font engine, and it would require many breaking changes, mainly because this is not just about kerning but also rendering, font decorations, and other features.
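A minimal sketch of the "ranges of the given text with a selected font face" idea: the per-character lookup runs once per string, and the resulting runs are what later stages (layout, kerning, rendering) would consume. `FontRun`, `BuildRuns`, and the integer face identifier are hypothetical and do not correspond to existing RmlUi types.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// A run of consecutive characters that resolve to the same font face.
struct FontRun {
    std::size_t begin; // index of the first character in the run
    std::size_t end;   // one past the last character in the run
    int face;          // opaque face identifier (hypothetical)
};

// Splits `text` into runs. `resolve_face` is the per-character lookup that
// maps a code point to a face identifier.
std::vector<FontRun> BuildRuns(const std::u32string& text, const std::function<int(char32_t)>& resolve_face)
{
    std::vector<FontRun> runs;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const int face = resolve_face(text[i]);
        if (!runs.empty() && runs.back().face == face)
            runs.back().end = i + 1;          // same face, extend the run
        else
            runs.push_back({i, i + 1, face}); // face changed, start a new run
    }
    // The runs cover the whole string without gaps.
    assert(runs.empty() || runs.back().end == text.size());
    return runs;
}
```

The runs could then be cached alongside the string, so the lookup cost is paid only when the text or the font changes rather than on every frame.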

A point connected to this is the FontFaceHandle type. You suggested renaming it and probably handling it differently on the backend. Unfortunately, this is not a solution, either. In certain situations, we still want to hold on to only one font face, while in other situations, we want to work with a (composite) font family.

Implementation details are complex, too. I am unsure how to handle "sharing" font faces between font families, especially if we want to support freeing font resources. Shared pointers would be the obvious solution, but they are controversial and can cause "memory leaks" when dead resources (e.g., unused font families) end up pointing at each other.
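One possible arrangement, sketched under the assumption that ownership only flows one way (families own faces, faces never reference families): a cache holds weak references, so a face is shared while any family uses it and is freed once the last family goes away. `FontFace`, `FontFamily`, and `FaceCache` are stand-ins for illustration, not RmlUi classes, and the sketch does not cover thread safety or actual font loading.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

struct FontFace {
    std::string file; // stand-in for a loaded face
};

// Families own their faces; faces never point back at families, so there is
// no ownership cycle that could keep dead resources alive.
struct FontFamily {
    std::string name;
    std::vector<std::shared_ptr<FontFace>> faces;
};

class FaceCache {
public:
    // Returns the already loaded face for `file` if some family still uses
    // it; otherwise creates ("loads") a new one.
    std::shared_ptr<FontFace> Acquire(const std::string& file)
    {
        std::weak_ptr<FontFace>& slot = faces_[file];
        std::shared_ptr<FontFace> face = slot.lock();
        if (!face) {
            face = std::make_shared<FontFace>(FontFace{file});
            slot = face;
        }
        assert(face && face->file == file);
        return face;
    }

    // Number of cached faces that are still alive.
    std::size_t LiveCount() const
    {
        std::size_t count = 0;
        for (const auto& entry : faces_)
            count += entry.second.expired() ? 0 : 1;
        return count;
    }

private:
    // Weak references only: the cache itself never keeps a face alive.
    std::map<std::string, std::weak_ptr<FontFace>> faces_;
};
```

Two families that acquire the same file end up with the same `FontFace` object, and destroying both families releases it.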


I am unsure what the solution could be. The virtual font engine interface brings a lot of mental gymnastics, as well as (possible) performance issues and the need for backward compatibility. We could consider "strictly" implementing specific parts of the font engine, basically de-virtualizing them, and potentially implementing HarfBuzz, which seems to bring a lot of improvements, at least according to the extensive sample. However, this is an overwhelming task.

I like to be inspired by Unreal Engine, which is like a book with solutions and all the resolved caveats the developers have experienced throughout the years. Their text shaper generates sequences of the text based on the font used for each of those sequences. Their composite font, however, is rather complicated. Nonetheless, it helped me understand all the problems.

When it comes to storing font families and their font faces, I took a look at gecko. They seem to share the available font faces across font families that might use them, basically preloading all possible font faces for the font family.


I might return to this issue in the future. This comment also serves as a note for me with all the resources I have looked at. I have pushed a branch with the simple prototype of constructing a custom font family, which works more-or-less fine. The problem, as stated above, is putting this composite font family to work.