dfrg / fount

Font and text related experiments
Apache License 2.0

add linux support #1

Open dvc94ch opened 2 years ago

dvc94ch commented 2 years ago

I'll look into this tomorrow.

dfrg commented 2 years ago

This library, for now, makes use of baked collections of well-known fonts to avoid the hefty startup costs associated with scanning the font directories. This is tricky on Linux, where there is no fixed set of preinstalled fonts to bake in. I did (and still do) intend to implement an option to scan for fonts, which would at least make it functional on Linux and also provide support for applications that need an accurate and comprehensive font list.

Thanks for taking a look.

dvc94ch commented 2 years ago

Not sure if caching should be compatible with fontconfig, but since the distro package manager is in charge of running fc-cache at appropriate times and fontconfig is installed on virtually every Linux system, maybe we can let the distro take care of the actual font caching. I'm thinking the steps for an MVP would look like this:

  1. parse /etc/fonts/fonts.conf to determine the fontconfig cache directories
  2. parse the fontconfig cache files
  3. feed it into fount
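A minimal sketch of step 1, under stated assumptions: the function name `cache_dirs` and its parameters are hypothetical, a naive substring scan stands in for a real XML parser, and only the `<cachedir>` forms commonly found in fonts.conf (absolute path, `~/` path, `prefix="xdg"`) are handled. Steps 2 and 3 are not covered.

```rust
use std::path::PathBuf;

/// Collects the directories named by `<cachedir>` elements in the text of a
/// fontconfig configuration file.
///
/// Assumptions (not taken from fount or fontconfig source): XML comments,
/// CDATA and `<include>` directives are ignored, `home` is the user's home
/// directory, and `xdg_cache_home` is the already-resolved XDG cache
/// directory (typically `$XDG_CACHE_HOME`, falling back to `~/.cache`).
fn cache_dirs(conf: &str, home: &str, xdg_cache_home: &str) -> Vec<PathBuf> {
    const CLOSE: &str = "</cachedir>";
    let mut dirs = Vec::new();
    let mut rest = conf;

    while let Some(start) = rest.find("<cachedir") {
        let tag = &rest[start..];
        // End of the opening tag, e.g. `<cachedir prefix="xdg">`.
        let open_end = match tag.find('>') {
            Some(i) => i,
            None => break,
        };
        let close = match tag.find(CLOSE) {
            Some(i) => i,
            None => break,
        };
        if open_end > close {
            break; // malformed input; give up rather than guess
        }

        let attrs = &tag[..open_end];
        let text = tag[open_end + 1..close].trim();

        let dir = if attrs.contains("prefix=\"xdg\"") {
            // Relative to the XDG cache directory.
            PathBuf::from(xdg_cache_home).join(text)
        } else if let Some(tail) = text.strip_prefix("~/") {
            // Relative to the home directory.
            PathBuf::from(home).join(tail)
        } else {
            PathBuf::from(text)
        };
        dirs.push(dir);

        // Continue scanning after this element.
        rest = &tag[close + CLOSE.len()..];
    }
    dirs
}
```

In practice the input would come from something like `std::fs::read_to_string("/etc/fonts/fonts.conf")`, and a real implementation would likely want a proper XML parser and support for included config files.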

Presumably the baked collections approach works well for Android/iOS too, as the preinstalled fonts are quite static?

jneem commented 2 years ago

Would you accept a PR for this fontconfig-cache-parsing approach?

Currently, parley is failing to build on Linux because of the missing impl of Library::default. It seems like the only options are (1) in parley, make callers pass in a font library, or (2) implement default on Linux.

dfrg commented 2 years ago

I would accept a PR for this but I’m also working on a bit of a redesign for this project that would include fontconfig as a source of truth on Linux. I haven’t gotten around to that part yet but I do have backends for the new API working on Windows and macOS using platform font services.

Any work done on the current code would likely be useful for the new design if you get around to it before I do.

jneem commented 2 years ago

Just for checking feasibility, I spent a little while figuring out the fontconfig cache format. It isn't super well documented, but parsing it also isn't too terrible: about 500 lines later and I can produce output like this:

object type Family
string value: "Noto Kufi Arabic"
string value: "Noto Kufi Arabic Black"
object type FamilyLang
string value: "en"
string value: "en"
object type Style
string value: "Black"
string value: "Regular"
object type StyleLang
string value: "en"
string value: "en"
object type FullName
string value: "Noto Kufi Arabic Black"
object type FullNameLang
string value: "en"
object type Slant
val Int(0)
object type Weight
val Float(210.0)
object type Width
val Float(100.0)
object type Foundry
string value: "GOOG"
object type File
string value: "/nix/store/1yijsccm5f72kys7kbpr558wwv9v1y7n-noto-fonts-extra-2020-01-23/share/fonts/truetype/noto/NotoKufiArabic-Black.ttf"
object type Index
val Int(0)
object type Outline
val Bool(1)
object type Scalable
val Bool(1)

Is this a reasonable path forward, do you think? Other possibilities include linking fontconfig or just running its CLI tools and parsing the output.

dfrg commented 2 years ago

That looks really promising! If we can skip the fontconfig dependency but still make use of the cache files, I’d consider that a big win.

The primary object of interest is the charset associated with the font. I’m not sure how difficult that is to parse.

I’m not loving the idea of forking a process from a library, but linking fontconfig would be reasonable, even if not ideal.

jneem commented 1 year ago

The charset isn't too hard; I think I basically have it working. Is that all you need? I didn't easily see how to map from the charset to the script, which seems to be the hardest part when constructing a ScannedCollectionData.

There's also a possibility of getting an FcLangSet, which seems somewhat more difficult because it relies on various data tables that fontconfig generates at compile time...

dfrg commented 1 year ago

I’m essentially working backwards. I have a mapping from script to some small sample text, and this is how I’m querying fonts on macOS and Windows. I’m hoping to do the same with FcCharSet. This probably isn’t super fast on any platform, but my intent is to aggressively cache these results at a higher level.
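To illustrate the "working backwards" idea, here is a rough sketch that checks whether a font's charset covers a short sample string for a script. Everything in it is hypothetical: `CharSet` is a plain set of code points standing in for a parsed FcCharSet, and the script tags, sample strings, font names and coverage ranges are made up for the example.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a parsed FcCharSet: the code points a font covers.
struct CharSet(BTreeSet<u32>);

impl CharSet {
    /// Builds a set from inclusive code point ranges.
    fn from_ranges(ranges: &[(u32, u32)]) -> Self {
        let mut set = BTreeSet::new();
        for &(lo, hi) in ranges {
            set.extend(lo..=hi);
        }
        CharSet(set)
    }

    /// True if every non-whitespace character of `sample` is in the set.
    fn covers(&self, sample: &str) -> bool {
        sample
            .chars()
            .filter(|c| !c.is_whitespace())
            .all(|c| self.0.contains(&(c as u32)))
    }
}

/// Hypothetical script-to-sample-text table (illustrative entries only).
fn sample_for(script: &str) -> Option<&'static str> {
    match script {
        "Latn" => Some("abc"),
        "Grek" => Some("αβγ"),
        "Arab" => Some("ابج"),
        _ => None,
    }
}

/// Returns the names of all fonts whose charset covers the script's sample,
/// in the order the fonts were given.
fn fonts_for_script<'a>(script: &str, fonts: &'a [(&'a str, CharSet)]) -> Vec<&'a str> {
    let sample = match sample_for(script) {
        Some(s) => s,
        None => return Vec::new(),
    };
    fonts
        .iter()
        .filter(|(_, charset)| charset.covers(sample))
        .map(|(name, _)| *name)
        .collect()
}
```

With one font covering only ASCII and another covering ASCII plus the Arabic block, a query for "Arab" returns only the second, while a query for "Latn" returns both. Scanning every font per query is linear in the number of fonts, which is why caching the results at a higher level matters.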

jneem commented 1 year ago

Ah, that will certainly make parsing the fontconfig cache easier. I'm a bit confused about how to populate the library though: it seems like I need to build a FallbackData, which requires calling FontRef::writing_systems, which opens the font data and reads some tables. So unless I can extract this script information from the cache instead, doesn't it require opening all the font files?

Or did you mean that the new, not-yet-pushed path on macOS/Windows uses charsets instead?

dfrg commented 1 year ago

Right, the new design builds the fallback cache lazily. I’ll push some code here soon so you can take a look at what I’m working on.
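As a rough illustration of building a fallback cache lazily (not the actual design, which had not been pushed at this point), the sketch below runs a per-script font query only the first time a script is requested and reuses the stored result afterwards. `FallbackCache`, `fonts_for` and `queries_run` are hypothetical names, and the query closure stands in for whatever platform or charset lookup is used.

```rust
use std::collections::HashMap;

/// Hypothetical lazily-populated fallback cache: script tag -> font names.
struct FallbackCache<F: Fn(&str) -> Vec<String>> {
    /// The (possibly slow) lookup that finds fonts for a script.
    query: F,
    /// Results of lookups that have already been performed.
    entries: HashMap<String, Vec<String>>,
    /// How many times `query` has actually been invoked.
    queries_run: usize,
}

impl<F: Fn(&str) -> Vec<String>> FallbackCache<F> {
    fn new(query: F) -> Self {
        FallbackCache {
            query,
            entries: HashMap::new(),
            queries_run: 0,
        }
    }

    /// Returns the fonts for `script`, running the query only on first use.
    fn fonts_for(&mut self, script: &str) -> &[String] {
        if !self.entries.contains_key(script) {
            self.queries_run += 1;
            let fonts = (self.query)(script);
            self.entries.insert(script.to_string(), fonts);
        }
        &self.entries[script]
    }
}
```

Nothing is computed at startup; scripts that never appear in the text never trigger a query. Empty results are cached too, so a script with no matching font is not looked up again.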

jneem commented 1 year ago

Nice! Then I'll focus on cleaning up and documenting the existing parser, and I'll circle back to the fount integration when that's done.

dfrg commented 1 year ago

I finally had an opportunity to play with your parser and I’m very encouraged by the results! If I have some time tonight, I’ll hook it up to my current code and see what font list it generates for the whole script set.

I’m thinking we’ll need a way to determine search priority, particularly for Latin/Greek/Cyrillic since every font seems to include those glyphs. I’ll look into what fontconfig does for this.

jneem commented 1 year ago

Glad to hear it! I've shifted into docs/cleanup mode for now, but if you run into any missing features let me know.