gravitystorm / openstreetmap-carto

A general-purpose OpenStreetMap mapnik style, in CartoCSS

Use notofonts.github.io for newer font releases #4893

Open mapmeld opened 8 months ago

mapmeld commented 8 months ago

Following up on font sourcing #4887 and the font issue which I reported #4152 and re-opened recently

Most of the fonts come from repo https://github.com/notofonts/noto-fonts which was archived in January 2023, and the font files were last updated in June 2022

The links on Google Noto's landing page, https://notofonts.github.io, go to https://cdn.jsdelivr.net; unfortunately these links fail for larger fonts (Bengali was the first failure alphabetically)

Where we can find fonts: each script has a subfolder (https://notofonts.github.io/bengali/) with working download URLs; these follow the format https://notofonts.github.io/${subfolder}/fonts/${font}/hinted/ttf/${regular}

The URLs are inconsistent for fonts which include "UI" (ArabicUI is /arabicui/ ; BengaliUI is /bengali/) and some have their own logic (TaiViet under Tai-Viet, Symbols and Symbols2 going to Symbols). I've come up with a script which keeps the font name and subfolder together, downloads all fonts, and is still somewhat readable.
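A minimal sketch of how a lookup table could keep each font name next to its subfolder, following the URL pattern quoted above. The helper name `font_url` and the table layout are illustrative assumptions, not the contents of the actual script. The Bengali entries match a URL quoted later in this thread; the Arabic UI font directory is a guess.

```python
BASE = "https://notofonts.github.io"

# family -> (site subfolder, font directory under /fonts/).
# Only the Bengali entries are confirmed by URLs in this thread;
# the Arabic UI font directory is an unverified assumption.
FONT_LOCATIONS = {
    "NotoSansBengali": ("bengali", "NotoSansBengali"),
    "NotoSansBengaliUI": ("bengali", "NotoSansBengali"),
    "NotoSansArabicUI": ("arabicui", "NotoSansArabicUI"),
}


def font_url(family: str, style: str = "Regular") -> str:
    """Build the hinted TTF download URL for one font family and style."""
    subfolder, font_dir = FONT_LOCATIONS[family]
    return f"{BASE}/{subfolder}/fonts/{font_dir}/hinted/ttf/{family}-{style}.ttf"


if __name__ == "__main__":
    print(font_url("NotoSansBengaliUI"))
```

Keeping the irregular cases as explicit table rows means a renamed subfolder is a one-line change rather than a change to string-munging logic.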

OriyaUI is deprecated. If there are opinions about it then we can download it from the original source https://github.com/notofonts/oriya/blob/main/sources/config-sans-oriya-ui.disabled

imagico commented 8 months ago

If you really want to suggest using notofonts.github.io (a site that is more or less the exact opposite of confidence inspiring IMO, in particular in terms of providing a long term stable interface to download fonts) i'd suggest not to hardcode arbitrarily chosen directory names there in our scripts but to source them from where they are selected - which seems to be https://github.com/notofonts/notofonts.github.io/blob/main/state.json.

Beyond that what i said in https://github.com/gravitystorm/openstreetmap-carto/issues/4864#issuecomment-1769293115 still applies here.

mapmeld commented 8 months ago

Here's a Python script to work off of that JSON file: https://gist.github.com/mapmeld/bc0166952d4a2bace6df2a127db34241 This could work but 5 languages use a UI font which isn't in the JSON

edit: workaround for these now included in script

imagico commented 8 months ago

> This could work but 5 languages use a UI font which isn't in the JSON

If that is the case i apologize - i was under the impression that state.json lists all the fonts on the website.

Anyway - i don't think hardcoding deliberately arbitrary naming decisions on some obscure website that can change at any moment at the whim of whoever is in charge of that is a very good idea.

If we really want to use that website as a source for fonts the most reliable and maintenance friendly way might be to have a script that crawls the site for the URLs of all the fonts we need, run it as needed, and distribute the resulting list of URLs to style users for download.
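A rough sketch of the link-extraction step such a crawler would need, using only the Python standard library. It assumes the font pages expose the files as plain `<a href="....ttf">` links in static HTML, which has not been verified here; the class and function names are hypothetical. Fetching is left out so the example runs offline on a sample string.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FontLinkParser(HTMLParser):
    """Collect absolute URLs of .ttf files linked from <a href=...> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".ttf"):
            # Resolve relative links against the page they were found on.
            self.urls.append(urljoin(self.base_url, href))


def find_font_urls(html, base_url, wanted):
    """Map each wanted file name to the URL it was found at, if any."""
    parser = FontLinkParser(base_url)
    parser.feed(html)
    by_name = {url.rsplit("/", 1)[-1]: url for url in parser.urls}
    return {name: by_name[name] for name in wanted if name in by_name}


if __name__ == "__main__":
    sample = (
        '<a href="fonts/NotoSansBengali/hinted/ttf/NotoSansBengali-Regular.ttf">Regular</a>'
        '<a href="about.html">About</a>'
    )
    print(find_font_urls(sample, "https://notofonts.github.io/bengali/",
                         ["NotoSansBengali-Regular.ttf"]))
```

The resulting name-to-URL mapping is the list that would be distributed to style users; fonts missing from the result show up as absent keys rather than as broken downloads.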

mapmeld commented 8 months ago

Noto split their mono-repo into individual ones. I'm open to bash or Python, github.io or an equivalent github.com URL to be closer to the current one ( ex: https://github.com/notofonts/arabic/raw/gh-pages/fonts/... for Arabic ). Or maybe a system where we download from a specific commit hash of these repos. But I think anything involving cloning the repos, searching via GitHub API, building the fonts ourselves, etc. is adding excessive layers of complexity.

imagico commented 8 months ago

> But I think anything involving cloning the repos, searching via GitHub API, building the fonts ourselves, etc. is adding excessive layers of complexity.

I agree on that. My main concern here is reliability and robustness. The archived repository we currently use is fairly good in that regard, but of course this is not sustainable if we want to use later changes to the fonts. My own approach would - as indicated - probably be to crawl the website meant for downloading the fonts for the URLs of the fonts we need. This would provide some protection against someone in control there feeling the need to scratch an itch and move the cheese around in some way, so to speak. For additional redundancy we could think about keeping the current source as fallback.

But i am not the only one deciding this - so let's see what views others have on the matter.

pnorman commented 8 months ago

It looks good to me, but I haven't tested it. Given it's getting more complex it could use rewriting in Python, but since I'm not willing to do that, I'm not going to require that as a fix to something that is currently broken.

I'm not sure if parsing the JSON file is a good idea. I would want some documentation showing that they intended this file for external consumption and that it isn't an artifact of their current build process or website design. Without that assurance, I would consider it just as likely for them to move the JSON file as to redo their paths, making it additional complexity for no gain.

imagico commented 8 months ago

To be clear: My pointer to the JSON file was based on the mistaken assumption that this contained a complete list of font files linked to on the website - or at least all those we need. Since that is not the case it is of little value for us even as is.

mapmeld commented 5 months ago

My update to the PR is to replace get-fonts.sh with a Python script (get-fonts.py) which downloads from the current Noto builds (for example, Bengali from https://notofonts.github.io/bengali/fonts/NotoSansBengali/hinted/ttf/NotoSansBengaliUI-Regular.ttf)

There is currently an error in the Noto Armenian repo: the NotoSansArmenian fonts failed to build. I have filed an issue there. For testing you can use Serif.

hummeltech commented 5 months ago

It looks like you might be able to pull those troublesome fonts directly from GitHub.com instead of GitHub.io. I.E.: https://github.com/notofonts/notofonts.github.io/tree/main/fonts/NotoSansArmenian/hinted/ttf

mapmeld commented 5 months ago

@hummeltech That's a good idea, I've pushed a change for https://raw.githubusercontent.com/notofonts/notofonts.github.io as a fallback source for anything missing from GitHub Pages
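A minimal sketch of the fallback idea, assuming each font has an ordered list of candidate URLs and the first one that downloads wins. The function names are hypothetical and this is not the code pushed to the PR; the fetcher is injectable so the logic can be exercised without network access.

```python
import urllib.request


def fetch(url, timeout=30.0):
    """Download one URL and return its body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def download_with_fallback(urls, fetcher=fetch):
    """Try each URL in order; return (url, data) for the first success."""
    errors = []
    for url in urls:
        try:
            return url, fetcher(url)
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            errors.append(f"{url}: {exc}")
    raise RuntimeError("all sources failed:\n" + "\n".join(errors))


if __name__ == "__main__":
    # Hypothetical placeholder hosts, used only to demonstrate the ordering.
    def fake_fetcher(url):
        if "primary" in url:
            raise OSError("404 Not Found")
        return b"font-bytes"

    print(download_with_fallback(
        ["https://primary.invalid/font.ttf", "https://backup.invalid/font.ttf"],
        fake_fetcher,
    ))
```

Collecting the per-source errors before raising makes it clear whether a font is missing everywhere or only from the preferred host.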

mapmeld commented 1 month ago

Can I do something to get the font changes out of limbo? I could minimize get-fonts.py to the non-CJK non-emoji fonts, or a subset.

Update to original PR message: the script uses Google Noto's preferred cdn.jsdelivr.net, which is now working; notofonts.github.io serves as a backup. Because Google stopped listing "UI" fonts for Arabic and South Asian languages, the script prints a warning and then sources those fonts from notofonts.github.io. The discontinued Arabic UI and Oriya UI fonts are replaced by their standard, up-to-date fonts.
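A small sketch of how the replacement of discontinued fonts could be expressed, assuming a simple mapping from the old family name to its standard counterpart. The family names and the helper `resolve_family` are illustrative assumptions rather than the names used in get-fonts.py.

```python
import warnings

# Discontinued family -> standard replacement (family names assumed).
REPLACEMENTS = {
    "NotoSansArabicUI": "NotoSansArabic",
    "NotoSansOriyaUI": "NotoSansOriya",
}


def resolve_family(family):
    """Return the family to download, warning when a replacement is used."""
    replacement = REPLACEMENTS.get(family)
    if replacement is None:
        return family
    warnings.warn(f"{family} is discontinued; using {replacement} instead")
    return replacement


if __name__ == "__main__":
    print(resolve_family("NotoSansOriyaUI"))
```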