keymanapp / keyman

Keyman cross platform input methods system running on Android, iOS, Linux, macOS, Windows and mobile and desktop web
https://keyman.com/

feat(developer): kmc-package support remote fonts and files #12667

Open mcdurdin opened 1 week ago

mcdurdin commented 1 week ago

The concept here is that the 'Name' property for a file can now be a remote reference, rather than a local file. There are two supported formats in this commit:

Future sources could be considered, e.g. noto. We don't want to allow arbitrary URLs, both for stability and for security reasons.
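A minimal sketch of how a compiler-side check might tell a local file from a supported remote reference while refusing arbitrary URLs. The `github:` and `flo:` prefixes are taken from the discussion later in this thread; `classifyFileName` and the `FileSource` type are hypothetical names, not the kmc-package implementation.

```typescript
// Hypothetical classifier for the 'Name' property of a package file entry.
type FileSource = 'local' | 'github' | 'flo';

function classifyFileName(fileName: string): FileSource {
  if (/^github:/.test(fileName)) return 'github';
  if (/^flo:/.test(fileName)) return 'flo';
  // Anything else that looks like scheme://... is an arbitrary URL: refuse it.
  if (/^[a-z][a-z0-9+.-]*:\/\//i.test(fileName)) {
    throw new Error(`Unsupported remote reference: ${fileName}`);
  }
  // Everything else is treated as a local path, as before.
  return 'local';
}
```

Rejecting unknown schemes outright, rather than falling back to a download, keeps the allowlist explicit.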

This change is entirely compiler-side, so we don't need to make any changes to apps, and packages will remain backward compatible. A lot of work will need to be done with the Package Editor in TIKE to support this feature.

Replaces: #12336 Fixes: #11236

@keymanapp-test-bot skip

keymanapp-test-bot[bot] commented 1 week ago

User Test Results

Test specification and instructions

User tests are not required

Test Artifacts

mcdurdin commented 1 week ago

Capturing discussion on font versioning. We need to think this through with FLO, caching of fonts, versioning updates.

Marc Durdin 30 minutes ago In fonts.languagetechnology.org, where is the version data sourced from in the .ttf? I see the vast majority of fonts have the #.### version format, but three fonts have semantic version patterns: dukor (1.0.7), suranna (1.0.5) and wakor (4.0.7).

Marc Durdin 29 minutes ago We want to see if we need to re-download a font based on the version metadata provided, and also verify font version

Victor Gaultney 20 minutes ago Here’s a summary: https://silnrsi.github.io/FDBP/en-US/Versioning.html

Marc Durdin 20 minutes ago So looks like you are using the name table. Thanks!

Victor Gaultney 19 minutes ago Version info is in both head and name: decimal and string. The three fonts you mention are not ours and are out of step with the common #.### industry pattern.

Marc Durdin 17 minutes ago Yeah. I just downloaded wakor, and its name record says Version 4.007, so something seems inconsistent there.

Victor Gaultney 15 minutes ago You can try to map the semantic-style to the industry standard using the M.mpp format. So 4.0.7 is really 4.007. By definition the head must be #.#
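The mapping described here can be sketched as follows, reading M.mpp as one major number, a single minor digit, and a two-digit patch. `semverToFontVersion` is a hypothetical helper, and returning null for values that do not fit the pattern is an assumption.

```typescript
// Map a semantic-style version (4.0.7) onto the M.mpp pattern (4.007).
function semverToFontVersion(version: string): string | null {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!m) return null;
  const major = `${m[1]}`;
  const minor = `${m[2]}`;
  const patch = `${m[3]}`;
  // M.mpp leaves room for one minor digit and two patch digits only.
  if (minor.length > 1 || patch.length > 2) return null;
  const paddedPatch = patch.length === 1 ? '0' + patch : patch;
  return major + '.' + minor + paddedPatch;
}
```

With this reading, 4.0.7 becomes 4.007 and 1.0.5 becomes 1.005; a version such as 1.12.3 has no M.mpp equivalent and is reported as null.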

Marc Durdin 13 minutes ago But that doesn't explain where 4.0.7 is coming from in FLO data -- given the .ttf itself has 4.007 (both tables), shouldn't FLO be using that? Suranna-Regular.ttf has Version 1.0.5; .... and head.fontRevision 0x10000 which really doesn't help.

Victor Gaultney 12 minutes ago That comes from the documentation from the source, not the ttf

Marc Durdin 12 minutes ago I see. So we cannot reliably automatically check then. That's a shame. (We are actually not going to support the fonts in question anyway as none of them have a direct URL download, so it's kinda moot, but I just want to make sure we don't paint ourselves into a corner in our use of FLO.) Wakor has fontRevision of 0x401CB which also doesn't really help! I am keen to avoid downloading hundreds of fonts in each build of the keyboards repository, so I want to cache, but if the version metadata is not reliable, that makes it tricky.
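For reference, head.fontRevision is stored as a 16.16 fixed-point number in the OpenType head table, so the raw values quoted above can be decoded as sketched below. `decodeFontRevision` is a hypothetical helper, and rounding to three decimals is an assumption chosen to match the #.### pattern.

```typescript
// Decode a 16.16 fixed-point head.fontRevision value into a #.### string.
function decodeFontRevision(raw: number): string {
  const integerPart = raw >>> 16;             // upper 16 bits
  const fraction = (raw & 0xffff) / 0x10000;  // lower 16 bits as a fraction of 65536
  return (integerPart + fraction).toFixed(3);
}
```

Read this way, 0x401CB rounds to 4.007 and 0x10000 to 1.000. The second value disagrees with the Suranna name string (1.0.5), which fits the caution elsewhere in this thread that the field is not set consistently.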

Victor Gaultney 9 minutes ago head.fontRevision is not carefully handled by font providers and can be downright odd

Marc Durdin 8 minutes ago So name.Version is a safer bet, /^Version (\d+\.\d+)/i?

Victor Gaultney 8 minutes ago That too is not handled consistently - it’s just a string whose format cannot be guaranteed. There is no ‘safe’ bet here

Marc Durdin 5 minutes ago I see. I might have to maintain a map of ttf:name.Version:FLO-version -- that would allow me to invalidate stale fonts. The other half is knowing when a font is updated in FLO data. For that, we are proposing to have a reference such as flo:abyssinicasil@2.300 in our data files (we also support other references such as GitHub raw URLs)

Victor Gaultney 2 minutes ago I think you could use the name string regex as a first step, and if the result is a #.### pattern use it. If it’s not then you can’t compare.
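A sketch of that first step: apply the name-string regex and accept the result only when it fits the #.### pattern. `parseNameVersion` is a hypothetical name, and treating "exactly three decimals" as the test for #.### is an assumption.

```typescript
// Return the #.### version from a name-table version string, or null when
// the string does not follow that pattern and so cannot be compared.
function parseNameVersion(nameVersion: string): string | null {
  // Three decimals, not followed by another digit or a further ".digit" part.
  const m = /^Version (\d+\.\d{3})(?!\d|\.\d)/i.exec(nameVersion);
  if (!m) return null;
  return `${m[1]}`;
}
```

A string such as "Version 4.007" yields 4.007, while "Version 1.0.5; ...." yields null, which is the "can't compare" case.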

Marc Durdin 1 minute ago Yep, that's my thinking. And perhaps disallow use of fonts via FLO which don't meet that req. I'm okay with that (fallback is they can download the font and embed it the legacy way, or hassle the font author to up their game) ... then if the FLO version doesn't match, we'd flag a version update for the containing keyboard package. And that way, we can (semi-)automatically update packages which have updates to fonts (which will make David and Lorna very happy). Hard part is what to do when we do a build: we may not have all the fonts in cache, which could cause the build to fall over if a version is updated. I need to think this through. Stability is really important for us as we have to check over 1,000 packages in repo which may need updates when a font is updated, plus hundreds of keyboard authors who would also be impacted. Reproducible keyboard package builds is an essential feature. Versioning of fonts is tricky in this respect.
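The "flag a version update" idea could look roughly like this, using the proposed flo:abyssinicasil@2.300 reference form. `FloReference`, `parseFloReference` and `fontUpdateAvailable` are hypothetical names, and the character set allowed in a family id is an assumption.

```typescript
interface FloReference {
  familyId: string;
  version: string; // pinned #.### version from the data file
}

// Parse a reference such as flo:abyssinicasil@2.300; null if it does not fit.
function parseFloReference(reference: string): FloReference | null {
  const m = /^flo:([a-z0-9_-]+)@(\d+\.\d{3})$/.exec(reference);
  if (!m) return null;
  return { familyId: `${m[1]}`, version: `${m[2]}` };
}

// Flag an update when the version FLO currently reports differs from the pin.
function fontUpdateAvailable(pinned: FloReference, floVersion: string): boolean {
  return floVersion !== pinned.version;
}
```

Note that this only flags a mismatch; it does not fetch anything, so a build could continue with the cached font for the pinned version.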

Victor Gaultney 5 minutes ago Yes, esp. given that FLO is a snapshot, not a live update, and doesn’t provide non-current versions.

IOW on a Monday FLO might say Noto 3.400 is current. On Tuesday Google might update Noto to 3.500. But FLO won’t immediately change. So you could have a user who has 3.500, but FLO says 3.400.

mcdurdin commented 1 week ago

Another capture

marc Monday at 10:28 AM I have a remote referencing conundrum. kmc-copy and kmc-package are both now able to reference GitHub URIs for content, but they have different patterns for referencing:

kmc-copy has: github:owner/repo:[branch:]path/to/file
kmc-package has: github:owner/repo/raw/git-hash/path/to/file

The purposes are slightly different. kmc-copy is a once off action, so we allow branch references. kmc-package is for data in a .kps file, so should be referencing a permanent URI (hence use of the commit hash).

Options:

1. We could use a valid https URL and match valid patterns, e.g. https://github.com/owner/repo/tree/branch/path/to/file and https://github.com/owner/repo/raw/git-hash/path/to/file , which makes copypasta easier but may make it less obvious why we don't allow certain patterns.
2. We could use kmc-copy pattern for kmc-package: github:owner/repo:git-hash:path/to/file
3. We could use kmc-package pattern for kmc-copy: github:owner/repo/branch/path/to/file
4. Leave them divergent (ugh)

What do y'all think?
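To make the difference between the two current patterns concrete, here is a rough sketch of a parser for each. All names are hypothetical; requiring a full 40-character hex commit hash, and disallowing slashes and colons in branch names, are assumptions rather than anything stated in this thread.

```typescript
interface GitHubRef {
  owner: string;
  repo: string;
  ref: string | null; // branch or commit hash; null when omitted
  path: string;
}

// kmc-copy style: github:owner/repo:[branch:]path/to/file
function parseCopyRef(s: string): GitHubRef | null {
  const m = /^github:([^\/:]+)\/([^\/:]+):(?:([^\/:]+):)?(.+)$/.exec(s);
  if (!m) return null;
  const branch = m[3];
  return {
    owner: `${m[1]}`,
    repo: `${m[2]}`,
    ref: branch === undefined ? null : branch,
    path: `${m[4]}`
  };
}

// kmc-package style: github:owner/repo/raw/git-hash/path/to/file
function parsePackageRef(s: string): GitHubRef | null {
  const m = /^github:([^\/:]+)\/([^\/:]+)\/raw\/([0-9a-f]{40})\/(.+)$/i.exec(s);
  if (!m) return null;
  return { owner: `${m[1]}`, repo: `${m[2]}`, ref: `${m[3]}`, path: `${m[4]}` };
}
```

The package parser rejects a branch name in the hash position, which is the property that makes the reference permanent.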

ross Monday at 1:47 PM I think I like option 2. I like the / used for directories and : for different "named" identifiers. Although I realise owner/repo still has a slash.

eberhard Today at 2:58 PM I agree that option 1 is probably not the best since it suggests that you can use any URL. Option 3 allows the user to make fewer changes to the copy/pasted URL, so I'd go with that.

marc 1 hour ago Opt 3 should probably be: github://raw//<path/to/file> for kmc-package, for consistency (and so we can allow other terms in place of 'raw' in future). This allows a right-click copy URL and just a very minor edit, for kmc-package. For kmc-copy, I wonder if allowing a paste from GH is actually fine? Which then makes me wonder if a https URL is actually better for kmc-package as well, if we have clear messaging and UI around it. It means users can just open the URL to see what they get, as well. So I am coming around to option 1. Which now means we have 3 different answers to my question! Oops. The compiler can provide error messages with precise URL formatting. We can even translate githubusercontent.com URLs to the original GH URL in the IDE.
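If option 1 were chosen, the check and the translation mentioned here might be sketched as below. The function names are hypothetical, the 40-character hash requirement is an assumption, and the translation assumes the ref in a raw.githubusercontent.com URL is a single path segment.

```typescript
// Accept only https://github.com/owner/repo/raw/<commit-hash>/path/to/file
const PINNED_RAW_URL =
  /^https:\/\/github\.com\/[^\/]+\/[^\/]+\/raw\/[0-9a-f]{40}\/.+$/i;

function isPinnedGitHubUrl(url: string): boolean {
  return PINNED_RAW_URL.test(url);
}

// Translate https://raw.githubusercontent.com/owner/repo/ref/path into the
// equivalent https://github.com/owner/repo/raw/ref/path form.
function fromRawUserContentUrl(url: string): string | null {
  const m =
    /^https:\/\/raw\.githubusercontent\.com\/([^\/]+)\/([^\/]+)\/([^\/]+)\/(.+)$/.exec(url);
  if (!m) return null;
  return `https://github.com/${m[1]}/${m[2]}/raw/${m[3]}/${m[4]}`;
}
```

Because the accepted shape is a single regular expression, an error message can quote the expected format exactly when a URL is rejected.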

mcdurdin commented 1 week ago

marc Monday at 10:44 AM Another remote conundrum. We want to be able to support fonts.languagetechnology.org (FLO) for sourcing fonts for keyboard packages. However, we need to have stability, and FLO font versions and references can change without warning. Two issues:

1. Reproducible build requirement means we want a static font version -- how can we guarantee this?
2. We need to bump keyboard version when font version changes, so users get the new font. How can we learn of version changes to fonts? (And can we automate?)

davidrowe Monday at 1:31 PM Would the commit hash scheme work to get a fixed reference?

marc Monday at 1:32 PM Yes, a commit hash is a guaranteed fixed reference.

eberhard Today at 3:02 PM Looking at the source of FLO (https://github.com/silnrsi/fonts) I see that the fonts seem to have a fontmanifest.json file that contains a version number, so we could query that repo for changes to fonts.

marc 43 minutes ago Yeah, the FLO data includes a version number, e.g. "version": "1.490", for this font below.

    "nokyung": {
      "defaults": {
        "ttf": "Nokyung-Regular.ttf",
        "woff": "Nokyung-Regular.woff"
      },
      "distributable": true,
      "family": "Nokyung",
      "familyid": "nokyung",
      "files": {
        "Nokyung-Bold.ttf": {
          "axes": { "ital": 0, "wght": 700.0 },
          "flourl": "https://fonts.languagetechnology.org/fonts/sil/nokyung/Nokyung-Bold.ttf",
          "packagepath": "Nokyung-Bold.ttf",
          "url": "https://github.com/silnrsi/fonts/raw/main/fonts/sil/nokyung/Nokyung-Bold.ttf",
          "zippath": "Nokyung-1.490-dev-e7e323M/Nokyung-Bold.ttf"
        },
        "Nokyung-Bold.woff": {
          "axes": { "ital": 0, "wght": 700.0 },
          "flourl": "https://fonts.languagetechnology.org/fonts/sil/nokyung/web/Nokyung-Bold.woff",
          "packagepath": "web/Nokyung-Bold.woff",
          "url": "https://github.com/silnrsi/fonts/raw/main/fonts/sil/nokyung/web/Nokyung-Bold.woff",
          "zippath": "Nokyung-1.490-dev-e7e323M/web/Nokyung-Bold.woff"
        },
        "Nokyung-Regular.ttf": {
          "axes": { "ital": 0, "wght": 400.0 },
          "flourl": "https://fonts.languagetechnology.org/fonts/sil/nokyung/Nokyung-Regular.ttf",
          "packagepath": "Nokyung-Regular.ttf",
          "url": "https://github.com/silnrsi/fonts/raw/main/fonts/sil/nokyung/Nokyung-Regular.ttf",
          "zippath": "Nokyung-1.490-dev-e7e323M/Nokyung-Regular.ttf"
        },
        "Nokyung-Regular.woff": {
          "axes": { "ital": 0, "wght": 400.0 },
          "flourl": "https://fonts.languagetechnology.org/fonts/sil/nokyung/web/Nokyung-Regular.woff",
          "packagepath": "web/Nokyung-Regular.woff",
          "url": "https://github.com/silnrsi/fonts/raw/main/fonts/sil/nokyung/web/Nokyung-Regular.woff",
          "zippath": "Nokyung-1.490-dev-e7e323M/web/Nokyung-Regular.woff"
        }
      },
      "license": "OFL",
      "packageurl": "https://software.sil.org/downloads/r/nokyung/Nokyung-1.490-dev-e7e323M.zip",
      "siteurl": "https://software.sil.org/Nokyung/",
      "source": "SIL",
      "status": "current",
      "version": "1.490",
      "ziproot": "Nokyung-1.490-dev-e7e323M"
    },

Then the only pattern we need is the version, so do we have flo:nokyung@1.490 or flo:nokyung/1.490 or something else?
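As an illustration of reading that record, the sketch below looks up the default .ttf file of a family and returns its GitHub URL. The interfaces model only the fields visible in the excerpt above; `FloFamily`, `FloFile` and `defaultTtfUrl` are hypothetical names, and the rest of the FLO schema is not assumed.

```typescript
interface FloFile {
  url?: string;    // GitHub raw URL, when the font has one
  flourl?: string; // FLO-hosted URL
}

interface FloFamily {
  version: string;
  defaults?: { ttf?: string };
  files: { [fileName: string]: FloFile };
}

// Return the GitHub URL of the family's default .ttf, or null if there is none.
function defaultTtfUrl(family: FloFamily): string | null {
  const ttfName = family.defaults ? family.defaults.ttf : undefined;
  if (ttfName === undefined) return null;
  const file = family.files[ttfName];
  if (file === undefined || file.url === undefined) return null;
  return file.url;
}
```

Returning null for a family without a `url` matches the earlier point that fonts with no direct download would not be supported.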

mcdurdin commented 1 week ago

My current thinking on flo: references. Perhaps FLO needs to be a tool in Developer which resolves to a GH stable commit reference, as fonts with a url are all GitHub references in FLO.

If we always use GH, then it will be exceptionally rare to have a stable URI disappear -- only if repos are removed or nasty actions like force pushes are done -- in which case we probably want to deal with it anyway.
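A rough sketch of the resolution step: take the branch-based GitHub URL found in FLO data and swap the branch for a commit hash. `pinToCommit` is a hypothetical helper; it assumes the caller already knows the branch name and has looked up the commit hash separately (that lookup is not shown here).

```typescript
// Rewrite https://github.com/owner/repo/raw/<branch>/path to use a commit hash.
// Returns null when the URL does not have the expected shape or branch.
function pinToCommit(url: string, branch: string, commit: string): string | null {
  const m = /^(https:\/\/github\.com\/[^\/]+\/[^\/]+\/raw\/)(.+)$/.exec(url);
  if (!m) return null;
  const prefix = `${m[1]}`;
  const rest = `${m[2]}`; // "<branch>/path/to/file"
  if (rest.indexOf(branch + '/') !== 0) return null;
  return prefix + commit + rest.substring(branch.length);
}
```

Passing the branch explicitly avoids guessing where a branch name ends, since branch names may themselves contain slashes.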

Additional feature: we can keep the original flo references as well, and use that to provide a hint message to flag when font updates are available, but do not attempt to use it directly.