googlefonts / gftools

Misc tools for working with the Google Fonts library
Apache License 2.0
242 stars 71 forks source link

Support Git references in subset merger #987

Closed RickyDaMa closed 1 day ago

RickyDaMa commented 2 months ago

Closes #986

In YAML, you can now do:

- operation: addSubset
  directory: latin-sources
  subsets:
    - from:
        repo: googlefonts/some-font@latest # <- the @ bit is new!
        path: source/some-font/SomeFont.designspace
      force: true
      name: GF_Latin_Core

Which can take any git reference, or the special value "latest" which grabs the tag from the latest GitHub release

These are appropriately cached in separate folders for org/repo/ref

Support for this is has also been incorporated into add-ds-subsets, and I've sprinkled in some type hints in parts of the codebase I've visited

Old PR description (outdated) In YAML, you can now do: ```yml repo: slug: notofonts/latin-greek-cyrillic ref: e7f1736c5ad0dc2abfc4dcd49ebca50abf612b29 ``` Where before repo could only take a string (being the repo slug). This is still supported with the old behaviour of assuming you want the latest of the branch The cache path was previously just the repo slug, but is now the slug + ref, e.g. `notofonts/latin-greek-cyrillic/main` I've added a corresponding `--git-ref` to `gftools-add-ds-subsets` to expose this feature to the CLI as well Opening this PR early without tests etc. to make sure the approach & code style are acceptable - such as the smidge of type hinting I added Let me know your thoughts!
simoncozens commented 2 months ago

This is good, but:

1) I'd rather "repo" be the repo slug and a new optional "ref" field be added. There's no need for an additional level of nesting here. 2) Type hints are good and I want more of them everywhere! But I don't think we need from __future__ import annotations unless you're doing self-referential typing ("Postponed Evaluation of Annotations"). Type hints are supported by default in all the Python versions we support.

RickyDaMa commented 2 months ago

I'd rather "repo" be the repo slug and a new optional "ref" field be added

Sure, I can do that. Or we could do pip-like syntax and have repo support something like org/repo@git-ref, which wouldn't even need an extra field. On GitHub at least, org & repo can't contain @, so splitting on it once should be without edge cases

I don't think we need from __future__ import annotations

They're to be able to use lowercase type hints for collections in Python 3.8 IIRC, e.g.

from __future__ import annotations

def foo(bar: dict[str, str]):
    pass

Instead of needing

from typing import Dict

def foo(bar: Dict[str, str]):
    pass
simoncozens commented 2 months ago

Or we could do pip-like syntax and have repo support something like org/repo@git-ref

I like that.

They're to be able to use lowercase type hints for collections in Python 3.8

Hmm. OK, fair enough!

RickyDaMa commented 2 months ago

Is there any testing that covers the subset merger or add-ds-subsets subcommand for me to hook into, and/or what would you like to see tested of this PR?

I've used it locally through gftools' builder to make sure it works

simoncozens commented 2 months ago

The Noto recipe builder uses it, so it's tested through there.

But here's a thought. I can see a strong case for there to be an option to download the source of the latest tagged release. We do this already in gftools-packager when uploading a font from a downstream repository to google/fonts:

https://github.com/googlefonts/gftools/blob/c14661a8fdbfee8293dda9f6ff2994c173672019/Lib/gftools/packager/__init__.py#L252-L259

That's an option which would be useful for Noto, and potentially for a project that you might be working on: when a rebuild of the language-based font is triggered, the build automatically picks up any stable updates from the subset donor font, without including any unreleased development state.

RickyDaMa commented 2 months ago

Yes, that could be good. I'll take a look at making a special case for org/repo@latest (bikeshedding welcome)

RickyDaMa commented 2 months ago

Fun niggle: using the GitHub API for the zipball download link downloads a zip that has the inner top-level folder named with a completely different convention to usual

Normally, it's repo-tag*, however in this case it's org-repo-tag_sha which is super helpful because the response metadata from the get latest release call doesn't contain the tag SHA (target_commitish points to the commit)

Would you rather I make another API call to get the tag SHA and can properly recreate the inner folder name for precise extraction, or shall I just hack something to make it work?

(*where if the tag name began with a "v" and looked like a version, the "v" has been stripped)

simoncozens commented 2 months ago

shall I just hack something to make it work?

There's only one top-level folder, right? So I guess you can just glob for it.

m4rc1e commented 2 months ago

Thanks for the force push. Seems to be working:

Screenshot 2024-06-27 at 10 11 53
RickyDaMa commented 3 weeks ago

@simoncozens (or @m4rc1e) any going concerns that'd prevent this PR from getting merged soon? I think it's in a good state; just updated the original post description to reflect the current state of the feature