fonttools / fontbakery

🧁 A font quality assurance tool for everyone
https://fontbakery.readthedocs.io
Apache License 2.0
549 stars 101 forks source link

Fontbakery redesign proposal #4144

Closed yanone closed 1 year ago

yanone commented 1 year ago

I'm picking up on Felipe's recent reports that he's considering a rewrite of Fontbakery. I want to add my own angle, as I'm increasingly involved in check-writing:

IMO, Fontbakery is too monolithic (too much in one file), too intransparent (too much magic, too many method parameters get created at runtime out of nowhere), too un-Pythonic (unfamiliar code formatting choices, intransparent code imports [re magic]).

Take this check for example:

def com_google_fonts_check_usweightclass(ttFont, expected_font_names):

It contains two parameters, but any of them can also be omitted. No one can understand where expected_font_names comes from, because it's not getting imported in a Pythonic way. This is opaque and difficult to understand, but also inhibits modern-era tooltips that IDEs can display for what expected_font_names is and how it behaves. This is further amplified by examples where you can't properly set up a test for a check because you can't manually recreate the same parameters that Fontbakery creates magically at run-time. Fontbakery is trying to be too intelligent here at the expense that only few people understand how it actually works.

Besides, expected_font_names is "imported" further up in the check as a "condition" (still not revealing in which file that is hosted), but expected_font_names is not a condition, but instead data, or maybe a method. So this is all very misleading and difficult.

Also, having ttFont as a parameter is increasingly often insufficient, because fonts get checked using tools other than fontTools, such as harfbuzz. We need a file path for that, not a ttFont object (even though I'm aware that a path can be extracted from a ttFont object, but not everyone knows that.) Fontbakery is quietly assuming fontTools-based checks, but it shouldn't assume that. Checks can be performed using any number of tools. I'm currently unaware why any data other than a font path is necessary to be handed over to a check.

The amount of code in a single file is intimidating and difficult to maintain.

The code is not formatted in a way that for example VSCode can understand, leading to constant trouble in accidentally (but correctly) saving a file in VSCode and the code being reformatted in Black style that is unacceptable in Fontbakery.

Debugging prints need to be visible in the output. If not print(), at least logging.debug() statements need to be visible directly (specifying --loglevel DEBUG in the command line is okay of course), but yield DEBUG, "..." is unpythonic and intransparent to call for inexperienced users.

Overall, Fontbakery is very difficult to author.

I propose profiles and checks to be organized in a folder structure where each check (may) sit in its own file:

profiles
 └ googlefonts
    ├ conditions.py
    └ checks
       ├ usweightclass.py
       ├ nametable.py
       ├ ...

and then each file contains classes that inherit from the main Check class. The file name is insignificant (should be identical to check class for single-check files.) This allows foundries to accumulate subsets of all the checks they want in a single file, or to group related checks into one file, but also allows the complete separation of checks into single files.

Fontbakery runs whichever decendants of the the main Check class it can find in a file. The command line -c argument translates to the highest-level classnames defined in each profile.

Putting code-tests into the same class further reduces the friction in authoring checks.

from fontbakery.checkrunner import Check, Message
from fontbakery.status import INFO, WARN, ERROR, SKIP, PASS, FAIL, DEBUG
from fontbakery.profiles.googlefonts.conditions import is_ttf
from fontTools.ttLib import TTFont

class usweightclass(Check):
    rationale = "..."
    conditions = [is_ttf]

    def check(self, fontpaths):
        ttFont = TTFont(fontpaths[0])
        ...
        yield FAIL, Message("...")

    def test(self):
        """Code test directly here in the same file."""
        ...

        assert PASS in self.check("path/to/font.ttf")
        assert FAIL in self.check("path/to/otherfont.ttf")

Foundries may assemble their checks purely from other foundries' profiles all in one file if they wish. Of course, foundry may host their profile independently of the fontbakery package as is already the case today (I think).

profiles
 └ myfoundry
    └ checks
       └ allchecks.py

And subset checks en-masse like that (or in individual files):

from fontbakery.profiles.googlefonts.checks import usweightclass
from fontbakery.profiles.googlefonts.checks import nametable
from fontbakery.profiles.adobe.checks import nametable as adobenametable

class usweightclass(usweightclass.usweightclass);
class nametable_ID1(nametable.nametable_ID1);
class nametable_ID2(nametable.nametable_ID1);
class nametable_ID1617(adobenametable.nametable_ID1617);

Needless to say, Black formatting is the standard in font tools today and Fontbakery shouldn't be an exception here IMO.

m4rc1e commented 1 year ago

In 2021, me and Simon thought about a better way of syncing the docs, tests and fixes so we prototyped font doctor (mainly Simon). Currently, our docs, tests and fixes live in separate repos so they get out of sync very quickly. The architecture Simon came up with means you can also fix on sources as well.

felipesanches commented 1 year ago

Also, having ttFont as a parameter is increasingly often insufficient, because fonts get checked using tools other than fontTools, such as harfbuzz. We need a file path for that, not a ttFont object (even though I'm aware that a path can be extracted from a ttFont object, but not everyone knows that.) Fontbakery is quietly assuming fontTools-based checks, but it shouldn't assume that.

This is not true. A check can have work on a font (instead of a ttFont) and that will give it a file path string instead of a fontTools TTFont object.

felipesanches commented 1 year ago

I like the idea of having one check per file, and then placing the corresponding code-tests in that same file.

miguelsousa commented 1 year ago

I don't like the idea of having the unit tests in the same file as the tool's main code because we'd be unnecessarily shipping test code (and eventual bugs) to non-developer users. It's a hard no for me.

felipesanches commented 1 year ago

This is further amplified by examples where you can't properly set up a test for a check because you can't manually recreate the same parameters that Fontbakery creates magically at run-time.

This is not true. The CheckTester helper class used on our code-tests does compute the dependencies of a check for us (so that we don't need to do it manually) in order to facilitate implementation of tests. And if you need to fake/mock-up a certain value, there's also a mechanism for that. There are many examples of that on the test-suite, as well.

felipesanches commented 1 year ago

It is a good thing that the dependencies are computed automatically, as there would be not much benefit in doing something like this:

from somewhere import (one_thing, another_thing)

@check 
def com_blah_check_something(font):
    one_thing = one_thing(font)
    another_thing = another_thing(font)
    some_data =  ... # inspect the font here
    do_something_with(one_thing, another_thing, some_data)

instead of this:

@check 
def com_blah_check_something(font, one_thing, another_thing):
    some_data =  ... # inspect the font here
    do_something_with(one_thing, another_thing, some_data)

In the first scenario, you'd be importing a function that needs to be evaluated. In the second scenario, which is what we have nowadays, the one_thing and another_thing values are computed (and cached) by fontbakery. So, they are actually values, not functions. And there's the benefit of caching in cases where multiple checks are using the same computed information on a given font (or set of fonts) being checked.

It seems to me that the only benefit of the more explicit approach you're proposing would be in making it easier to see where a dependency is implemented. But that's so easy to find out using grep! Or any other code-base search tool that programmers are user to. Also, if the codebase is well organized, we do not even need to search it, because all dependencies will be declared in the same reasonable place (such as the dependencies.py files, or directly on the same file as the one where the check is implemented)

felipesanches commented 1 year ago

Also, the class declaration that you suggested is more cluttered/verbose (causing unnecessary friction) in comparison to the current approach. I'd prefer less code whenever possible, which is one of the guiding principles that led to the implementation we have nowadays.

graphicore commented 1 year ago

Debugging prints need to be visible in the output. If not print(), at least logging.debug() statements need to be visible directly (specifying --loglevel DEBUG in the command line is okay of course), but yield DEBUG, "..." is unpythonic and intransparent to call for inexperienced users.

If you don't use the terminal progress bar (-n option), your print() output won't be consumed by the update mechanism of the progress bar.

madig commented 1 year ago

IMO, Fontbakery is too monolithic (too much in one file), too intransparent (too much magic, too many method parameters get created at runtime out of nowhere), too un-Pythonic (unfamiliar code formatting choices, intransparent code imports [re magic]).

I agree with all of this, and it's why I don't like to touch the codebase.

Some magic is okay (note how function parameter introspection is also used by pytest, which is pleasant to use), but the import machinery is hard to understand, for benefits I don't see. When I wrote a bunch of profiles, I had to scratch my head at https://font-bakery.readthedocs.io/en/latest/developer/writing-profiles.html many times, until I just started copy-pasting from actual profiles. The core code would be much improved if a lot of the import magic would be ripped out and replaced by those extra 5 lines of Python code.

The other issue I stumble over sometimes is that some checks seem to output Markdown, others manually formatted strings, and the terminal reporter gets confused about which is which. Maybe checks should always construct a structured (Markdown) report, so that the reporter can do something useful on the terminal and in HTML.

felipesanches commented 1 year ago

Last month I made a few sketches and shared with my team on a weekly videocall, but ended up forgetting to post them here:

photo_2023-05-05_14-14-33

yanone commented 1 year ago

I’ll keep adding to this document as I find or remember flaws:

A MAJOR flaw of fontbakery, at least in the GF profile, are conditions that quietly don’t fire because someone messed with them so that checks are skipped even though they're present and should work. This happens time and time and time again. In theory, thorough code tests should catch that, but in practice they don't.

I don't have a solution at hand for this, but it's a major design problem.

felipesanches commented 1 year ago

A MAJOR flaw of fontbakery, at least in the GF profile, are conditions that quietly don’t fire because someone messed with them so that checks are skipped even though they're present and should work. This happens time and time and time again. In theory, thorough code tests should catch that, but in practice they don't.

I'd like to see a few examples of this.

yanone commented 1 year ago

This condition: canonical_stylename. I don't know how many more issues there are like that in the code but it never got updated after VFs became a thing so it's quietly not working for VFs. I'll make a separate issue for it.

And it's so frustrating because I don't even know how to debug that properly since the conditions are somehow magically invoked. There is no way to even access them during the check and print out something.

So I understand, it's not the fault of the conditions per se. No one looked at the code thoroughly after VFs were introduced.

yanone commented 1 year ago

Thinking about it further, it is indeed a design flaw of the conditions that they can quietly fail.

A simple fix would be to require either a value or False to be returned, and fire an ERROR during runtime when None is received, as is usually the case for quietly dysfunctional conditions. But that may not work for some conditions.

The better approach is probably to define what types of return values constitute a successfully executed condition. Since it's in many cases probably not possible to know the full set of values beforehand, a list of expectations for the canonical_stylename conditions could be defined as ['non_empty_string'], ruling out False and especially None as acceptable values.

m4rc1e commented 1 year ago

And it's so frustrating because I don't even know how to debug that properly since the conditions are somehow magically invoked. There is no way to even access them during the check and print out something.

Do you use pdb? I find it incredibly valuable to debug these types of things. I hardly ever use print statements for debugging, I just do import pdb;pdb.set_trace() and have a poke around instead.

m4rc1e commented 1 year ago

Just seen you can also do the above by doing breakpoint()

madig commented 1 year ago

Alternatively, write a test case and give the condition what it expects (i.e. the ttFont argument means the actual fontTools.ttLib.TTFont object).

felipesanches commented 1 year ago

yeah. If the condition is famous for being misbehaving for a long time, then adding more code-tests to tame the beast is definitely necessary

yanone commented 1 year ago

They are several different conditions that are misbehaving, and you never know which one. That's design flaw. More code tests are the workaround.

madig commented 1 year ago

Yes, but tests are all we have currently, until someone comes up with machine-aided proofs or other magic.

Can you pin down which checks fail due to faulty conditions? Then it should be possible to test the whole stack in one go.

yanone commented 1 year ago

Reminder: I created this issue to gather ideas for a future rewrite of fontbakery.

I already created another issue for the specific problem of the current fontbakery: https://github.com/googlefonts/fontbakery/issues/4178

yanone commented 1 year ago

Actually, the situation with the quietly failing condition is a prime example of why the code verbosity decisions are problematic:

For the sake of cleaner code, canonical_stylename is invoked as a "condition" here. But how is it a condition?

A condition is either met or not. In other words, True or False. bool(anything) == True is interpreted as the condition being met, while bool(anything) == False is interpreted as the condition not being met. That works for is_ttf, which is actually a condition, and is fine to be defined for a check like that (despite me criticizing its invocation through a string rather than an imported reference and it magically becoming part of the method's interface).

canonical_stylename on the other hand is supposed to return a certain value depending on the style. Here, the quietly failing "condition" is interpreted as False. canonical_stylename is not a condition.

Had canonical_stylename be invoked normally as a method and executed normally in the check's code, I’m 99.5% certain that a return value of None or '' would have tripped the check logic and we would have seen lots of FAILs instead of SKIPs and the problem would have been found way earlier.

In other words, the code verbosity and subsequently the usage of conditions are a design problem in this case.

And by the way, canonical_stylename doesn't have to be invoked as a condition. There are some checks that invoke it directly as they should. This can be fixed even with the current fontbakery, so it's really not like we can't do anything about it until some "other magic".

I’m gonna update the other issue with a concrete action plan to get rid of this problem for the time being.

graphicore commented 1 year ago

About the conditions, I'm not sure if that should be part of a big redesign proposal. It sounds more like a behavior that could and maybe should just be fixed right away. But also, in general, when assumptions change, like the shift from single-style statics to variable fonts, I'd always expect some friction. The condition could itself be written/rewritten more thoroughly/resiliently and maybe raise itself an error. As a reference:

@condition
def canonical_stylename(font):
    """Returns the canonical stylename of a given font."""
    # [ ... ]
    if (
        "-" in basename
        and (s in VARFONT_SUFFIXES and varfont)
        or (s in valid_style_suffixes and not varfont)
    ):
        return s

This returns either a string or None, I'm sure, with the change you suggested:

A simple fix would be to require either a value or False to be returned, and fire an ERROR during runtime when None is received, as is usually the case for quietly dysfunctional conditions. But that may not work for some conditions.

… that example would originally have been written to just return False instead of None and nothing would be gained.

It could also be rewritten to return a string or raise an Error, but in that case, maybe a check that FAILS would be more appropriate?

They are several different conditions that are misbehaving, and you never know which one. That's design flaw. More code tests are the workaround.

But that's a design flaw of the conditions itself, isn't it? I mean a function behaves correct or it does not, how could I tell from looking at it from the outside? Sure, fixed return data types can help, but they don't remove the core issue, which we cannot solve.

graphicore commented 1 year ago

@yanone can you give an example where canonical_stylename is used as you describe invoked as a "condition" here., I'm trying to follow you here.

yanone commented 1 year ago

I just finished typing: https://github.com/googlefonts/fontbakery/issues/4178#issuecomment-1589325360

Now the whole issue is a bit constructed, because in reality the affected issue quoted there is both deactivated (but shouldn't be), and not using canonical_stylename as a condition, but instead stylenames_are_canonical. I'm trying to make a general point here.

And we did trip over this previously where another condition didn't fire for years and it was very similar, but I forgot which one. But because I can't know what all is affected, I call this a design flaw that should generally be addressed.

yanone commented 1 year ago

Basically: any condition that doesn't start with "is" or "are" is not a condition. Pick any example you like. This can and should be changed.

graphicore commented 1 year ago

There are two cases where we use the term conditon:

One is @condition, this defines a calculation of a value that will be used as an argument to a @check or another @conditon. The other use case is to skip @check executions depending on such a computed value and it's truish-ness.

yanone commented 1 year ago

One is @condition, this defines a calculation of a value that will be used as an argument to a @check or another @conditon. The other use case is to skip @check executions depending on such a computed value and it's truish-ness.

And these two things need to be separated cleanly.

graphicore commented 1 year ago

And these two things need to be separated cleanly.

Yeah, I agree. The concepts should eventually have different names. Here's a proposal attempt:

As transition to new behavior, one way for now, could be to explicitly mark an @condition to be allowed as a condition in @check, e.g. @condition(as_conditional=True) and in that case enforce conditional protocol to have it return always only True or False.

New names could be @parameter and @conditional. They could be implemented as parameter = condition (same old thing) and conditional = functools.partial(condition, as_conditional=True). The reason, why I'd keep these the same below the hood is that a @conditional could still be used as a @parameter, thus prevent code duplication, and because they are implemented the same anyways and I see no reason to change that.

The phase of transition could be arranged so that conditional protocol first is optional. In a later version, if not used emits deprecation warnings (give everyone a chance to update) and eventually ERRORs. I'm not sure how to handle deprecation warnings right now, e.g. whether yielding WARN (or INFO?) is sufficient. Probably overkill, but possible could be to explicitly override the default conditional protocol setting in the @check definition.

conditional protocol:

miguelsousa commented 1 year ago

A MAJOR flaw of fontbakery, at least in the GF profile, are conditions that quietly don’t fire because someone messed with them so that checks are skipped even though they're present and should work. This happens time and time and time again. In theory, thorough code tests should catch that, but in practice they don't.

I don't have a solution at hand for this, but it's a major design problem.

@yanone this can be solved quite easily by adding unit tests that assert all expected outcomes of all the tests. The googlefonts profile has 75% code coverage. If coverage on that file is increased to 100% (or close to), any changes to conditions will get tested as well.

I don't have a strong opinion on how and if Font Bakery should be re-architected. But I will say that, while working on contributions to the codebase, I've faced many of the issues already mentioned. I've started a fork named OpenBakery, and I welcome contributions from anyone and everyone 🙂

felipesanches commented 1 year ago

I am monitoring all commits made to the OpenBakery repo and porting them back here because there's nothing I disagree about what is being done there.

miguelsousa commented 1 year ago

I am monitoring all commits made to the OpenBakery repo and porting them back here because there's nothing I disagree about what is being done there.

@felipesanches while it's great to see Font Bakery moving in the same direction, I which you would respect my work by acknowledging in the commits themselves where you're getting the code changes from. I see you've started cherry-picking yesterday. I don't think it's too late to rewrite the commit history to add credit where credit is due. I'll also mention that you may want to be more careful with the changes you're porting. I can't say for sure because I don't have admin access to the services that Font Bakery uses, but I suspect that a recent commit to one of the workflows has made it stop working.

felipesanches commented 1 year ago

Sure, I can rewrite those commits to mention your name, no worries. I'll do it on my next work-day.

felipesanches commented 1 year ago

I'll also mention that you may want to be more careful with the changes you're porting. I can't say for sure because I don't have admin access to the services that Font Bakery uses, but I suspect that a recent commit to one of the workflows has made it stop working.

In cases like this, please open an issue.

miguelsousa commented 1 year ago

Sure, I can rewrite those commits to mention your name, no worries. I'll do it on my next work-day.

Thanks. Please don't include my GH username in the commit, otherwise my inbox will get overwhelmed with notifications.

felipesanches commented 1 year ago

Actually, it is not good practice to rewrite those commits because they are already included on the public main branch. But the very next one can reference them by commit hash and I'll mention you (but not using @ username, as per your request) on all future commits porting from your personal repository. I expect you to accept it as a gesture of good faith.

felipesanches commented 1 year ago

Here's a link to a spreadsheet I've been working on that helps me keep track of what @miguelsousa does on his personal repo. I treat it as effectively equivalent to a list of pull requests ;-)

https://docs.google.com/spreadsheets/d/1mMxqJWrptRci1CriNyuJHrTVdKs95NzI6ImF72T6zaA Screenshot from 2023-06-13 22-18-00

miguelsousa commented 1 year ago

In cases like this, please open an issue.

Felipe, respectfully, I have better things to do with my time than to validate what you commit. As you copy things over, I suggest you to skip any changes that you don't fully understand. Or, you could ask someone to review the changes; after all that's the process that's spelled out in the CONTRIBUTING.md file you've copied from the OpenBakery repo. Just don't ask me. I don't want to re-review my work. 😉

felipesanches commented 1 year ago

but you just did ;-)

miguelsousa commented 1 year ago

Actually, it is not good practice to rewrite those commits because they are already included on the public main branch. But the very next one can reference them by commit hash and I'll mention you (but not using @ username, as per your request) on all future commits porting from your personal repository. I expect you to accept it as a gesture of good faith.

Understood. Accepted.

simoncozens commented 1 year ago

I agree with many of the points raised, and I have racked my brains several times over the best way to re-architecture fontbakery. The idea of putting every check into a separate file is one I raised a long time ago. My goal in doing that would be that "profiles" could eventually become a YAML file listing checks to run, rather than a curious mix of check IDs and code.

However, this is a discussion rather than an issue to fix in fontbakery today so I'm converting it to a discussion.