Open davelab6 opened 5 years ago
@davelab6, maybe we should consider in this markdown-based proposal, a mechanism for describing the downgrading of check log results (such as from FAIL to WARN) when inherited from one profile to another that is under discussion at #2356
I think that profiles would be simply defined as directories. So, in the root of the font-knowledge repo, inside a profiles/adobefonts
directory we would have the domain directories such as com.google.fonts
and then inside them, the check
directory containing the markdown files. A full path in the repo could look like this: profiles/adobefonts/com.google.fonts/check/required_tables.md
If a profile inherits a check, then this could potentially be represented by a symbolic link. But that would not be enough for addressing the customization of log results statuses...
Maybe for that (#2356) we can come up with a kind of inheritance mechanism together with a literate programming style of describing the down-grade of check-result statuses and the human-friendly text describing the reasoning behind such customization.
Something like this on a file called profiles/adobefonts/com.google.fonts/check/dsig.md
:
# Does the font have a DSIG table?
This check is inherited from the Google Fonts profile.
Adobe Fonts considers that the problem has a lower severity,
treating it as a mere WARN-level check.
## Warning
This font lacks a digital signature (DSIG table).
Some applications may require one (even if only
a dummy placeholder) in order to work properly.
You can add a DSIG table by running
the `gftools fix-dsig` script.
In that case, the mere fact that both markdown files have the same check-id in their path is enough to identify the inheritance. The engine could then automatically use the python code from the inherited check.
Too much auto-magical behavior can be a pain too. For me implicit inheritance by matching file names crosses the border. I'd much rather have explicit statements where it becomes obvious what is supposed to happen.
I've been thinking, and I am not so sure we should keep the markdown files in a separate repo. I think I would continue development on a single repo for a while until some more compelling reason surfaced to have a repo split.
I believe that the best approach would be to add support in the current fontbakery engine to parse and execute this new markdown check format, but still support the existing approach. This would enable us to gradually migrate to the new style while ensuring correctness and not changing the user interfaces.
I think the best approach would be to compile the markdown check format into the current profile/check format, i.e. regular python, and then just run it. As I discussed in https://gitter.im/fontbakery/Lobby?at=5cf53789faac643934330dcc ff, picking just the points:
We could maybe build the profiles py files on install or on first use and then rebuild when the source checksum has changed or such.
Plus using source maps would be good.
Since you can’t stick a md file into a python interpreter and run it directly, we need to compile it anyways.
Caching of the compiled program seems sensible to me, should be transparent though. Like pyc files (pycache). Maybe we can use parts of that implementation 🙄 We should research and if needed create a py-literate-programming project/library IMHO.
I think that way it's good to implement, the python format can be debugged well, it's comprehensible what is happening, the engine doesn't need change, or not much at least and the current format can be kept.
For the markdown format, I don't see a sensible way to declare important information, like e.g. the conditions, without explicit code or data structures markup. It's going to be painful to go fishing in a see of markdown for code level information. And it implies to the authors that they can use human language, while in fact they have to use a dictionary of well known terms plus making it harder to use e.g. grep
and find both the implementation and where it is used. We can come up with markdown styled formats that are explicitly parsed in a well defined way, e.g.: ```py [...] ```
will be executed as python in the current local scope, I can live well with that. But the conditions etc. example from above is scary to me, thinking about both implementing it and using it.
I'm still a friend of the front matter style of adding data to markdown, which uses yaml:
conditions:
Also, explicit is better than implicit.
We should maybe also take a look at the observablehq file format (yeah it's about js but lets just look at the file format) as they have actually structured data. Maybe write our own notebook style interface for editing which also evaluates live (or can ipython notebok be heavily customized)?
I'd look for a format that can be translated with simple rules from markdown to profile, and ideally also back again to some extend, thus we could start with some generated markdown files from our existing profiles. The proposal above feels to me like making it hard to parse AND hard to write and hard to debug.
I think the observablehq term of structured data in a document is cells
and that could be a way to structure the data here as well. It may not render as smoothly on GitHub or as an Ipython Notebook, but at least we can define the syntaxes and capabilities of each kind of cell explicitly and also have a well known way how to structure the document and then split and parse it. None of the notebook formats try to be actual markdown documents AFAIK, these are probably just stored as JSON or in some other structured data format. I believe that is for good reason. Because that kind of thing leads to bad decision making, where the actual format must follow the host format syntax. For me personally, I believe using CSS language syntax as a constraint for the CPS property language syntax, led to many bad design choices.
continue development on a single repo for a while until some more compelling reason surfaced to have a repo split.
The reason is that the repo is then clean and simple, and people may contribute without getting overwhelmed with code.
Overall I think this idea is only valuable if it is very fast to execute on - takes Felipe 1-2 days to make the parser.
Otherwise, we would be better off having a SPA web app that allows users to contribute improvements to checks using a UI. Then there's no editing text files, and there's immediate data validation, ways to pick values from predetermined lists, etc.
I think that I will then leave the issue open but not deal with it immediately. Leave it slow-cooking for a bit longer... anyone feel free to provide further input here. At some point we may put some effort on an actual implementation.
note: I want to try keeping everything in a single repo because I don't want to unnecessarily risk fragmentation on the dev community we've been gradually gathering/growing.
The README.md on a repo can be an easy entry point to the collection of checks, by using internal hyperlinks pointing to the rest of the markdown files.
Observed behaviour
Currently the checkrunner allows us to create python files in the
Lib/fontbakery/profiles
directory with relatively little boilerplate for someone comfortable programming Python.However, our checks are not always complete and correct the first time they are committed to the
master
branch and included in a release, and users find what is missing or what the problems are - but are not comfortable with Python code syntax, and are unable to read the source code.To tell users more about what they checks are doing, we added docstrings to the source code. But then we realised we needed a little more structured data for users, such as defined rationales, and a dictionary of "misc metadata."
Here's an example; the first 38 lines of
Lib/fontbakery/profiles/fvar.py
:And this is paired with the code test in
tests/profiles/fvar_test.py
:Overall, this is really just as impenetrable as uncommented Python code for our users; and its worse for people fluent in Python because it adds a lot of boilerplate.
Also having the check and the test in separate places isn't ideal, I think.
Expected behaviour
I would like to suggest radically changing the format of our check files; ideally this is a "non functional" change, that doesn't effect the command line or web UIs; it is more a community process improvement sort of thing. And it is something that can be done incrementally.
The 'big idea' is to invert the priority of code and docs, so that the documentation really a "first class citizen", and the check code is relegated to a second class - but that is more compact, without any 'human knowledge' mixed in with Python code.
This will be better for users because the Python code will become more minimal - while code comments will be retained in code, the long explanatory strings will move to a human friendly document above the code, and the python boilerplate code will almost entirely be eliminated.
Therefore I propose creating a new repo called something like "The FontBakery Knowledge Project" which is a repo that only contains markdown files with check code; and then this fontbakery repo becomes focused on the check runner itself, which parses those files according to the Ruby on Rails maxim of "implicit is better than explicit" and "convention is better than configuation" :)
This is of course an old idea, so called Literate Programming; and the current trend for literate programming in Python is the Jupyter Notebook (
.ipynb
) file format. This contains a mix of markdown and python "cells" in a JSON format structure, and there are various UIs and runtimes where users can run them sequentially or all at once.However, for our purposes, a markdown file using the Github Markdown style code formatting means we don't need to use a Notebook viewer app, we simply have clean markdown files in the git repo.
As first step, @felipesanches and I mocked up what the above check would become like in this case; a file with the path
com.google.fonts/check/varfont/regular_wdth_coord.md
- which implies the check ID - and then this contents (source here):------- 8< --------
Is the 'wdth' axis Regular at 100?
If a variable font has a 'wdth' (Width) axis, then the coordinate of its 'Regular' instance is required to be
100
, according to the OpenType specification's "registered axis" definition ofwdth
, available at https://docs.microsoft.com/en-gb/typography/opentype/spec/dvaraxistag_wdthConditions
Run this when the font...
Pass Message
Regular:wdth is 100.
Fail Reason
The 'wdth' coordinate of the 'Regular' instance must be 100, but instead the default value is {regular_wdth_coord}.
Related Information
Where was this requested?
https://github.com/googlefonts/fontbakery/issues/1707
Fonts that will pass this check
cabinvfbeta/Cabin-VF.ttf
Fonts that will fail this check
cabinvfbeta/Cabin-VF-bad-wdth.ttf
Check the fonts
Test the check
Our reference varfont has a good Regular wdth coordinate, so it should PASS the test:
We then change the value so it must FAIL:
Then re-read the coord:
and now this should FAIL the test:
------- 8< --------
Which on Github renders nicely, of course; and on Jupyter too:
Going line by line...
An H1 and should be possible to use as a docstring - for example, the command
fontbakery -L
lists all checkIDs available, just as their IDs, line by line, and this is used for the bash autocompletion; whilefontbakery -L --verbose
list all checkIDs followed by their docstrings - which ideally fit on 1 line.Under that is the "rationale" text; here I rewrote it to be a little more fluent than the current check rationale string, and adding some markdown formatting.
An H2 that is an "expected" string,
Conditions
, which is implicit structure in the documentUnder that is some helpful but non-functional text
Run this when the font...
that can be omitted.The list items (ordered or unordered) are expected strings though; they should be values that exact regex match to keys that are the conditional strings in the codebase today.
Next is a H2 followed by the first pass state; if there are several pass states, these should be either Markdown paragraphs, or list items.
This is followed similarly by fail states. Note here that there is one reason that uses a Python formatted string named value; this could be a positional value (
{}
) tooNext is the "misc metadata" and some more 'expected' titles; I split out the test font file paths to their own section (and added a FAIL font) so that a non-Python fluent designer can write a document like this without having to provide the check code, or if they do, they don't have to provide the test code, but they can provide a pair of fonts and we can take care of it.
The test for this check was already with very verbose code comment which made it already very close to a 'literate programming' style already.
Finally, the new check file format will itself need a checker, and code tests for that checker, to remind us to include optional but helpful texts, and check that the expected section titles or conditional strings are correct ;)