Open rsheeter opened 1 year ago
@dfrg notes it would be helpful if we explicitly captured what can/cannot be constructed (which is currently hidden/internal)
This is a sketch for the general structure of a schema. The intent here is to figure out a structure capable of representing all of the things that we would like to know about a font table. It is written in a format-agnostic style; we would pick an actual format if we choose to implement this.
This is incomplete. The intention here is to show generally what this would look like, and I can pursue it if there is consensus that this is a useful line of inquiry.
Note: the structure I've chosen here is ad-hoc and infinitely bike-sheddable; it can also be discussed if we decide to proceed.
A Type is a string, which is one of either:
A table object has the following fields:
field | type | required | notes |
---|---|---|---|
name | string | yes | the name of the table |
sfnt tag | Tag | no | the sfnt tag for this table, if it is top-level |
short doc | string | yes | a short description of this table |
long doc | string | no | additional information about this table |
doc link | string | yes | a link to online documentation for this table |
input args | [InputArgument] | no | only if this table requires external data to be parsed |
formats | [FormatTable] | no | a list of table formats. must not exist if 'fields' exists |
fields | [Field] | no | a list of fields. must not exist if 'formats' exists |
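As a rough illustration of the table object above, here is a minimal Python sketch. The class name `Table`, the snake_case attribute names, and the `validate` method are assumptions made for this example and are not part of the proposal. It only shows how the "must not exist if ... exists" rule for 'formats' and 'fields' could be checked.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Table:
    # Attribute names mirror the schema fields; the Python spelling is hypothetical.
    name: str
    short_doc: str
    doc_link: str
    sfnt_tag: Optional[str] = None
    long_doc: Optional[str] = None
    input_args: list = field(default_factory=list)
    formats: Optional[list] = None  # list of FormatTable entries, if multi-format
    fields: Optional[list] = None   # list of Field entries, if single-format

    def validate(self) -> None:
        """Raise ValueError if both 'formats' and 'fields' are present."""
        if self.formats is not None and self.fields is not None:
            raise ValueError(
                f"table {self.name!r}: 'formats' and 'fields' are mutually exclusive"
            )
```

A reader built on the schema could call `validate()` once per table after loading, before attempting to parse any font data.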
An input argument is a name and a type.
field | type | required | notes |
---|---|---|---|
name | string | yes | the name of this argument, used in the containing table |
type | Type | yes | the type of the argument |
A single format of a multi-format table.
field | type | required | notes |
---|---|---|---|
format type | Type | yes | the type of the format value, e.g. uint16 |
format | int | yes | the format value. Must be valid for 'format type' |
table | Table | yes | the Table for this format. |
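To make the role of 'format type' and 'format' concrete, the sketch below picks the matching entry from a list of formats. It assumes the format value sits at the very start of the table's data and is stored big-endian, and it handles only two scalar type names. The function `select_format` and the dict layout are hypothetical, chosen for illustration.

```python
import struct

# struct codes for the scalar type names this sketch understands (big-endian).
_SCALARS = {"uint8": ">B", "uint16": ">H"}


def select_format(data, formats):
    """Return the 'table' of the first entry whose format value matches the data.

    `formats` is a list of dicts keyed by the schema field names
    'format type', 'format', and 'table'.
    """
    for entry in formats:
        code = _SCALARS[entry["format type"]]
        # Assumption: the format value is the first thing in the table data.
        (value,) = struct.unpack_from(code, data, 0)
        if value == entry["format"]:
            return entry["table"]
    raise ValueError("no format entry matches the data")
```

For example, with two `uint16` entries whose 'format' values are 1 and 2, data beginning with the bytes `00 02` selects the second entry.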
A field is a named value at a given position in a Table or Record.
field | type | required | notes |
---|---|---|---|
name | string | yes | the name of the field |
type | Type | yes | the type of the field |
doc | string | yes | a short description of this field |
offset | OffsetInfo | no | required if this field is an offset |
count | CountInfo | no | required if this field is an array or sequence |
TK
CountInfo is additional information for computing the length of a sequence or array.
This has two parts. The first is the source of the count value, which is generally either the name of a sibling field or a literal. The second part identifies a possible transformation applied to this value.
field | type | required | notes |
---|---|---|---|
value | CountValue | yes | indicates the input value for computing the count |
transform | CountTransform | no | a token identifying a computation on the input value |
CountValue represents the source for the base input value used to compute the count.
field | type | required | notes |
---|---|---|---|
field | string | no | the name of a field or 'input arg' |
literal | int | no | a literal integer |
all | () | no | a flag indicating that the sequence consumes the rest of the table's data |
Exactly one of these fields must be present.
The count transform is an enum, serialized as an integer, with the following defined values:
name | value | function |
---|---|---|
MINUS_ONE | 1 | subtract 1 from the input |
DIVIDE_BY_TWO | 2 | divide the input value by 2 |
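The following sketch shows one way a reader might turn a CountInfo into an element count. The function `resolve_count`, the dict-based representation, and the choice of returning `None` for 'all' are assumptions for illustration. Integer division for DIVIDE_BY_TWO is also an assumption, since the schema does not say how odd values are handled.

```python
MINUS_ONE = 1
DIVIDE_BY_TWO = 2


def resolve_count(count_info, siblings):
    """Compute a count from a CountInfo dict.

    `siblings` maps already-parsed field (or input arg) names to integer values.
    Returns None when the sequence consumes the rest of the table's data.
    """
    value = count_info["value"]
    present = [key for key in ("field", "literal", "all") if key in value]
    if len(present) != 1:
        raise ValueError("CountValue must have exactly one of field/literal/all")

    if "all" in value:
        return None  # caller reads until the table data is exhausted

    base = siblings[value["field"]] if "field" in value else value["literal"]

    transform = count_info.get("transform")
    if transform is None:
        return base
    if transform == MINUS_ONE:
        return base - 1
    if transform == DIVIDE_BY_TWO:
        return base // 2  # assumption: integer division
    raise ValueError(f"unknown count transform: {transform}")
```

For instance, a count whose value names a sibling field holding 8 and whose transform is DIVIDE_BY_TWO resolves to 4.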
unhandled: Device table delta values
Awesome, ty. I like it, think it is valuable to pursue, and with my own biases fully intact think this would transform magnificently to something like toml :) I really want to try making a python reader off such a generic schema, I think that would be a very interesting exercise that might surface interesting things.
EDIT: at mild risk of overthinking things, maybe we could have an abnf. My immediate thought is a narrowing of https://github.com/toml-lang/toml/blob/main/toml.abnf.
Once #84 is done we're getting close to the codegen input being language agnostic. It "just" needs to be a form other than Rust and to only have attributes that make sense beyond Rust. Strawman to incite debate: use https://toml.io/en/, it's simple, widely supported, supports comments, and more than sufficient to capture what we need.
Today
Tomorrow