w3c / mathml

MathML4 editors draft
https://w3c.github.io/mathml/

Proposed intent with functional notation #451

Closed davidfarmer closed 1 year ago

davidfarmer commented 1 year ago

Proposal for a functional grammar for intent

This seems workable on the examples we have discussed, with proper markup (meaning judicious use of mrows) and recognizing that micromanaging the pronunciation often makes things worse.

Challenging cases welcome, of course! In particular, examples that were headless or needed a leading underscore under other proposals.

{comments in curly brackets}

knownintent       = we have to decide on the list {e.g., 'absolute-value', "superscript", "number", ...}
reflist           = '(' '$' identifier (',' '$' identifier)* ')'
namereflist       = '(' names (',' '$' identifier)* ')'
identifier        = letter+
letter            = we have to decide what is a "letter"
name              = letter+ (('-'|\s) letter+)*   {note: spaces are allowed.  Not a deal-breaker.}
names             = name ('|' name)*
numbervalue       = we have to decide if . and , are both allowed, and minus sign { why not? }
type              = 'named' | 'adhoc' | 'value' { "isa" is sort-of a type, but is treated differently }
category          = 'function' | 'group' | 'number' | 'operation' | 'system-of-equations' | ...  {several more things}

intent            = (knownintent reflist?) |
                    ("isa" ":" category) |
                    (type ":" category namereflist) |
                    (type ":number" "(" numbervalue ")")
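To make the grammar concrete, here is a rough regex-based sketch of a parser for the four alternatives as written. Several things are my assumptions, since the proposal leaves them open: letters are ASCII only, a number is an optional minus sign plus digits with at most one '.' or ',', whitespace is tolerated around commas, and KNOWN_INTENTS is a placeholder list. The function name parse_intent is made up for illustration.

```python
import re

# Assumptions: ASCII letters only; placeholder list of known intents.
KNOWN_INTENTS = {"absolute-value", "superscript", "number"}

NAME = r"[A-Za-z]+(?:[-\s][A-Za-z]+)*"      # spaces and hyphens allowed inside
CATEGORY = r"[A-Za-z]+(?:-[A-Za-z]+)*"      # e.g. system-of-equations
REF = r"\$[A-Za-z]+"                        # '$' identifier
NUMBER = r"-?\d+(?:[.,]\d+)?"               # assumed shape of numbervalue
TYPE = r"named|adhoc|value"

RULES = [
    ("isa", re.compile(rf"isa:(?P<category>{CATEGORY})")),
    ("typed-number", re.compile(
        rf"(?P<type>{TYPE}):number\((?P<value>{NUMBER})\)")),
    ("typed", re.compile(
        rf"(?P<type>{TYPE}):(?P<category>{CATEGORY})"
        rf"\((?P<names>{NAME}(?:\|{NAME})*)(?P<refs>(?:\s*,\s*{REF})*)\)")),
    ("known", re.compile(
        rf"(?P<head>{CATEGORY})(?:\((?P<refs>{REF}(?:\s*,\s*{REF})*)\))?")),
]


def parse_intent(text):
    """Return a dict describing the intent, or None if nothing matches."""
    for kind, pattern in RULES:
        match = pattern.fullmatch(text)
        if match is None:
            continue
        result = {"kind": kind}
        for key, value in match.groupdict().items():
            if value is None:
                continue
            if key == "names":
                result[key] = value.split("|")
            elif key == "refs":
                # Keep only the identifiers, without the leading '$'.
                result[key] = re.findall(r"\$([A-Za-z]+)", value)
            else:
                result[key] = value
        if kind == "known" and result["head"] not in KNOWN_INTENTS:
            return None
        return result
    return None


print(parse_intent("absolute-value($x)"))
print(parse_intent("named:function(Bessel function of the first kind|Bessel-J)"))
```

The rules are tried in order, so a "type:number(...)" value with a numeric argument is recognized before the general "type:category(names, refs)" form.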

Example of knownintent

<mrow intent="absolute-value($x)"><mo>(</mo><mi arg="x">A</mi><mo>)</mo></mrow>
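As a sketch of how a consumer might connect "$x" to the element marked arg="x", the snippet below uses Python's standard xml.etree.ElementTree. The helper name resolve_references is hypothetical, and I am assuming ASCII identifiers and that the referenced element is a descendant of the element carrying the intent.

```python
import re
import xml.etree.ElementTree as ET

MATHML = (
    '<mrow intent="absolute-value($x)">'
    '<mo>(</mo><mi arg="x">A</mi><mo>)</mo>'
    '</mrow>'
)


def resolve_references(element):
    """Map each $name in the intent attribute to the text of its arg element."""
    names = re.findall(r"\$([A-Za-z]+)", element.get("intent", ""))
    resolved = {}
    for name in names:
        # Look for any descendant whose arg attribute equals the name.
        target = element.find(f".//*[@arg='{name}']")
        resolved[name] = None if target is None else "".join(target.itertext())
    return resolved


print(resolve_references(ET.fromstring(MATHML)))
```

A reference with no matching arg attribute maps to None here, which a real implementation would presumably treat as an authoring error.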

Example of isa

<mrow intent="isa:system-of-equations">ABC</mrow> tells AT (assistive technology) that ABC is a system of equations. Similarly for isa:matrix and isa:cases.

Things like isa:operation or isa:group probably have no effect initially.

Examples of named

<mi intent="named:extension(free algebra)">R</mi>

<mi intent="named:function(Bessel function of the first kind|Bessel-J)">J</mi>

The "named" type is used to indicate that the item has an existing name. The "|" separates different names, with the more verbose coming first. (We can consider omitting the option to have multiple names.)

The "named" type tells AT that it can use the literal value if desired.

Examples of adhoc

<mo intent='adhoc:operation(foo)'>&#x229e;</mo>

(That symbol, U+229E, is a plus sign in a box.)

The "adhoc" type is used to indicate that the author is making up the name, or that the name is nonstandard. There should not be "|" alternatives for an adhoc name.

The "adhoc" type tells AT that it can use the literal value if desired.

Examples of value

<mo intent='value:operation(times)'>*</mo>

The "value" type is used to indicate that AT should use that value instead of the literal content. There should not be "|" alternatives for a value name. (The "value" type is implicit for knownintent.)
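The difference between the three types can be summarized in a few lines. This is only a sketch of one possible AT policy: the function name text_to_speak and the prefer_literal switch are hypothetical, and choosing the first (most verbose) name is my assumption.

```python
def text_to_speak(intent_type, names, literal, prefer_literal=False):
    """Pick what AT might say for a typed intent.

    prefer_literal is a hypothetical AT setting, not part of the proposal.
    """
    if intent_type == "value":
        # The author's value replaces the literal content.
        return names[0]
    if intent_type in ("named", "adhoc"):
        # AT may use the literal content instead of the name if it prefers.
        return literal if prefer_literal else names[0]
    raise ValueError(f"unknown intent type: {intent_type}")


print(text_to_speak("value", ["times"], "*"))
print(text_to_speak("named", ["Bessel function of the first kind", "Bessel-J"], "J",
                    prefer_literal=True))
```

With "value" the literal "*" is never spoken, while "named" and "adhoc" leave the choice to AT.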

Some special cases

The "superscript" core intent is used to have the correct pronunciation for things like

<msup><mi>H</mi><mn intent="superscript">2</mn></msup>

While it is true that (in the context of (co)homology) a person would pronounce H^2 as "H 2", they also would pronounce H_2 the same way. The superscript intent tells AT that the 2 is just a superscript/index, not a power, so it will probably say "H sup 2", which is better.

The "number" intent (is it too confusing to have it as both an intent and a category?) covers cases which were mentioned on a call, such as:

<mrow intent="number(3.14)"><mn color="red">3</mn><mo>.</mo><mn color="blue">14</mn></mrow>

In this, and many examples, it is necessary to have suitable mrows in order to fit the proposed intent grammar. (This allows keeping numbers out of the arguments of intent, except inside the number intent or number category.)
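One thing a checker could do with the number intent is confirm that the declared value agrees with the text spread across the child elements. The sketch below is a hypothetical lint, not something the proposal requires; the function name number_intent_matches and the accepted number shape (optional minus sign, digits, at most one '.' or ',') are assumptions.

```python
import re
import xml.etree.ElementTree as ET

MATHML = (
    '<mrow intent="number(3.14)">'
    '<mn color="red">3</mn><mo>.</mo><mn color="blue">14</mn>'
    '</mrow>'
)


def number_intent_matches(element):
    """True if intent is number(N) and N equals the element's joined text."""
    match = re.fullmatch(r"number\((-?\d+(?:[.,]\d+)?)\)",
                         element.get("intent", ""))
    if match is None:
        return False
    return "".join(element.itertext()) == match.group(1)


print(number_intent_matches(ET.fromstring(MATHML)))
```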

Some special features

In many cases, at least with the initial implementations, the "category" is ignored and

intent="named:X(foo)" is probably pronounced "foo", no matter the category X.

In some cases the "category" can be a useful signal to AT. For example, if the category is "function" then AT can know to say "of" before the reference.

Otherwise, the AT just says the name and the references in order.
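The speech rule described here is simple enough to sketch. The function name speak is made up, and the references are assumed to have been turned into spoken strings already.

```python
def speak(name, category, spoken_refs):
    """Join a name and its already-spoken references into one phrase."""
    words = [name]
    if category == "function" and spoken_refs:
        # For functions, say "of" before the references.
        words.append("of")
    words.extend(spoken_refs)
    return " ".join(words)


print(speak("Bessel-J", "function", ["x"]))
print(speak("foo", "operation", ["a", "b"]))
```

Any category other than "function" falls through to the default of saying the name followed by the references in order.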