n-ary Intent discussion

dginev commented 2 years ago

I'd like to move the n-ary operation discussion from #252 into a dedicated issue, also so that we can collect and discuss a few diverse examples.

Examples

Here are three I picked from the MathML already in "the wild". I will be attaching screenshots since GitHub has no supported way to render math expressions at the moment.

n-ary arrows

The first formula on this tungsteno page, referred to as a "chain of maps", uses 8 right arrows, each decorated from above with a d:

The presentation of each operator is:

<mover>
  <mo stretchy="true" minsize="3.0em">→</mo>
  <mpadded width="+0.6em" lspace="0.3em">
    <mi mathvariant="normal">d</mi>
  </mpadded>
</mover>

The second display formula on that page is similar, but the embellishment is different on each arrow.
Another interesting bit to notice is that since the MathML is KaTeX-generated, there is no auxiliary mrow structure at all. The top level of the expression has a single mrow, which contains ~40 direct children, one for each horizontal piece. The parenthetical Ω applications are not grouped, and there are no invisible Unicode characters anywhere in the expression.

Equation (2) in the document mentioned in the next section (arXiv:math/0007024) has an example using the long Unicode arrow U+27F6, while the one from tungsteno used the standard U+2192:

n-ary subset-of-or-equal

The first display formula in this latexml-converted arXiv:math/0007024 article:

Each operator is a simple element <mo>⊆</mo>.
Since the arguments are embellished, each is wrapped in a <msubsup>, so grouping in presentation follows content.
latexml succeeded in parsing the structure in this specific case, so the n-ary operation is wrapped in its own mrow and has clear scope. For instance, the trailing comma is outside the mrow.

n-ary (argument) lists

The derivation after equation (13) in the latexml-converted arXiv:math/0007026 is another one that caught my eye.

These are "just infix commas" (<mo>,</mo>), and each subscripted argument is held in its own <msub>. Since latexml succeeded in parsing the lists, they're also wrapped in a <mrow>, which is a tool-specific decision.

I'll keep this description examples-only, and we can discuss in the comments.

dginev commented 2 years ago

The n-ary arrows example, especially the embellished one, and then even more so the one with changing embellishments, poses an interesting question. If there is a chain of nearly-identical operations, the narration can still benefit from announcing that, but, at the focused level of reading, it should still mention how each operator differs from the rest. It's also challenging since the n-ary operation <mo> is hidden under various stacked markup, which could itself be diverse: munder, mover, msub, msup, msubsup, mpadded, and of course mrow.
The n-ary subset-of-or-equals and n-ary list examples are nice since they are "painful to look at" to some extent, due to the repetitiveness - I'd glance and think "some hierarchy of structures on M" and "some sequence over the indexes of X" respectively. I'd only start studying the individual terms when I have a purpose that requires going into depth. I think Neil had mentioned that these "narration summaries" are often a good motivation to have markup for the entire notation, even if the arguments are actually spoken as written (i.e. wouldn't need speech changes when spoken in isolation).

I think it's already easy to see the motivation behind the suggestions for "canonical markup" in presentation, and why it comes up so often. But I also understand there is strong consensus that we can't introduce anything new that is "canonical". If even a grouping mrow isn't a viable requirement, then we're left with some rather verbose choices for markup, which in my preferred syntax end up with the taste of $1($2,$3,$4,$5,$6,$7). But it's certainly possible... just may end up more verbose than people like?

Comments welcome!

davidcarlisle commented 2 years ago

I think the subset example and, to a lesser extent, the chained mapping arrows are a good example of the problems of mapping to a specific content grammar (whether Content MathML or OpenMath or whatever). I'd argue that unlike n-ary +, which can be constructed as a sum operator taking a list of arguments of arbitrary length, these are like a < b < c < d, where there is no n-ary < operator. It's more of a visual shorthand for (a < b) & (b < c) & (c < d), so the content structure starts to differ greatly from the presentation (even more than the example of a negative coefficient in a polynomial). It is possible to claim that it's something like is-monotonic(a,b,c,d), so really an n-ary operator, but....
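To make the "visual shorthand" reading concrete, here is a minimal Python sketch that evaluates a chain as the conjunction of its adjacent pairs. The helper name chain_holds is hypothetical and not part of any MathML tool; it only illustrates the (a < b) & (b < c) & (c < d) expansion described above.

```python
import operator


def chain_holds(values, relations):
    """Evaluate v0 r0 v1 r1 v2 ... as the conjunction of its adjacent pairs."""
    if len(relations) != len(values) - 1:
        raise ValueError("need exactly one relation between each adjacent pair")
    # Each triple is (left operand, relation, right operand).
    return all(rel(left, right)
               for left, rel, right in zip(values, relations, values[1:]))


lt = operator.lt
print(chain_holds([1, 2, 3, 4], [lt, lt, lt]))  # True
print(chain_holds([1, 3, 2, 4], [lt, lt, lt]))  # False: 3 < 2 fails

# On Python sets, operator.le is "subset of or equal", as in the second example.
print(chain_holds([{1}, {1, 2}, {1, 2, 3}], [operator.le, operator.le]))  # True
```

Note that the relations are passed as a list, so a chain with a different relation at each step (like the arrows with changing embellishments) fits the same shape.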

As such, I think pulling out the arguments as $1($2,$3,$4,$5,$6,$7) or similar doesn't really help either accessibility reading or computational evaluation of the expression. I'd be quite happy to see something like

<mrow intent="just read the content"> or <mrow intent="de Rham complex of the content">

which could perhaps be intent="@" or intent="de-Rham-complex(@)" if we had fixed a syntax for special forms.

Then each annotated arrow could be <mover intent="homomorphism"> without worrying if the annotations were the same or different in each map in the chain.

All of the above is basically a long-winded way of saying that I think if we target speech, but with a view to allowing conversion to Content MathML where possible, we should be able to handle these cases relatively easily (just accepting that conversion to Content MathML fails here). But if we want to have an intent markup that disambiguates these chains, especially ones with ..., it will require a very complicated inline language inside the intent. At that point it really loses its attraction, and we still have the option of using <semantics> and providing a content form explicitly.

brucemiller commented 2 years ago

As @davidcarlisle says, these formulas are much like $a < b < c < d$, for which there's no normal n-ary operator. An explicit intent would end up looking like intent="and($2($1,$3),$4($3,$5),...)", which is ugly and presumably has pretty bad speech. Even extra or canonical mrows don't really give you that structure. One could introduce a magic operator like LaTeXML's multirelation which expects alternating objects and relations. So then intent="multirelation(@)" might have the same semantics as above, but the speech would be the same as reading it as-is, so it arguably isn't needed for speech.
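As a rough illustration of how mechanical (and how verbose) that explicit intent is, the sketch below builds the string for a flat row of alternating operands and relations. The function name multirelation_intent is hypothetical, and the $n positional references are the draft notation used in this thread, not a confirmed final intent syntax.

```python
def multirelation_intent(child_count):
    """Build and($2($1,$3),$4($3,$5),...) for a row whose children alternate
    operand, relation, operand, ... (children are numbered from 1)."""
    if child_count < 3 or child_count % 2 == 0:
        raise ValueError("expected an odd number of children, at least 3")
    # Relations sit at the even positions 2, 4, ..., child_count - 1.
    parts = [f"${i}(${i - 1},${i + 1})" for i in range(2, child_count, 2)]
    # A single relation needs no surrounding and(...).
    return parts[0] if len(parts) == 1 else "and(" + ",".join(parts) + ")"


print(multirelation_intent(7))  # and($2($1,$3),$4($3,$5),$6($5,$7))
```

The string grows linearly with the chain, which is the "ugly" part. A ~40-child row like the tungsteno one would only fit this shape if its pieces were first grouped into alternating operands and arrows.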

BTW, if I understood your middle comment, that's what I had in mind with reading "as is" (whether that requires a special intent attribute or not): If any descendants have an intent, they would not be read as is, but their given intent would be used. Thus, in the tungsteno case, the outer mrow is as-is, but each of the embellished operators could get an appropriate intent.
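A minimal sketch of that "as is, unless a descendant has an intent" reading, using Python's standard xml.etree.ElementTree. The function read_as_is and the sample fragment are hypothetical simplifications (no MathML namespace, made-up operands A and B, and the intent value "homomorphism" borrowed from the suggestion above); real AT would of course produce speech rather than a token list.

```python
import xml.etree.ElementTree as ET


def read_as_is(node):
    """Return spoken tokens: an element's intent if it has one,
    otherwise its own content read left to right."""
    intent = node.get("intent")
    if intent is not None:
        return [intent]          # the given intent replaces the as-is reading
    if len(node) == 0:           # leaf token such as <mi>, <mn>, <mo>
        text = (node.text or "").strip()
        return [text] if text else []
    words = []
    for child in node:           # no intent here: recurse into the children
        words.extend(read_as_is(child))
    return words


# Hypothetical fragment: only the first arrow carries an intent.
SAMPLE = """<mrow>
  <mn>0</mn>
  <mover intent="homomorphism"><mo>→</mo><mi>d</mi></mover>
  <mi>A</mi>
  <mover><mo>→</mo><mi>d</mi></mover>
  <mi>B</mi>
</mrow>"""

print(read_as_is(ET.fromstring(SAMPLE)))
# ['0', 'homomorphism', 'A', '→', 'd', 'B']
```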

Finally, my summary is the fear that if we try to target speech and conversion to content mathml, we'll end up doing neither very well.

dginev commented 2 years ago

Comment on example 1:

I had the fortune of finding this nice example spoken in a real lecture that was posted online: https://youtu.be/QLnzIOGIvfo?t=4864

Range of observations:

We hear "apply the d" and "use again the d" for the decorated right arrow. Each arrow is presented as a separate step, while the entire expression is akin to an "itinerary" (to avoid saying "map" and overloading the word further) of where the d-application operator takes the structure.
"From there I go to the n+1 forms by the d"
"I go by d d d" is spoken for the first ellipsis.
"and so it continues until" is spoken for the final ellipsis.
Curious aside: "sequence of maps" is used, clearly a synonym/alias for "chain of maps".

So this definitely feels n-ary, in the sense of an ordered sequence of identical steps. The lecturer appears to try to teach us the mathematical lesson that "applying the d" increases the degree of the underlying form by one, until a special terminating condition is met (the "top form").

Note that just using the narration from the YouTube video may be enough to implement good AT behavior for this specific notation, without ever asking "how do we formalize this?". I never need to learn what the underlying "form" is, and what "applying the d" does to the contents of the form. I also don't need to decide whether the single expression needs to be chunked into smaller propositions (the and() approach), or whether this is a special kind of relation (the multirelation approach).

If the arrow is annotated as a map (which is what "chain of maps" and "sequence of maps" suggest), with some appropriate syntax that also attaches the d, likely on the mover, the AT should be able to narrate this reasonably. But whether additional annotations are needed to mention this is the same map used as n-ary infix, or a baseline read through the 40 horizontal siblings will do well enough, that I am still unsure of.

Aside: I don't believe de-Rham-complex will ever be used as part of the expression narration. But it may be valuable to capture the phrase from the text before the mathml expression (the text in the tungsteno page that is) in a span with an id, and then use aria-describedby to connect the two. Then a reader trying to learn more could get de-Rham-complex as secondary speech, if they request it.

Aside 2: I quite enjoyed the explanation of the zero on the blackboard, which ought to reach MathML as <mn>0</mn>, as "the vector space consisting of only one element". So that <mn>0</mn> was not at all the <cn>0</cn>.

Edit: Today I got curious what the d in question was and opened the wiki page for De Rham cohomology for a more detailed read. It explicitly stated "with the exterior derivative as the differential" in describing it. This is another step of formalization I skipped over, and that can/should be skipped over, when relaying the "d" for AT, even though it's likely crucial for formalizing the expression. Turns out the <mi>d</mi> is not just a <ci> in Content MathML, it's some <csymbol> representing the exterior-derivative. As far as AT was concerned, a secondary pointer via aria-describedby could have connected the individual ds to language in the page that presented that explanation.

example 4:

de Rham complexes and cohomologies turn out to be a nice place to look for examples. There is also a very recognizable use of n-ary "direct sum", I believe meant to be written with the Unicode circled plus ⊕ (U+2295): https://youtu.be/QLnzIOGIvfo?t=1502

It is very similar to the way n-ary plus is used (the speaker even shortens it to "plus" when he gets tired of repeating "direct"), and I think it is also just fine for AT without further annotations beyond <mo intent="direct-sum">⊕</mo>.

comment on Content MathML overlap:

I'm certainly open to leveraging any overlap so that the known computable cases work for both AT and computation. But, in my view, we should not make AT subordinate to formalization, or we reframe the problem in a way where we are unable to speak before we build a piano and learn how to play a classical orchestra score.

dginev commented 2 years ago

example 5:

Back to K12, here is an n-ary arithmetic expression, with other horizontal operators at the prefix/postfix positions:

-w+x+y+z!

<!-- actual tree generated by KaTeX -->
<mrow>
  <mo>−</mo>
  <mi>w</mi>
  <mo>+</mo>
  <mi>x</mi>
  <mo>+</mo>
  <mi>y</mi>
  <mo>+</mo>
  <mi>z</mi>
  <mo stretchy="false">!</mo>
</mrow>

If we wanted a content form here we'd need to either:

"parse" the presentation tree. Then the n-ary rule depends on having proper rules (and precedence) of unary minus and factorial (or whichever other prefix/postfix operators get used).
prepare the correct mrows in advance, to avoid the parsing question, i.e. defer to authoring tools
annotate the content tree in full in the intent value $3($1($2),$4,$6,$9($8)).
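To show how much the first and third options depend on operator rules, here is a small Python sketch that parses the flat token list and emits a positional intent string. Everything in it is an assumption for illustration: the function name positional_intent, the three operator tables, the chosen precedence (postfix binds tighter than prefix, prefix tighter than +), and the $n notation, which is the draft syntax used in this thread. Counting the nine children of the mrow from 1, the factorial sign is $9 and its argument z is $8.

```python
PREFIX = {"−", "-"}   # unary minus (U+2212 or ASCII hyphen)
POSTFIX = {"!"}       # factorial
INFIX = {"+"}         # the n-ary operator


def positional_intent(tokens):
    """Parse a flat token list and return an intent string whose $n
    references are 1-based positions in that list."""
    pos = 0  # index of the next unread token

    def ref(i):
        return f"${i + 1}"

    def term():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        if tokens[pos] in PREFIX:
            op = ref(pos)
            pos += 1
            return f"{op}({term()})"     # prefix applies to the whole term
        if tokens[pos] in INFIX or tokens[pos] in POSTFIX:
            raise ValueError(f"operand expected at position {pos + 1}")
        result = ref(pos)
        pos += 1
        while pos < len(tokens) and tokens[pos] in POSTFIX:
            result = f"{ref(pos)}({result})"
            pos += 1
        return result

    args = [term()]
    head = None
    while pos < len(tokens) and tokens[pos] in INFIX:
        if head is None:
            head = ref(pos)              # the first infix sign names the operation
        pos += 1
        args.append(term())
    if pos != len(tokens):
        raise ValueError(f"unexpected token at position {pos + 1}")
    return args[0] if head is None else f"{head}({','.join(args)})"


print(positional_intent(["−", "w", "+", "x", "+", "y", "+", "z", "!"]))
# $3($1($2),$4,$6,$9($8))
```

Changing any of the three tables changes the result, which is exactly the burden a remediator would take on with the "parse" option.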

If instead we (actually, "a remediator from a small publisher" is the right imaginary actor) were eager to make concessions and only aim to reach an acceptable narration, the only bit to add may be an intent="factorial" on the !, and be content with the baseline presentation read here.
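The minimal remediation described here could be sketched as follows, using Python's standard xml.etree.ElementTree on the KaTeX tree above. The helper mark_factorial is hypothetical, the sample omits the MathML namespace (with it, the tag would be "{http://www.w3.org/1998/Math/MathML}mo"), and treating every ! as a factorial is an assumption a real remediator would have to check.

```python
import xml.etree.ElementTree as ET

KATEX_TREE = """<mrow>
  <mo>−</mo>
  <mi>w</mi>
  <mo>+</mo>
  <mi>x</mi>
  <mo>+</mo>
  <mi>y</mi>
  <mo>+</mo>
  <mi>z</mi>
  <mo stretchy="false">!</mo>
</mrow>"""


def mark_factorial(root):
    """Set intent="factorial" on every <mo>!</mo> that has no intent yet.
    Returns the number of operators that were marked."""
    count = 0
    for mo in root.iter("mo"):
        if (mo.text or "").strip() == "!" and mo.get("intent") is None:
            mo.set("intent", "factorial")
            count += 1
    return count


root = ET.fromstring(KATEX_TREE)
mark_factorial(root)
print(ET.tostring(root, encoding="unicode"))
```

Everything else in the row is left untouched and falls back to the baseline presentation reading.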

Am I missing other options?

brucemiller commented 2 years ago

Exactly!

davidcarlisle commented 2 years ago

I think, agreeing with Deyan's comment above

If instead we (actually, "a remediator from a small publisher" is the right imaginary actor) were eager to make concessions and only aim to reach an acceptable narration, the only bit to add may be an intent="factorial" on the !, and be content with the baseline presentation read here.

That is, as intent is currently drafted, we don't have a great syntax for long n-ary chains expressed as infix operators. But I think it's enough for a first version at least.

For example, in the long chain of d arrows in the initial comment we don't have a good (or at least concise) intent syntax for the whole chain, but perhaps that doesn't matter and it's enough to have, say,

<mover intent="exterior-derivative"> → d</mover>

on the individual terms and let the chain itself just take the default reading.

NSoiffer commented 1 year ago

The group decided that explicit intent (listing everything else) or placing no intent is good enough. At least for now...

w3c / mathml