w3c / mathml

MathML4 editors draft
https://w3c.github.io/mathml/

Deprecate/Remove mlabeledtr #72

Open fred-wang opened 5 years ago

fred-wang commented 5 years ago

https://mathml-refresh.github.io/mathml/chapter3.html#presm.mlabeledtr

mlabeledtr is supposed to allow equation labels that you can easily refer to elsewhere, for example:

<p>blah blah</p>

x + y = z                 (5.1)

<p>blah blah see <a href="#id">(5.1)</a> blah blah</p>

MathML3's suggestion is to implement it in a MathML <mtable> via a special <mlabeledtr> row (i.e. <mtr> plus an extra label cell). I guess this makes sense in the old "standalone spec" paradigm, where each MathML formula is considered an isolated document, so that specialized math editing/rendering/AT tools can properly handle it. But that does not seem very useful in the Web context, and it duplicates existing features: you could just use HTML/CSS layout to produce this labelling effect, just as one writes "Figure 1" in HTML/CSS for images (even SVG ones). LaTeXML and Wikipedia do that (e.g. https://en.wikipedia.org/wiki/Fourier_transform#Definition ).

Additionally, the exact positioning of the label is a bit obscure in MathML3 and would have to be rewritten (maybe as part of a general tabular math revamp for CSS) to be compatible with browser layout, implementable and testable. It's not implemented in any browser, and AFAIK there is no plan to do so in the short to medium term. At this point, it does not seem to be a feature that meets the criteria for inclusion in MathML Core IMHO.

Currently MathML Core just treats <mlabeledtr> as an <mtr> whose label has display: none ( https://mathml-refresh.github.io/mathml-core/#tables ), but that is a bit confusing/misleading out of context, especially if you are only familiar with HTML tables.

Hence the proposal is to either (1) remove <mlabeledtr> from core for now (my personal preference), or (2) keep the fallback in core but add some rationale (backward compatibility? allowing users to override the default behavior?).

Last but not least, there is also the usual question of whether we want to deprecate/remove it from the full MathML spec and write polyfills in https://github.com/mathml-refresh/mathml-polyfills.

cc @bkardell

davidcarlisle commented 5 years ago

I would drop it from core and keep it in full. In a CSS world the extra label "column" makes things confusing, but in other contexts having explicit source markup for labels is useful, and dropping the mlabeledtr in any transformation to a document matching the core spec is likely to be simple enough.
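As a rough illustration of how small such a transformation could be, here is a minimal sketch in TypeScript. It assumes a simplified tree type (`MathNode`, a hypothetical stand-in for a parsed MathML tree, not a DOM API) and a made-up function name `dropLabels`. It relies only on the MathML3 rule that the first child of an <mlabeledtr> is the label.

```typescript
// Hypothetical, simplified model of a parsed MathML tree (not a DOM API).
interface MathNode {
  tag: string;
  attrs?: Record<string, string>;
  children?: MathNode[];
  text?: string;
}

// Return a copy of the tree in which every <mlabeledtr> becomes an <mtr>
// and its first child (the label cell) is dropped. Attributes such as `id`
// are carried over unchanged; the input tree is not modified.
function dropLabels(node: MathNode): MathNode {
  const isLabeled = node.tag === "mlabeledtr";
  const copy: MathNode = { ...node, tag: isLabeled ? "mtr" : node.tag };
  if (node.children) {
    const kept = isLabeled ? node.children.slice(1) : node.children;
    copy.children = kept.map((child) => dropLabels(child));
  }
  return copy;
}
```

A real converter would probably want to keep the label somewhere outside the <math> element rather than discard it; this sketch only shows the row rewrite.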

NSoiffer commented 5 years ago

Based on survey results so far, none of the converters generate it, so usage is likely very low. That argues very strongly that it shouldn't be part of core.

The main reason for having this is to make sure the baseline of the label aligns with the baseline of the equation it is labeling. This property is very important to publishers. How can this be done, especially with a multi-line equation (often labeled as 6.1 (a), 6.1 (b), etc.)?

pdfion commented 5 years ago

As usual, in my view, the point of this construct was to closely associate a label that might later be referenced with a piece of a displayed formula. It was an attempt to tie some mathematically significant things together, rather than just to allow a presentation with a formula number, say. If it isn't widely used at all, then remove it from Core by all means. But it was not introduced entirely for page layout, though baseline alignment helps provide a good visual cue for an association. It might be argued, I suppose, that other ways of giving document parts IDs are readily available.

bkardell commented 5 years ago

It seems that we have idrefs and labels and links and all sorts of things in the platform that we can use toward these purposes. If it feels like it is replicating something that could be done with the existing platform, it definitely seems like we should either cut it, or find a way to express its very specific implementation directly in known terms. If there is no legacy content, I would prefer to drop it.

NSoiffer commented 5 years ago

I was tasked with getting MathJax stats on this if there were any. I asked, and they don't have usage stats for any features.

As I mentioned earlier, the point is to get the baseline of the label to align with the baseline of the equation it is labeling. If that can be done in HTML/CSS, then let's drop this from core. To be concrete:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<mtable>
  <mlabeledtr id='e-is-m-c-square'>
    <mtd>
      <mtext> (2.1) </mtext>
    </mtd>
    <mtd>
     <mrow>
       <mi>E</mi>
       <mo>=</mo>
       <mrow>
        <mi>m</mi>
        <mo>&#x2062;<!--INVISIBLE TIMES--></mo>
        <msup>
         <mi>c</mi>
         <mn>2</mn>
        </msup>
       </mrow>
     </mrow>
    </mtd>
  </mlabeledtr>
  <mlabeledtr id='v-is-lwh-over-3'>
    <mtd>
      <mtext> (2.2) </mtext>
    </mtd>
    <mtd>
     <mrow>
       <mi>V</mi>
       <mo>=</mo>
       <mfrac>
         <mrow>
           <mi>l</mi>
           <mi>w</mi>
           <mi>h</mi>
        </mrow>
        <mn>3</mn>
       </mfrac>
     </mrow>
    </mtd>
  </mlabeledtr>
</mtable>
</math>

This should align the baseline of each label with the baseline of its equation, as in the image.

A better example probably would have been something like

   y = ....           label 1
     = ...            label 2
     = ...            label 3

where the exprs have different heights/depths

NSoiffer commented 2 months ago

I scraped 50 ebooks written in PreTeXt. Most are college-level textbooks, with some aimed at lower-level subjects. PreTeXt uses TeX for the math, and the math is converted to MathML by MathJax. In those 50 books, there are 449,564 expressions, with 26% of them being trivial expressions (a number, identifier, operator, or text). Relevant to this discussion, there were 28,238 mtables, with 69,738 mtrs and 1,405 mlabeledtrs. That's about 2% of the rows. So it does have some use in practice.

dginev commented 2 months ago

To me it appears that the tricky representation issue with <mlabeledtr> is that the classic equation labels are not part of the content of an equation. Rather, they are visual labels that anchor the equation in the context of an outer document (or other host context).

Which is probably why even host languages such as JATS, which delegate all of math syntax to MathML, use their own labeling vocabulary. There is a revealing example in <disp-formula-group>, copying here:

<disp-formula-group>
<disp-formula id="formula-qf-1">
<label>(1)</label>
<mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
<mml:mrow>...</mml:mrow>
</mml:math>
</disp-formula>

<disp-formula id="formula-qf-2">
<label>(2)</label>
<mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
<mml:mrow>...</mml:mrow>
</mml:math>
</disp-formula>

<disp-formula id="formula-qf-3">
<label>(3)</label>
<mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
<mml:mrow>...</mml:mrow>
</mml:math>
</disp-formula>
</disp-formula-group>

It's a debatable case for removal, but I could imagine arguing either side.

NSoiffer commented 2 months ago

How does JATS deal with labeling rows in an mtable? That's a common case (labeling steps in a derivation where each step is a row or two in a table). You need to align the label with the row.

I can imagine that in the web platform it might be possible to have an id on an mtr, have a label that references that id, and write some JS that figures out the baseline of the mtr so it can do the alignment, and then positions the label accordingly. Does anyone know if this is just daydreaming or is actually achievable?
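The positioning half of that idea is simple arithmetic once the measurements exist. The sketch below is only that half, in TypeScript, and it assumes the hard part is already solved: that some script has measured, for each <mtr>, where its top edge and baseline are. The types `RowMetrics` and `LabelMetrics` and the function `labelOffsets` are hypothetical names for illustration. How to obtain a row baseline reliably in a browser is left open here.

```typescript
// Hypothetical measurements for one table row, in CSS pixels.
interface RowMetrics {
  id: string;       // id attribute of the <mtr>
  top: number;      // distance from the container's top to the row's top edge
  baseline: number; // distance from the row's top edge down to its baseline
}

// Hypothetical measurements for one label.
interface LabelMetrics {
  forId: string;  // id of the row this label refers to
  ascent: number; // distance from the label box's top edge to its baseline
}

// For each label, compute the `top` offset (relative to the container) that
// puts the label's baseline on the baseline of the row it references.
// Labels whose row id is unknown are skipped.
function labelOffsets(
  rows: RowMetrics[],
  labels: LabelMetrics[]
): Record<string, number> {
  const rowById: Record<string, RowMetrics> = Object.create(null);
  for (const r of rows) rowById[r.id] = r;

  const tops: Record<string, number> = Object.create(null);
  for (const l of labels) {
    const row = rowById[l.forId];
    if (row !== undefined) {
      tops[l.forId] = row.top + row.baseline - l.ascent;
    }
  }
  return tops;
}
```

The result would then be applied to absolutely positioned labels, and recomputed whenever fonts load or the layout changes.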

dginev commented 2 months ago

If we want an authoritative answer, we should probably forward such questions to the JATS group.

To my understanding the <disp-formula-group> approach is the dedicated mechanism, as showcased above. An <mtable> that holds multiple labeled rows may have to be chunked into separate <math> components for that to be compatible.

JATS is not usually concerned with visual rendering details (such as aligning a label to a row), which is something a downstream processor would need to design (e.g. a JATS-to-HTML stylesheet). LaTeXML has a variation on this theme, where an HTML <table> scaffold holds the equation numbers, and each HTML <td> has its own <math> element holding a fragment of the original formula, or its associated label. There are various other ways to render something similar in HTML (CSS grid ought to work as well).
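To make the CSS grid variant concrete, here is a minimal sketch in TypeScript of what a downstream processor might emit. It is not LaTeXML's actual output. The names `LabeledStep`, `escapeHtml` and `renderLabeledSteps`, and the `eq-label` class, are made up for illustration. It also assumes that the browser honors `align-items: baseline` for <math> grid items, which would need testing.

```typescript
// One labeled step: an id to link to, the visible label, and a MathML fragment.
interface LabeledStep {
  id: string;
  label: string;
  mathml: string; // children of a <math> element, assumed already well-formed
}

// Escape text that is placed into attribute values or element content.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a two-column grid: formula fragment on the left, label on the right.
// Each (math, label) pair is auto-placed into its own grid row, and
// `align-items: baseline` asks for the two items to share a baseline.
function renderLabeledSteps(steps: LabeledStep[]): string {
  const rows = steps.map(
    (s) =>
      `  <math id="${escapeHtml(s.id)}">${s.mathml}</math>\n` +
      `  <span class="eq-label">${escapeHtml(s.label)}</span>`
  );
  return (
    `<div style="display: grid; grid-template-columns: 1fr auto; align-items: baseline;">\n` +
    rows.join("\n") +
    `\n</div>`
  );
}
```

Each step can then be referenced from prose with an ordinary link such as <a href="#some-id">(1)</a>. As with any approach that chunks the formula into separate <math> elements, alignment across rows (for example of the "=" signs) is no longer handled by the <mtable> itself.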

What I wanted to point out is that the JATS approach appears to want to model the labels as document-level, while still using MathML for all math syntax.