w3c / mathml

MathML4 editors draft
https://w3c.github.io/mathml/

Remove/deprecate/simplify the ms element #120

Open bkardell opened 5 years ago

bkardell commented 5 years ago

I've been googling for some time this morning, looking for some example of <ms> use that is somehow not just effectively <math><ms>hello world</ms></math>. For most elements I've looked up, I can find some reasonably simple examples of something that is recognizable as "mathy" content and helps me understand what it does, and what it is for. But, for <ms>, I can't.

Can someone please provide an example demonstrating a real (simple, hopefully) use of this element and why it is useful/necessary to have this particular distinction? If not, can we just drop it from core?

davidcarlisle commented 5 years ago

it essentially displays the same as mtext, but defaulting to a monospace font and surrounded by quotes. so it could in theory be dropped from core but it is a useful distinction especially when using math layout for computer code rather than "pure math" to maintain a distinction between strings which are values and text which is part of the surrounding sentence structure so

str1=foo   or   str1=bar

so the ms foo and bar are values in the expression, but the mtext "or" is just part of the sentence, inlined into the display for layout reasons.

the justification is actually very similar to that of mn, which also displays pretty much as mtext, but numeric values, like string values, often have a special role.

bkardell commented 4 years ago

Thanks David. I'm wondering still though: can you provide an example demonstrating an actual/real (simple, hopefully) use of this element? In other words, not a thing that is showing what it does with foos and bars, but - some actual math content that helps me (or someone else) understand how it serves an actually useful purpose?

davidcarlisle commented 4 years ago

here at work we generate a lot of code from mathml and

as ms, 123=123.0 is string comparison and false, and as mn, 123>=123.0 is numeric comparison and probably true. If you didn't have ms for 123 you would presumably use mtext "123", which displays the same but is harder to use for code generation as it mixes the actual string value of 123 with the string literal syntax, so for example you'd have to work harder for it to be equal to '123' or for "123' to be an error.

One could argue that core is only focussed on presentation in a browser, in which case ms and mn are both pretty much replaceable by mtext, but there are many benefits to having the math in the browser page also be processable as math, and distinguishing numeric and string literals is a more or less universal feature of any programming language that you want to use to process the expressions.
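
The code-generation point can be illustrated with a minimal sketch. It assumes a hypothetical, already-parsed token shape (`kind` and `text`) and invented helper names (`emitLiteral`, `emitComparison`, `evaluate`); none of these come from an actual MathML tool. It uses equality for both comparisons to keep the contrast simple, and emits JavaScript source in which ms content becomes a string literal and mn content a numeric literal.

```javascript
// Hypothetical token shape: { kind: "ms" | "mn", text: string }.
function emitLiteral(token) {
  if (token.kind === "ms") {
    // The token carries only the string value; JSON.stringify adds the
    // literal syntax (quotes and escapes) at generation time.
    return JSON.stringify(token.text);
  }
  if (token.kind === "mn") {
    const value = Number(token.text);
    if (Number.isNaN(value)) {
      throw new Error("not a number: " + token.text);
    }
    return String(value);
  }
  throw new Error("unsupported token kind: " + token.kind);
}

function emitComparison(left, right) {
  return emitLiteral(left) + " === " + emitLiteral(right);
}

// Evaluate a generated expression (for demonstration only).
const evaluate = (src) => Function("return (" + src + ");")();

const asStrings = emitComparison(
  { kind: "ms", text: "123" },
  { kind: "ms", text: "123.0" }
);
const asNumbers = emitComparison(
  { kind: "mn", text: "123" },
  { kind: "mn", text: "123.0" }
);

console.log(asStrings, "->", evaluate(asStrings)); // string comparison
console.log(asNumbers, "->", evaluate(asNumbers)); // numeric comparison
```

With mtext "123" instead, the generator would first have to strip and validate the quote characters before it could recover the value, which is the extra work described above.
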
fred-wang commented 4 years ago

cc @emilio @rwlbuis

Some arguments against ms:

  1. It is only implemented in Gecko, not in WebKit. I'm still really skeptical we should upstream our support to Chromium given the points described here, which make the justification weak for an intent-to-implement.

  2. It is not used a lot in complex math formulas? At least not as string literals? I wonder how many MathML formulae actually use ms and how often authoring tools generate it.

  3. The CSS implementation is broken in RTL mode: https://github.com/mathml-refresh/mathml/issues/126

  4. In general generating text that is not in the DOM is bad for native implementation (need to create anonymous layout nodes, need special handling in the a11y tree etc). This is similar to issues explained for removing mfenced (#2). More specifically here:

    • There is already a precedent, see this Mozilla security bug (now public) that led to the CSS implementation above: https://bugzilla.mozilla.org/show_bug.cgi?id=476547
    • The CSS implementation relies on :before { content: ... } and :after { content: ... } which are known to be a pain for accessibility.

That said, I understand the argument that people would like to keep some kind of "semantic" for string literals. One possibility is to keep ms in Core but say that it does not add automatic quotes (i.e. behaves like mtext). People could write a CSS polyfill for non-RTL mode using the :before/:after rules, or a JS polyfill that actually changes the text content of ms.

davidcarlisle commented 4 years ago

I doubt ms is used much in "classical" mathematical texts, but it is very natural when documenting programs or pseudocode etc., where a "string" is usually just as primitive a datatype as a "number"

but... I understand the concerns re picking up attributes.

> One possibility is to keep ms in Core but say that it does not add automatic quotes (i.e. behaves like mtext).

To make this slightly more distinguished from mtext it could default to monospace?

Also would the css concerns be as great if you dropped support for the attributes and always used a straight double quote " at each end? (I suspect it wouldn't help as much as I'd like as you would still be inserting text but I thought it worth asking...)

fred-wang commented 4 years ago

> To make this slightly more distinguished from mtext it could default to monospace?

Adding default rule font-family: monospace; or text-transform: math-monospace; would work and not cause any issue.

> Also would the css concerns be as great if you dropped support for the attributes and always used a straight double quote " at each end? (I suspect it wouldn't help as much as I'd like as you would still be inserting text but I thought it worth asking...)

Supporting only the default value still requires modifying the tree, which is not great for text selection, copy & paste, search or accessibility (I suspect Firefox's implementation is broken in that regard). However, it's true that it would allow fixing #126, since both quotes are the same.

davidcarlisle commented 4 years ago

I think I am coming round to the idea of dropping ms. Dropping the element entirely is easy to understand (and document), but if we put in a cut-down ms that doesn't add text then

<ms>wibble</ms>

will or will not add quotes depending on whether it is interpreted by a full mathml system or mathml core.

So while given a choice I'd rather have ms, if it is causing you issues and time that would be better spent elsewhere, then this is a battle I don't want to fight:-)

fred-wang commented 4 years ago

> I think I am coming round to the idea of dropping ms. Dropping the element entirely is easy to understand (and document), but if we put in a cut-down ms that doesn't add text then
>
> <ms>wibble</ms>
>
> will or will not add quotes depending on whether it is interpreted by a full mathml system or mathml core.

This is the same for all elements removed from Core. The idea is that someone would write a CSS or JS polyfill to add quotes in the browser. It can even do feature detection to check if the CSS/JS has to be added.

> So while given a choice I'd rather have ms, if it is causing you issues and time that would be better spent elsewhere, then this is a battle I don't want to fight:-)

I think the CSS implementation is simple and more secure/cleaner than a direct C++ implementation, so not a big burden for maintenance. However, it does not work in RTL mode and has all the bugs already mentioned for mfenced, and it's not clear how we would fix them. So personally it does not really make sense to me to keep such a broken implementation. I'd prefer to get rid of ms if the element is not fundamental.

fred-wang commented 4 years ago

Consensus from 2019/11/11: Keep ms in MathML Core but make it behave like mn and add a note to tell people to either put the quotes in the contents or add CSS to style it via before/after.

NSoiffer commented 4 years ago

@fred-wang : seems like all but "needs polyfill" can be removed. The CG seems to have resolved this, and there are commits wrt to the spec and testing. Is there more that needs to be done in those areas?

fred-wang commented 4 years ago

We can close this, but a polyfill is still needed at https://github.com/mathml-refresh/mathml-polyfills/ if this feature is kept in MathML Full too.

https://mathml-refresh.github.io/mathml-core/#string-literal-ms has a CSS suggestion in a non-normative note, but it is known to be limited and cause issues, as discussed above. A JS polyfill that generates mo (similar to the mfenced one) would be more appropriate.

fred-wang commented 4 years ago

"seems like all but "needs polyfill" can be removed."

sorry, I misread this. polyfill should be kept indeed.

NSoiffer commented 4 years ago

Reopened because we still need a polyfill. I'll write one this weekend and then close it.

NSoiffer commented 4 years ago

Adding this from the spec for ms to the issue because no one mentioned this difference (and I didn't remember it):

> The content of <ms> elements should be rendered with visible "escaping" of certain characters in the content, including at least "double quote" itself, and preferably whitespace other than individual blanks. The intent is for the viewer to see that the expression is a string literal, and to see exactly which characters form its content. For example, <ms>double quote is "</ms> might be rendered as "double quote is \""

So the polyfill needs to add escaping also.
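
As a rough illustration of the escaping rule quoted above, here is a minimal sketch of the string handling only (no DOM access). The function names are invented for this example, and the choice of backslash escapes for backslash, tab and newline is an assumption: the spec text only requires that the quote character itself be visibly escaped and leaves the escape syntax open.

```javascript
// Sketch only: escape the content of an ms element for display.
// Assumption: backslash-style escapes, as in the spec's own example.
function escapeMsContent(text, quote = '"') {
  let out = "";
  for (const ch of text) {
    if (ch === "\\" || ch === quote) {
      out += "\\" + ch; // make the delimiter (and the escape char) visible
    } else if (ch === "\t") {
      out += "\\t"; // whitespace other than a plain blank
    } else if (ch === "\n") {
      out += "\\n";
    } else {
      out += ch;
    }
  }
  return out;
}

// Wrap the escaped content in the left and right quote strings.
function renderMs(text, lquote = '"', rquote = '"') {
  return lquote + escapeMsContent(text, rquote) + rquote;
}

console.log(renderMs('double quote is "')); // "double quote is \""
```

Runs of several consecutive blanks are not made visible here; a fuller version would need a policy for those as well.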

NSoiffer commented 4 years ago

I've written a polyfill for this: https://github.com/mathml-refresh/mathml-polyfills/tree/master/ms

davidcarlisle commented 1 year ago

> Adding this from the spec for ms to the issue because no one mentioned this difference (and I didn't remember it):
>
> > The content of <ms> elements should be rendered with visible "escaping" of certain characters in the content, including at least "double quote" itself, and preferably whitespace other than individual blanks. The intent is for the viewer to see that the expression is a string literal, and to see exactly which characters form its content. For example, <ms>double quote is "</ms> might be rendered as "double quote is \""
>
> So the polyfill needs to add escaping also.

https://w3c.github.io/mathml/#presm_ms

I think the Full spec should highlight here that this won't work in core.

If the css suggested in the core spec is available, quotes will be added, but otherwise not, and escaping would require a javascript polyfill.

I fear we have to say that if you are only generating the mathml, and not in control of document-level polyfills, you may prefer to use mtext rather than ms, with explicit quotes in the content

dginev commented 1 year ago

Question:

I experimented with relying on quotes in CSS via ::before/::after, as @fred-wang seems to have suggested above and noticed a small issue. Here's an example snippet:

* try live at: https://jsfiddle.net/rxs6vofn/

```html
123 = 123.0
```

I see that the content inserted ::before and ::after does not add extra width to the mtext in Chrome - the quotes show up above/under the mtext. At the same time, this technique seems to work fine in Firefox. Is this a known issue?

fred-wang commented 1 year ago

@dginev Chrome blockifies the children of elements with math display ( https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/style/computed_style.h;l=1681;drc=c48b366d76685c392ee294e74f46462de08697bd ) so that suggestion is not going to work (I guess it worked in the past before we fully implemented token elements). A workaround is to fall back to normal block layout with ms { display: block }, but that will likely cause more issues (e.g. ink metrics no longer used, specific baseline etc). I recommend instead using a JavaScript polyfill that places the actual quotes in the DOM, and removing that suggestion from the non-normative note.
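
A minimal sketch of that kind of polyfill follows. It is not the code from the mathml-polyfills repository; the function name `addLiteralQuotes` is invented, it does no escaping, and it assumes it runs only in a renderer that follows MathML Core and therefore adds no quotes of its own. The function touches only `getAttribute` and `textContent`, so it can be exercised outside a browser with a stand-in object.

```javascript
// Sketch only: write the quotes into the element's text so that no
// generated (::before/::after) content is involved.
function addLiteralQuotes(ms) {
  // lquote/rquote come from MathML Full; default to a straight double quote.
  const lquote = ms.getAttribute("lquote") ?? '"';
  const rquote = ms.getAttribute("rquote") ?? '"';
  // Leading and trailing whitespace in token content is not significant.
  ms.textContent = lquote + ms.textContent.trim() + rquote;
  return ms;
}

// In a browser, apply it once to every ms element in the page.
if (typeof document !== "undefined") {
  document.querySelectorAll("ms").forEach(addLiteralQuotes);
}
```

Because the quotes end up in the DOM text, they take part in layout, selection and copy and paste like any other character. Running the function twice would add a second pair of quotes, so a real polyfill would need to mark the elements it has already processed.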

dginev commented 1 year ago

@fred-wang thank you for clarifying the display details, I hadn't yet learned this. Good to keep in mind. A javascript polyfill is certainly always an option.

But since we're talking about mtext and HTML, I also have an alternative in mind that works in both browsers using only CSS. It's via the usual trick of adding an extra span to the scaffold. So maybe that is a useful addition to the non-normative ideas for replacing <ms> - or is there a chance it will stop working in the future?

snippet:

```html
123 = 123.0
```

try live at https://jsfiddle.net/eauv6qxm/

davidcarlisle commented 1 year ago

@dginev

> I also have an alternative in mind that works in both browsers using only CSS

I'm not sure it really meets the use case though. ms is usable in non-css environments (pdf tagging, Office, ...). If you are transforming to the restricted version in Core you may as well simply add the quotes.

The version with embedded html and relying on css doesn't seem to gain anything: It can't be the original form used in the full version, and seems to be a more complicated target than needed for a polyfill compared to a polyfill that simply adds the quotes to the content.

dginev commented 1 year ago

@davidcarlisle Certainly. But my own usual setting is one where I have a non-javascript polyfill-like application at hand (say latexml), so a declarative alternative in HTML5+CSS is often more tempting than carrying around additional javascript assets.

As to why I appear zeroed in on the CSS open-quote and close-quote values - it's internationalization. I was hoping I could combine my snippet with a CSS rule for math { quotes: auto; } and then be able to shift the quoting style based on the HTML lang attribute, so that the same formula gets quoted differently when in a document written in French, English or Bulgarian.

I wouldn't have gone out of my way to suggest introducing such capabilities, but I had used them in CSS in the past, so it looked like a nice feature to bring into the <math> realm.

davidcarlisle commented 1 year ago

> As to why I appear zeroed in on the CSS open-quote and close-quote values - it's internationalization. I was hoping I could combine my snippet with a CSS rule for math { quotes: auto; } and then be able to shift the quoting style based on the HTML lang attribute, so that the same formula gets quoted differently when in a document written in French, English or Bulgarian.

the original motivation for ms was exactly to stop such localisation and make it explicit that this was a string literal (as in programming languages) so typesetting and language specific quotes not wanted.

There are of course use cases for having language sensitive quotes within a natural language string in mtext as you show, but I don't think that is related to ms.

dginev commented 1 year ago

> a string literal (as in programming languages) so typesetting and language specific quotes not wanted.

Interesting distinction. Is it really that easy? Some examples that come to mind:

  1. a string literal in SQL is single-quoted, as in:

    SELECT * FROM Table WHERE FirstName = 'Deyan'
  2. The usual suspects (e.g. Javascript) use the double quotes as in

    var firstName = "Deyan";
  3. Some recent languages, such as Rust, make a distinction of raw string literals which have larger quoting delimiters, as in:

    let embellished_name = r#"Deyan "Newcomer" Ginev"#;

So - to me - localization considerations could be relevant to both natural and programming languages. Of course, one may want to avoid the question entirely, but isn't that the conceptual domain of <cs> rather than the presentation-near <ms> ?

davidcarlisle commented 1 year ago

> Some examples that come to mind:

yes sure, all of those can be marked up as ms with suitable lquote and rquote, and all of them require that that quote choice is not subject to the natural language used in the host document.
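
To make that concrete, here is a small sketch that generates ms markup for the three literals mentioned above. The `DELIMITERS` table and the helper names are invented for illustration; only the lquote and rquote attributes themselves come from MathML. The delimiters are chosen by programming language, and nothing in the sketch looks at the natural language of the host document.

```javascript
// Illustrative table: string-literal delimiters per programming language.
const DELIMITERS = {
  sql: { lquote: "'", rquote: "'" },
  javascript: { lquote: '"', rquote: '"' },
  rustRaw: { lquote: 'r#"', rquote: '"#' },
};

// Escape text for use in XML content or a double-quoted attribute value.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build an ms element whose quotes follow the programming language.
function msMarkup(language, value) {
  const d = DELIMITERS[language];
  if (!d) {
    throw new Error("unknown language: " + language);
  }
  return (
    '<ms lquote="' + escapeXml(d.lquote) +
    '" rquote="' + escapeXml(d.rquote) + '">' +
    escapeXml(value) + "</ms>"
  );
}

console.log(msMarkup("sql", "Deyan"));
console.log(msMarkup("javascript", "Deyan"));
console.log(msMarkup("rustRaw", 'Deyan "Newcomer" Ginev'));
```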

No markup perfectly catches every nuance, but this is why ms was added.

dginev commented 1 year ago

> yes sure, all of those can be marked up as ms with suitable lquote and rquote

Right, if that was supported by the renderer. And actually a simple extension to lquote and rquote could have allowed the open-quote/close-quote values and remained compatible with CSS-based localization.

I was trying to find out if the CSS ::before/::after rules with open-quote/close-quote are good declarative vehicles to substitute that capability through mtext (and gain internationalization for pseudocode literals), and was fishing to see if @fred-wang would consider it reliable to use in browsers moving forward, maybe in the formulation with an extra span.