bkardell commented 5 years ago

Hello TAG Friends!

I'm requesting a TAG review of:

Name: MathML Core
Specification URL: https://mathml-refresh.github.io/mathml-core/
Explainer (containing user needs and example code)¹: https://github.com/mathml-refresh/mathml-core/blob/master/docs/explainer.md
GitHub issues (if you prefer feedback filed there): https://github.com/mathml-refresh/mathml/issues
Tests:
- core
- Proposed exposed CSS properties (not necessary to expose to authors initially, but we think good to)
  - math-script-level and math-style
  - text-transform values
Primary contacts (and their relationship to the specification):
- @fred-wang (editor/implementer)
- @rwlbuis (implementer)
- @bkardell (advocate)

Further details:

Relevant time constraints or deadlines: None specifically, beyond our goal to fully upstream an implementation by October 2020. We'd like to continue to progress this long-overdue work.
[X] I have read and filled out the Self-Review Questionnaire on Security and Privacy. (the assessment is below).
[X] I have reviewed the TAG's API Design Principles
The group where the work on this specification is: MathML Refresh CG
Links to major pieces of multi-stakeholder review or discussion of this specification:
- Much of this specification is clarifying with modern rigor a subset of what was both specified in MathML 3 and already shipping in two browsers. Thus far, both WebKit and Firefox have participated, accepted patches and aligned with MathML Core. Our issue tracker shows work and engagement from multiple vendors, including lots of advice and questions from Googlers. Our Intent to Implement also includes encouraging responses.
Links to major unresolved issues or opposition with this specification:

You should also know that...

MathML Core comes from previous TAG review of MathML specs/plans. Among our primary aims is to establish an interoperable and widely agreeable starting point for discussions that have been created by MathML's largely unique history. We've taken a great deal of history, commentary (including TAG's) and challenges into account. We've taken great pains to balance many things: making this as small as possible, aligning as directly as possible with the rest of the platform (including DOM and CSS), eliminating unknowns and surprises, applying learnings from actual implementations (Firefox, WebKit, Office and TeX), rigorously specifying while remaining useful and compatible, and improving the many documents that already exist.

We'd prefer the TAG provide feedback as (please select one):

[ ] open issues in our GitHub repo for each point of feedback
[ ] open a single issue in our GitHub repo for the entire review
[x] leave review feedback as a comment in this issue and @-notify @bkardell @fred-wang @rwlbuis

Security Questionnaire...

What information might this feature expose to Web sites or other parties, and for what purposes is that exposure necessary?
- This feature allows some OpenType parameters from the MATH table to be tested via MathML constructions, similar to what is done in the corresponding WPT tests. These can allow a third party to detect whether a known math font is installed. This is no different from any other non-sensitive font data used for text layout, though.
- The MathML href attribute can be a potential risk of third-party detecting visited websites. We expect that the same mitigation as SVG/HTML links will be performed by implementers.
Is this specification exposing the minimum amount of information necessary to power the feature?
- Yes, no information is exposed in new ways.
How does this specification deal with personal information or personally-identifiable information or information derived thereof?
- n/a - this is simply about the rendering of a kind of text in a tree and setting it inline with other DOM/CSS expectations.
How does this specification deal with sensitive information?
- It doesn't.
Does this specification introduce new state for an origin that persists across browsing sessions?
- No
What information from the underlying platform, e.g. configuration data, is exposed by this specification to an origin?
- Because it is based on integration with the rest of the platform, we believe nothing new, beyond what is mentioned in the first question.
Does this specification allow an origin access to sensors on a user’s device?
- No
What data does this specification expose to an origin? Please also document what data is identical to data exposed by other features, in the same or different contexts.
- Nothing
Does this specification enable new script execution/loading mechanisms?
- This specification adds new (for MathML) script execution mechanisms in event handler attributes (onclick, etc) or href="javascript:..." links in order to align with normal platform expectations - these already exist for HTML or SVG. It does not add new script loading mechanisms.
Does this specification allow an origin to access other devices?
- No
Does this specification allow an origin some measure of control over a user agent’s native UI?
- No
What temporary identifiers might this specification create or expose to the web?
- None.
How does this specification distinguish between behavior in first-party and third-party contexts?
- The specification itself does not make any distinction. MathML can be embedded into HTML and in SVG (via the foreignObject element) and it is expected that the same existing restrictions as HTML/SVG will apply for third-party contexts.
How does this specification work in the context of a user agent’s Private Browsing or "incognito" mode?
- No differences.
Does this specification have a "Security Considerations" and "Privacy Considerations" section?
- No, but we will be adding them, largely to reflect what we have listed here.
Does this specification allow downgrading default security characteristics?
- No

bkardell commented 5 years ago

Note: For anyone who feels like they'd like more information that they can consume pretty easily, we have a ~30 min presentation from our recent hackfest where @fred-wang breaks a lot of this down in pretty complete and understandable ways (https://www.youtube.com/watch?time_continue=79&v=Q8Z1D2i61j8)

torgo commented 5 years ago

We will try to schedule some breakout time on this for this week. If not, this will have to get bumped to the f2f.

torgo commented 4 years ago

Discussed at our f2f with @bkardell. The hypothesis is that MathML Core addresses many of the issues raised by the previous TAG review. We need to do further work to confirm this.

hadleybeeman commented 4 years ago

Also, looking at this explainer -- it does a good job of explaining how you got to where you are, and how you've decided on the approach you took. It doesn't though explain concisely how MathML core works. It might be useful to add that in.

alice commented 4 years ago

Some progress on Hadley's comment happening in @bkardell's WIP gist. I've left some comments there.

bkardell commented 4 years ago

PR on the explainer in progress at https://github.com/mathml-refresh/mathml-core/pull/20; it will be discussed in the next CG meeting.

bkardell commented 4 years ago

The MathML-Core Explainer has been merged with suggested changes/based on discussions - would suggest using that rather than any gist.

alice commented 4 years ago

The explainer is in really good shape now, thank you!

A couple of general comments from discussions I've had with other TAG folks:

Overall, there is some tension in the choice of core elements between whether the elements are expressing semantics or layout. e.g. <mn>, <mi> and <mo> seem to be expressing semantics, while <menclose> is purely layout.
- <mrow> is somewhere in between - mostly useful as a "group" directive, but the notation somewhat makes the intended use feel like a layout directive. Is there any chance that this could be generalized to something like <mgroup>?
What is the meaning of "legacy compat" in this context? It doesn't seem to be called out in the spec.
How is this parsed? Do all the elements need to be explicitly closed?

bkardell commented 4 years ago

What is the meaning of "legacy compat" in this context? It doesn't seem to be called out in the spec.

Yeah, I guess this could have been clearer. Basically, they are things which, in an ideal world, would just be solved by existing platform answers. However, a significant amount of real world legacy content uses it, so we think the right thing to do is to not break that - to map to the existing and include it in the spec as deprecated. HTML has several similar things - <font> and <center> for example.

How is this parsed? Do all the elements need to be explicitly closed?

This is all specified in the HTML Parser itself - which is complicated, but I think the special cases you are probably interested in will be listed here: https://html.spec.whatwg.org/multipage/parsing.html#the-stack-of-open-elements

Overall, there is some tension in the choice of core elements between whether the elements are expressing semantics or layout.... [snip] Is there any chance that this could be generalized to something like <mgroup>?

There are probably lots of choices that would be made differently if we were starting from nothing today - I think the challenge here is that the aims are to normalize and well-define what is necessary to keep millions of existing mathml contents in play - <mrow> is pretty central to that. We could potentially propose to also have an <mgroup> but so far we've tried to not add new invention of that sort here for the initial definition of core.

torgo commented 4 years ago

Hi @bkardell. Hadley & I just re-reviewing now in the context of our TAG f2f. Are you planning to issue any changes to the explainer based on what you've written above?

hadleybeeman commented 4 years ago

Hi @bkardell! The explainer has come a long way since we first looked at it. Well done from me too!

It occurs to me that the legacy situation you're dealing with here (and understandably! You've explained that well to us) may leave developers confused about when MathML will follow the web platform (like CSS) and when it won't.

Do you have any plans to produce developer-focused documentation? If so, can you cover that off somewhere, so that it's clear to them how to think about MathML?

bkardell commented 4 years ago

@torgo and @hadleybeeman

It occurs to me that the legacy situation you're dealing with here (and understandably! You've explained that well to us) may leave developers confused about when MathML will follow the web platform (like CSS) and when it won't.

Perhaps somewhere my emphasis has been misleading? MathML-Core aims to always follow the platform. I have added and linked a description/note in the explainer about this to hopefully be more clear.

Do you have any plans to produce developer-focused documentation? If so, can you cover that off somewhere, so that it's clear to them how to think about MathML?

The CG can help review and update documentation, yes. We will, for example, definitely link up the DOM interfaces, and add any deprecation notes and recommend the underlying CSS properties for things like legacy compat elements/attributes.

That said, I think it's worth noting: the current state of things is simultaneously confusing, and neither documented, necessary, nor even desirable by anyone. A lot of what MathML-Core is doing here is just making that better: making the stuff that developers just assume by nature of being in the platform be true. MathML-Core doesn't add new surprise in this regard, it removes it.

Previously, for example, documentation didn't tell developers that they could set the color property in CSS, and that that would work - but that attempting to set the color via the .style property would throw (there are a lot of examples like this, this one is just easy to explain/point to). Developers just expect that to work on any element, really... a lot of documentation won't even mention it. It's just a given. Worse if it doesn't work, and that fact isn't written down or specified, which is the case now.
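To make the `.style` example concrete, here is a minimal sketch of the kind of defensive code authors end up writing when that expectation does not hold. The helper name `setColor` and the attribute fallback are assumptions made for illustration; neither comes from the spec or from any implementation.

```javascript
// Hypothetical helper (not part of MathML Core): set a color on any element,
// whether or not it exposes the CSSOM `style` interface.
function setColor(element, color) {
  if (element.style && typeof element.style.setProperty === "function") {
    // Expected path: the element behaves like any other platform element.
    element.style.setProperty("color", color);
    return "style";
  }
  // Legacy path: `element.style` is undefined here, so writing
  // `element.style.color = color` would throw a TypeError.
  // Fall back to the style attribute (this overwrites any existing value).
  element.setAttribute("style", "color: " + color);
  return "attribute";
}
```

In a page this would be called as `setColor(document.querySelector("mi"), "red")`. Once MathML elements expose `style` like everything else, only the first branch is ever taken and the helper becomes unnecessary.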

Similarly, a bunch of spec work in MathML Core is around actually just defining how MathML actually fits in CSS, so that it can work better (or, at all in some cases). @bfgeek especially has been really good about raising these issues and helping make sure we are resolving these things 'with the platform' so that we aren't adding new complexities or surprises, but removing them. As we complete, anything actually novel to MathML will certainly be added to MDN.

alice commented 4 years ago

Hi Brian,

Thanks for the clarification above.

The improved explainer gives us a lot more context, so we're going to spend a little time looking through the spec now that we have a better idea of what it's doing.

We'll ping this thread again if/when there are any questions from that.

cynthia commented 4 years ago

Non-technical, process question - why is this in a non-w3c controlled GH organization?

alice commented 4 years ago

We had a skim of the spec today in our virtual face to face.

The spec looks extremely thorough and well thought out!

Only a few things stood out to me (I didn't go into depth on each of the elements, or the box model):

There are a number of open issues in the DOM and JavaScript section, which look interesting but seem to be mostly deferred. Are there any of those we should take a look at?
The list of elements differs slightly from the list in the Explainer - <maction>, <mprescripts> and <none> aren't mentioned in the Explainer, and <menclose> doesn't seem to be in the spec.

That said, there doesn't seem to be any reason to keep this review going, unless there are specific things in the spec or associated issues you'd like us to take a look at, so we're going to propose closing - please let us know if you want more feedback from us!

bkardell commented 4 years ago

The spec looks extremely thorough and well thought out!

Thanks! It has had, and continues to have, a lot of work, details, and improvements like code examples added.

The list of elements differs slightly from the list in the Explainer

This is my fault, I let them get out of sync because of open issues on the spec from a while back. I believe we have corrected those with a pull from this week.

There are a number of open issues in the DOM and JavaScript section, which look interesting but seem to be mostly deferred. Are there any of those we should take a look at?

Yes, we are very interested in ultimately aligning with the platform to the very greatest extent possible. As of our CG call today we have begun to label these things as specifically being deferred to next steps, so that we can be clearer and the spec can convey more properly what the immediate goals of initial stable core implementations are. What is important to me, as I said when I met with the TAG, is your opinion on how we have broken the problems out and whether we are applying good principles and practical lessons here. We asked many critical questions here on DOM/JS early on. Some of these (shadow DOM, custom elements) are deferred, but they served as guiding factors while we were doing other things: making sure we had first steps (elements have IDL in the first place, we have a concept of identifiable unknown elements, we've considered a future with Shadow DOM in discussions where this would be important, like linkable elements, and so on). So, I am interested in whatever aspects of this the TAG can provide thoughts on: Is the level of decisions appropriate? (i.e., are there things that are deferred that shouldn't be, or vice versa? Are there unreasonable answers?) Or, on things like wanting to explore custom elements/shadow DOM and normalize the platform -- are they reasonable aspirations?

Most obviously, of course, we are interested in whether there is anything actively concerning. If there is not and you'd like to close the issue, that is ok with us. However, we're very open to and interested in whatever additional thoughts the TAG is able or willing to comment on regarding the above.

bkardell commented 4 years ago

There is one thing that the CG would very much like TAG's input on: we are struggling with specifically what to do with links.

MathML was designed with an assumption that "someday, everything can have an href/be a link" was the future. So, the pre-core MathML specs allow any MathML element to have an href.

These days we understand that links are quite complex with regard to security, IDL, integration with CSS, focus management, role, etc. In most browser implementations of MathML, token elements can be hyperlinks, and the vast majority of hyperlinks in existing MathML are on token elements, or on mrow.

We are struggling between 3 choices:

Simply say MathML doesn't have hyperlinking elements for Core Level 1, and take advantage of the ability of token elements to contain HTML by providing authoring advice to use an <a href> or write JavaScript code to handle the click event.
Choose a safelist of MathML elements to support an href attribute. This would be either all token elements, or all token elements plus mrow (together these are nearly all uses we could find).
- This would involve a longer review process to reach consensus on things like what the default tab index of these elements should be and how it should be specced (would it entail a PR on the HTML spec to add these elements to the list of focusable elements?)
- It would also make it tricky to support Shadow DOM on these elements, since hyperlinks can't have a shadow root.
Add a new <ma> hyperlink element which is mrow-like. The arguments here are more or less the same as for w3c/mathml#1, but they also solve for groups and still allow mrow to support Shadow DOM someday.
- This seems like a good trade-off, but how hard would it be to get implementers and authors to buy in to this change?
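As a rough sketch of what the first choice would mean for authors: the hypothetical helper below builds markup in which an HTML `<a href>` sits inside a token element. The function names and the example URL are invented for illustration. The tag list is the set of elements the HTML parser treats as MathML text integration points, which is what lets HTML content appear inside them.

```javascript
// Escape text for use in markup text and double-quoted attribute values.
function escapeMarkup(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// MathML text integration points in the HTML parser.
const TOKEN_ELEMENTS = new Set(["mi", "mn", "mo", "ms", "mtext"]);

// Hypothetical helper: wrap a token's text in an HTML link (choice 1).
function linkedToken(tag, text, href) {
  if (!TOKEN_ELEMENTS.has(tag)) {
    // Grouping elements such as mrow cannot contain HTML this way,
    // which is the gap that choices 2 and 3 are trying to fill.
    throw new Error("<" + tag + "> is not a token element");
  }
  return "<" + tag + '><a href="' + escapeMarkup(href) + '">' +
    escapeMarkup(text) + "</a></" + tag + ">";
}

// Example with a made-up URL:
// linkedToken("mi", "z", "https://example.org/complex-numbers")
```

The thrown error for `mrow` is the interesting part: linking a whole sub-expression is exactly the case this approach does not cover.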

NSoiffer commented 4 years ago

To be more concrete about why links on leaf elements are common in MathML, take a look at the NIST Digital Library of Mathematical Functions (DLMF) which uses MathML. If you go to almost any page with math, many of the leaves are linked to their definitions (you need to mouse over to see that). E.g., on the page about binomial coefficients, the 'm' and 'n' point to a definition about integers and the 'z' to one about complex numbers. On other pages, function names for 'cos' and 'sin', along with esoteric functions point off to their definitions (and alternative representations). I believe the DLMF site's usage of links is pretty common for encyclopedic sites.

When the leaf element is not the appropriate place to put a link, mrow is often used. mrow is the span of MathML. It would be really nice in the future to be able to attach a Shadow DOM to it. Ideally, I'd like to see attaching a Shadow DOM to mrow when href is not present and not allowing the attachment if it is present. However, I lack the expertise to know if such a conditional option is within the realm of possibility even if attaching a Shadow DOM to most MathML elements is deemed legal in the future.

Note: MathML is used in contexts outside of the web, so the MathML full spec allows a web schema or other schema to be defined. <a> could be added to the full spec, but the name doesn't fit the existing naming scheme for MathML elements and enforces all of the web semantics attached to it on it, which is why ma is the proposed new element name. The full spec would likely want to mention only the use cases relevant to MathML. MathML Core, being tied to the Web, would define ma as being equivalent to a.

alice commented 4 years ago

I'll leave some comments on https://github.com/mathml-refresh/mathml/issues/125 regarding our thoughts on links, just to keep that discussion in one place.

Regarding the other questions - we think you made good decisions about what to defer, and that in particular custom elements/shadow DOM would need some very careful thought about use cases and implications to guide design.

We're going to close this for now - thanks for your patience and please let us know if we can help in future!

alice commented 4 years ago

Just noticed https://github.com/mathml-refresh/mathml/issues/125 hasn't had any updates for a while, so to avoid zombifying that issue I'll just leave my thoughts here.

We talked about this as our breakout today, and generally felt like (1) sounded like the minimal option, assuming <a href> can reasonably be nested within MathML.

We weren't sure exactly how that worked - is it limited to being on "leaf nodes", as @NSoiffer alluded to above?

It seems like, if that is the case, then at least the cases linked to in @NSoiffer's comment would still be implementable using HTML <a> embedded in MathML?

To guide future work on option (2) or (3), it would be good to have a better understanding of how authors have used href in MathML in the past - what are they linking to, and at what level, and why?

bkardell commented 4 years ago

We weren't sure exactly how that worked - is it limited to being on "leaf nodes", as @NSoiffer alluded to above?

That's right, these make up the vast majority of links in practice though (leaf nodes)

it would be good to have a better understanding of how authors have used href in MathML in the past - what are they linking to, and at what level, and why?

There are uses of links on mrow, which is kind of the generic grouping container for MathML. In the same examples @NSoiffer was mentioning - if you wanted to link to an expression rather than individual tokens, you need some other facility (like making mrow linkable or adding a link element for math that is dedicated specifically to linking instead of dual purposing mrow). Personally, while I am not keen on adding new elements, trying only to explain and subset in Core - I think in the long run a single dedicated linking element is the better/more consistent/least risk choice here.

cynthia commented 4 years ago

(Disclaimer: Math illiterate saying random things)

"mrow href" feels quite restrictive, due to the commutative property of equations. (The restrictiveness is based on the fact that it requires related parts of a given formula to be spatially adjacent) Instead of individual leaf nodes or tokens, it does feel like the long term solution might be to implement something like an image map equivalent that lets you hyperlink a given semantically related set of components of a formula rather than requiring authors to manually update n leaf nodes that point to the same link.

That said, we would like to see more complex use cases.

alice commented 4 years ago

Going to reopen this to keep talking about links - we are still happy with the rest of the spec!

alice commented 4 years ago

@torgo and I chatted about this just now in our virtual face to face.

We recall a discussion with @bkardell about link semantics, and the difficulty of weighing up the importance of backwards compatibility with existing code, and future-proofing for allowing attaching shadow roots.

It seems like backwards compatibility does need to take precedence here, i.e. the existing solution of "anything can have an href". This does mean there will be some extra work necessary at some point to explain when it is possible to attach a shadow root (since <a> elements are not allowed to have shadow roots attached), and to explain what happens if someone tries to set an href on an element which already has a shadow root.

We can see two options for the latter:

the href takes precedence, and turns the element into a de facto <a> element, causing the shadow root and any contents in it to be removed; or
whichever is attached first takes precedence, so if a shadow root was attached, the href attribute has no effect.
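To make the difference between the two options concrete, here is a toy model using plain objects and no real DOM. `createModelElement` and the policy names `"href-wins"` and `"first-wins"` are invented for this sketch and do not correspond to any proposed API.

```javascript
// Toy model only: simulates one element that may get an href, a shadow root,
// or both, under one of the two precedence policies discussed above.
function createModelElement(policy) {
  const state = { href: null, shadowRoot: null };
  return {
    state,
    attachShadow() {
      // Under both policies, an element that is already a link behaves
      // like <a>, which cannot have a shadow root attached.
      if (state.href !== null) return false;
      state.shadowRoot = { children: [] };
      return true;
    },
    setHref(url) {
      if (state.shadowRoot !== null) {
        // Second option: the shadow root came first, so href has no effect.
        if (policy === "first-wins") return false;
        // First option: href wins and the shadow root is removed.
        state.shadowRoot = null;
      }
      state.href = url;
      return true;
    },
  };
}

const el = createModelElement("href-wins");
el.attachShadow();
el.setHref("https://example.org/"); // shadow root is discarded, link is active
```

The two policies only differ when the shadow root is attached before the href is set; in the other order both end up refusing the shadow root.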

All that being said, we have seen that the MathML group has a much better knowledge of the domain than we do, so we would be happy to be guided by whatever decision they made on this topic.

Edit to add: We could also imagine the "href on anything" being a legacy-only feature, with authors advised to use a new <mrow>-like link element going forward.

w3ctag / design-reviews

MathML Core #438
