w3c / mathonwebpages

Repository for the W3C "Math on the Web" Community Group
https://w3c.github.io/mathonwebpages

[text-based notation] Unicode Technical Note 28: UnicodeMath #22

Open js-choi opened 6 years ago

js-choi commented 6 years ago

The CG’s research webpage on text-based math standards currently lists LaTeX, LibreOffice, AsciiMath, MathSON, Mathematica / MATLAB / Octave / mathjs, Maple, and Microsoft Excel as prior art in standards for mathematics in plain text.

I would like to bring up another work that may be of interest, this one from a contributor to Unicode, the standard for plain text itself: Unicode Technical Note 28: UnicodeMath, A Nearly Plain-Text Encoding of Mathematics, aka UTN 28. It aims to be a uniquely concise “nearly plain-text” linear format by drawing on Unicode’s broad repertoire of characters, instead of restricting its syntax to US-ASCII symbols and English keywords.

As a Unicode Technical Note, UnicodeMath is not part of the Unicode Standard, even though the Unicode Consortium publishes it. Its author, Murray Sargent III of Microsoft, is one of the scientists involved in Unicode’s plain-text encoding of mathematical symbols. UnicodeMath was formerly named “Unicode Nearly Plain Text Encoding of Mathematics”, but Sargent recently renamed it. It is descended from a 1970s math notation language for microcomputers and is implemented today in Microsoft Word.

It may be worth adding UnicodeMath to the CG’s webpage on text-based math standards. The CG may also wish to consider it as an interesting piece of prior art, a window into how its author views the function of math characters encoded in Unicode…especially given how Unicode is a fundamental standard of the web itself.

js-choi commented 6 years ago

It also looks like @pkra's independent tools-for-math-on-the-web list already includes UnicodeMath under the title “Microsoft Office linear format”.

As an aside, Sargent’s article “Nemeth Braille—the first math linear format” is interesting in its own right, especially given that Braille is now a machine-encodable script in Unicode. See also https://github.com/KaTeX/KaTeX/issues/593#issuecomment-269389168.

pkra commented 6 years ago

Thanks for filing the issue. I don't remember why Jos didn't add it back in the day. The task force has been dormant for a while now; should it wake up, it will probably add a note on it one way or another.