w3c / mnx

Music Notation CG next-generation music markup proposal.

Rich text representation in JSON #345

Open dspreadbury opened 3 months ago

dspreadbury commented 3 months ago

Issue #280 considers whether we could use HTML/XHTML to encode rich text in MNX documents, but it dates from times of yore when MNX was an XML-based format. We don't think it would be a great idea to use HTML/XHTML in our JSON-based MNX format, so we instead need to consider how to approach encoding rich text in a more JSON-friendly way.

Typically scores use a single text font family and a single music font family (though of course there are exceptions), so we'll want to define the default font families to be used for the document as a whole in the global presentation data for the document. It shouldn't be necessary to specify the font to be used for every text item, unless it needs to be overridden. Similarly, when specifying the use of a SMuFL symbol, it shouldn't be necessary to specify the music font to be used, unless it needs to be overridden.
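As a rough illustration of the "default unless overridden" idea, here is a minimal Python sketch. The key names (`textFont`, `musicFont`, `font`) and the helper `resolve_font` are hypothetical, chosen only for this example and not part of any MNX proposal; the font names are just sample values.

```python
# Hypothetical document-wide defaults; key names are assumptions for illustration.
GLOBAL_DEFAULTS = {"textFont": "Academico", "musicFont": "Bravura"}


def resolve_font(item: dict, kind: str, defaults: dict = GLOBAL_DEFAULTS) -> str:
    """Return the item's own font override if it has one, else the document default.

    `kind` is either "textFont" or "musicFont" in this sketch.
    """
    return item.get("font", defaults[kind])


# A text item without an override inherits the global text font.
plain_item = {"string": "dolce"}
# A text item with an explicit override keeps its own font.
special_item = {"string": "Tempo I", "font": "Minion Pro"}
# A SMuFL symbol without an override inherits the global music font.
symbol_item = {"glyph": "dynamicForte"}
```

With this shape, only the exceptional items carry a font name, which keeps the common case compact.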

Text in scores is typically either roman, italic, bold, or bold italic. Rarely it may be underlined, even more rarely it may be overlined, and almost never is it struck through. It is more common for text to be enclosed in a box than to be underlined. Our text representation should make the common kinds of text used in scores easy to represent, while providing some flexibility for more unusual use cases.

JSON documents use UTF-8, so it makes sense that runs of text should be encoded in UTF-8. All characters within the Basic Multilingual Plane (U+0000–U+FFFF) can be encoded using UTF-8; characters outside the BMP must be encoded as UTF-8 surrogate pairs (or so says Wikipedia, anyway). New line characters can be encoded using \n.

In order to allow rich text formatting within a run of text, one approach would be to have a text object that contains one or more textChunk objects. Each textChunk defines either a string or a SMuFL glyph, and optionally can define overrides for font family, font style, font size, decoration (enumeration? for underline, overline, strikethrough), and enclosure (enumeration? for border). To change any of the formatting properties for text, a new textChunk object is required, and everything contained within the same text object is intended to be rendered as a single run of text. If the run of text is multi-line, a textChunk can be terminated with a new line (\n).
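To make the shape concrete, here is one possible sketch in Python of a text object built from textChunk objects and serialized to JSON. All key names (`textChunks`, `string`, `glyph`, `fontStyle`, and so on) and the helper `validate_chunk` are assumptions made for illustration; nothing here has been agreed for MNX.

```python
import json

# Hypothetical set of per-chunk formatting overrides, following the list above.
ALLOWED_OVERRIDES = {"fontFamily", "fontStyle", "fontSize", "decoration", "enclosure"}


def validate_chunk(chunk: dict) -> None:
    """Check that a chunk has exactly one of 'string' or 'glyph', plus known overrides."""
    if ("string" in chunk) == ("glyph" in chunk):
        raise ValueError("chunk needs exactly one of 'string' or 'glyph'")
    unknown = set(chunk) - ALLOWED_OVERRIDES - {"string", "glyph"}
    if unknown:
        raise ValueError(f"unknown keys: {sorted(unknown)}")


# One run of text: roman, then italic ending in a new line, then a SMuFL glyph.
text = {
    "textChunks": [
        {"string": "cresc. "},
        {"string": "poco a poco\n", "fontStyle": "italic"},
        {"glyph": "dynamicForte"},
    ]
}

for chunk in text["textChunks"]:
    validate_chunk(chunk)

# ensure_ascii=False keeps non-ASCII characters as literal UTF-8 in the output.
encoded = json.dumps(text, ensure_ascii=False)
```

Each formatting change starts a new chunk, so the verbosity grows with the number of style changes rather than with the length of the text.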

There are still lots of questions to resolve here:

akulisch commented 3 months ago

Regarding UTF-8 encoding: surrogate pairs are a UTF-16 concept, so you mixed up UTF-8 and UTF-16. UTF-8 can encode all of Unicode: U+0000–U+007F (ASCII) as one byte each, and everything from U+0080 upwards as multibyte sequences of two to four bytes.
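A quick way to check the byte lengths is a short Python sketch like the following (the helper name `utf8_length` is made up for this example). It shows that UTF-8 needs no surrogate pairs: code points up to U+007F take one byte, and code points from U+0080 upwards take two to four bytes, including those outside the BMP.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes the single character `ch` occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


# Sample code points from each UTF-8 length class.
samples = {
    "\u0041": 1,      # U+0041 LATIN CAPITAL LETTER A
    "\u007f": 1,      # U+007F, last one-byte code point
    "\u0080": 2,      # U+0080, first two-byte code point
    "\u20ac": 3,      # U+20AC EURO SIGN
    "\U0001d11e": 4,  # U+1D11E MUSICAL SYMBOL G CLEF (outside the BMP)
}

lengths = {ch: utf8_length(ch) for ch in samples}
```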

lemzwerg commented 3 months ago

Surrogate pairs are only needed for escaped characters. Honestly, I think this is a very ugly limitation of JSON since it is next to impossible to deduce visually that the representation \uD83D\uDE10 is actually U+1F610.

If it were possible to extend the JSON for MNX, I would suggest introducing either \u{...} or \U..., allowing for more than four hex digits so that the whole Unicode range can be represented with a single escape instead of surrogate pairs. However, I guess this is a pipe dream since all the JSON parsers out there would choke on that...
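The escape behaviour can be checked with Python's standard `json` module, as in the sketch below. The helper `to_surrogates` is a made-up name for this example; it applies the standard UTF-16 surrogate arithmetic to a code point above U+FFFF.

```python
import json
from typing import Tuple


def to_surrogates(code_point: int) -> Tuple[int, int]:
    """Split a code point above U+FFFF into its UTF-16 high and low surrogates."""
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


# The escaped form from the comment above decodes to the single character U+1F610.
decoded = json.loads('"\\uD83D\\uDE10"')

# By default json.dumps escapes non-ASCII characters, so the surrogate pair reappears.
escaped = json.dumps(decoded)

# With ensure_ascii=False the character is written literally and no escape is needed.
literal = json.dumps(decoded, ensure_ascii=False)
```

So the surrogate pair is only visible when a writer chooses to escape; a document saved as literal UTF-8 never contains it.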

lemzwerg commented 3 months ago

Regarding your questions on font styles (wearing my FreeType maintainer hat):

IMHO we cannot get away with the classical four text style attributes. BTW, I think that your observations on the font naming details on macOS and Windows are dependent on applications and/or UI features that do not implement the current OpenType standard, mostly for backward-compatibility reasons.

Please check the 'name' table documentation and look how 'Name IDs' are constructed. The examples there explicitly mention Minion Pro, BTW.

It might be helpful to examine how the Pango font rendering library (which is widely used in the Unix world) implements both text attributes and font descriptions. There are certainly other libraries that provide similar features.

samuelbradshaw commented 3 weeks ago

There are three patterns I've seen for styling text in code:

  1. Block-level styles, i.e. styling a full object (such as a syllable or a title), which might include breaking a single text block into a list of separate blocks for styling, as described above
  2. Special tags or syntax that surrounds the character(s) being styled
  3. Styling instructions stored separately from the character(s) being styled (example: make characters 5–7 bold, and make characters 6–10 italic)

(1) isn't very flexible if text objects are defined as strings. If all text objects are defined as lists, it becomes more flexible, but is also very verbose. (3) is very flexible (especially when it comes to overlapping styles, which neither (1) nor (2) handles cleanly), but a pain to maintain (if you add a character to the text, you have to update all of the instructions to account for the change).
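To illustrate the trade-off, here is a small Python sketch of pattern (3) and how it relates to pattern (1). The function names and the `(start, end, style)` tuple layout are assumptions for this example; indices are 0-based and half-open, and ranges are assumed to lie within the text.

```python
def ranges_to_chunks(text, ranges):
    """Flatten possibly overlapping style ranges into a list of uniformly styled chunks."""
    # Every range boundary is a point where the set of active styles may change.
    cuts = sorted({0, len(text), *(i for s, e, _ in ranges for i in (s, e))})
    chunks = []
    for a, b in zip(cuts, cuts[1:]):
        styles = sorted(style for s, e, style in ranges if s <= a and b <= e)
        chunks.append({"string": text[a:b], "styles": styles})
    return chunks


def insert_text(text, ranges, index, new):
    """Insert `new` at `index`; every range at or after that point has to be shifted."""
    n = len(new)
    shifted = [
        (s + n if s >= index else s, e + n if e > index else e, style)
        for s, e, style in ranges
    ]
    return text[:index] + new + text[index:], shifted


text = "molto espressivo"
ranges = [(4, 7, "bold"), (5, 10, "italic")]  # overlapping on characters 5 and 6

chunks = ranges_to_chunks(text, ranges)
new_text, new_ranges = insert_text(text, ranges, 0, "sempre ")
```

The overlap is easy to express as ranges but turns into five separate chunks, and a single insertion at the start of the text forces both ranges to be rewritten.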

(2) balances brevity and maintainability with flexibility. I think these are the most common open-source flavors:

As of now, I'm still in favor of (2) for inline styling, and I lean towards something with tags like XML/HTML or BBCode.