w3c / mnx

Music Notation CG next-generation music markup proposal.

Clear definition of CWMNX rendering model needed #13

Open joeberkovitz opened 7 years ago

joeberkovitz commented 7 years ago

We need to develop a hard-edged definition of those aspects of graphical CWMN rendering which can be captured in a spec, without tying the hands of implementations to compete in terms of quality.

Some of these aspects may include:

Obviously there are lots of these and so this is a broad area of effort.

mdgood commented 7 years ago

One example of the registration point issue was documented in MusicXML issue 5 regarding rest positioning.

joeberkovitz commented 6 years ago

The discussion in Anaheim in January 2018 suggested that we might want to go as far as actually specifying a prescribed horizontal layout algorithm for CWMNX, one which would be able to handle justification and stretching/compression. In such a scenario, detailed X positioning would be expressed in terms of relative deviations from the prescribed layout. This would be less brittle than the current MusicXML measure-x positions, which most notation engines cannot make use of since they only apply to one specific instance of the score's layout.
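
A minimal sketch of the "relative deviation" idea, assuming a purely proportional stand-in for the prescribed layout algorithm (no such algorithm has been specified in this thread). The names `Event`, `dx`, `prescribed_x` and `resolve_x` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Event:
    onset: float     # onset in quarter notes from the start of the measure
    dx: float = 0.0  # hypothetical stored deviation from the prescribed x


def prescribed_x(onset, measure_duration, measure_width):
    """Stand-in for a prescribed layout: purely proportional spacing (assumption)."""
    return measure_width * onset / measure_duration


def resolve_x(event, measure_duration, measure_width):
    """Final x = prescribed position in *this* layout + stored relative deviation."""
    return prescribed_x(event.onset, measure_duration, measure_width) + event.dx


events = [Event(0.0), Event(1.0, dx=-2.0), Event(2.0), Event(3.0, dx=1.5)]

# The same stored deviations remain meaningful when the measure is re-laid out
# at a different width, unlike absolute x positions tied to one layout.
narrow = [resolve_x(e, 4.0, 200.0) for e in events]
wide = [resolve_x(e, 4.0, 400.0) for e in events]
```

Because only the deviations are stored, the nudges survive a change in measure width; absolute positions recorded for the 200-unit layout would be meaningless in the 400-unit one.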

joeberkovitz commented 6 years ago

Here's a link to slides for my forthcoming April 11 Frankfurt presentation on this topic:

https://docs.google.com/presentation/d/1UENOaOSboNjdImDhmxbS1TrcM5gnJoqSOXb7w1lcVCY/edit?usp=sharing

From my email on the subject:

I hope that no one will take this as a prescription to approach the subject in one particular way. However, I believe that beginning with a common vocabulary and perspective can help us examine the larger issues in play, even if we are not agreed on the details.

In other words, this is just a starting point for our meeting discussion, and we're not trying to agree on all these details yet -- rather, we're trying to explore just how far the CG should go.

mscuthbert commented 6 years ago

Hi! Looking great -- I think that in going forward, if approaches 3-5 are taken, we should have the examples use style-classes instead of (or in some cases in addition to) individual style elements. I think that in general we should be encouraging consumers and producers to assign recurring layout principles to styles and perhaps even standardize class names for certain likely-to-recur-across-clients layout decisions (such as default note spacing for each note type; and other elements that currently reside in the <defaults> tag, such as line width, etc.).

mdgood commented 6 years ago

I've added John Gourlay's 1987 paper on "Spacing a Line of Music" to the MNX repository in a new "references" folder, with the permission of the author. You can find it at:

https://github.com/w3c/mnx/blob/master/references/Gourlay%20-%20Spacing%20a%20Line%20of%20Music.pdf

Gourlay's analysis informed the Frankfurt presentation and is the source of some of the presentation's terminology. The co-chairs have found this paper to be helpful reference material, but it does not represent our view of the layout world, either individually or as co-chairs.

jsutdolph commented 6 years ago

Good work @joeberkovitz ! This does look like the right direction.

Some comments on the individual slides.

  9. I am surprised to see the vertical purple line aligned on the stem rather than centred on the right notehead which is the musical 'beat position' of this chord - ie where lyrics and notations are placed.

10, 11, 28 etc. I don't see you referring to the blocking width having a system-wide effect so as to create a more even spacing (viz. Behind Bars, Gould, 'Spacing symbols' p.41 onward)

  18. It seems back to front for the score to tell the renderer the blocking width. The renderer has to calculate this for itself because...

  19. I think the blocking width is determined by the rendering algorithm. The score cannot record this information because it depends on the sizes and packing of the symbols in the renderer. I think 24 perhaps answers this.

  20. I don't think it is useful to have the concept of an 'ideal' width. The width assigned to a note depends on the values of all the other notes in the system. eg whole notes will be placed much closer in a system that contains only whole notes, compared to a system that contains eighth notes

  21. Are you saying that the exact layout algorithm is dictated by MNX? Do you envisage just one or multiple algorithms? It would be extremely complicated to exactly specify a layout algorithm.

cecilios commented 6 years ago

@joeberkovitz Great work! Just a question. I understand that what you call 'sims' or 'simultaneities' are just beatlines. Is that right? In that case, would it be better to name them just beatlines for clarity?

joeberkovitz commented 6 years ago

Thanks for the great comments @jsutdolph.

I've generally seen the left side of a note on the "regular" side of a stem as aligned with the beatline, not the notehead center -- for example, one sees simultaneous cue notes and regular notes aligned at this point (examples can be seen in the Behind Bars section on cues). Lyrics and articulations are indeed horizontally centered around the vocal-part notehead, but I see that as a secondary alignment working off the notehead, rather than the beatline position (for instance, what if another part contains a wider or narrower notehead?). No dogma though... what's the CG's wider view on this point? One could certainly formulate things differently.

I think that for 10/11/28 etc. you are speaking about adjusting the dimensions of boxes based on the system's overall density, so that one doesn't have dense boxes in a sparse layout or vice versa. In this model, intra-box distances also become "ideal" in a sense. Definitely a good engraving practice, just a level of detail beyond the rough outline I offered.

18/19: you are right that slide 24 answers this question. I too hope we do not wind up with the score telling the renderer the blocking width of every event, but this may still be a valuable override on the renderer's behavior.

20: I think the point of the ideal width is that it is only "ideal". What you are bringing up is the fact that the casting-off process may place more or fewer measures into a system, based on the symbol density in those measures, which then squishes or stretches all the ideal widths. I think the extent to which we standardize this kind of thing (in this case, attempting to achieve uniform symbol density) is a choice the CG needs to make.

  21. I'm not saying anything definite about how far MNX goes in specifying algorithms -- how much specificity we "dial in" is very much the conversation that the chairs hope we'll get into next. And it is not an all or nothing thing: even if we standardize layout without all of the engraving "smarts" you brought up, the part of layout that we choose to standardize may deliver increased portability and quality.

joeberkovitz commented 6 years ago

@cecilios I would put it this way: sims are not beatlines, they are spatial intervals between beatlines. A sim has a width in the layout, but a beatline does not.
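
The distinction can be sketched as a small data model, assuming beatlines are bare x positions and a sim is the interval between two adjacent beatlines. The class and function names are hypothetical, not taken from the slides or from any MNX draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Beatline:
    x: float  # a position only; a beatline has no width


@dataclass(frozen=True)
class Sim:
    start: Beatline
    end: Beatline

    @property
    def width(self) -> float:
        # A sim occupies the space between two beatlines.
        return self.end.x - self.start.x


def sims_from_beatlines(xs):
    """N beatlines delimit N - 1 sims."""
    lines = [Beatline(x) for x in sorted(xs)]
    return [Sim(a, b) for a, b in zip(lines, lines[1:])]


sims = sims_from_beatlines([0.0, 30.0, 75.0, 100.0])
widths = [s.width for s in sims]
```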

cecilios commented 6 years ago

@joeberkovitz Thank you. I got it.

notator commented 6 years ago

Thanks @joeberkovitz for starting an open debate on this topic. It's obviously a difficult one, but maybe we can get to some kind of standardizable consensus. We may even get agreement in areas that are not even on the table yet.

I actually agree with @jsutdolph on all the points he makes above. :-) Additionally:

Slide 7: as @jsutdolph pointed out, justification is best done system-wise, not just measure-wise, so I think the fixed point might be better at the left edge of the system.

Slide 9: (I would also prefer the vertical line to be at the x-centre of the rightmost notehead.) I was surprised at the simplicity of the box model (also in John Gourlay's 1987 paper). In the model I use, the boundary of a sim is a convex envelope that follows the union of the bounding boxes of all its component characters and lines, surrounded by a hairline. A sim's components include all the components of all the chords and rests that are synchronous on a system, even if the sim contains a chord that has had to be shifted sideways because there would otherwise be a collision with another chord on the same staff. So there are no "layers" as in slide 11. The sims are constructed first, then shifted sideways, as a block, to their final position. This means that they can often be closer together than they would be when using the simple box model -- which leads to improved legibility, especially when the music gets complicated...

(My software also does automatic vertical justification using a similar algorithm. Staves and systems are automatically positioned vertically using their upper and lower borders -- which are convex envelopes, much like the sim envelopes, but also including things like ties, stafflines etc.)

Something to think about: Are you saying that each of the Levels 1-4 could be a standard, and that both producing and consuming software would decide on which standard(s) they were going to support? Maybe we should save this discussion for Frankfurt! :-)

joeberkovitz commented 6 years ago

@notator I think we have the same idea as regards the way boxes work -- the "lanes" in slide 11 are just an explanatory device to make the examples clear. But there is no need to sort boxes into lanes or layers, one can just use a union of many different boxes and fit these together as you say. @dspreadbury calls these envelopes "skylines" which I think is a very evocative term!
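
A rough sketch of why envelopes ("skylines") can pack sims more tightly than single bounding boxes. It assumes a simplification not described by anyone in this thread: each sim's outline is sampled in a few horizontal bands, where `a_right[i]` is how far the left-hand sim extends to the right of its origin in band `i`, and `b_left[i]` is how far the right-hand sim extends to the left of its origin in the same band. All names and numbers are hypothetical.

```python
def box_separation(a_right, b_left, padding=0.0):
    """Simple box model: only the widest extents matter, wherever they occur."""
    return max(a_right) + max(b_left) + padding


def skyline_separation(a_right, b_left, padding=0.0):
    """Envelope model: extents only collide if they share a vertical band."""
    return max(r + l for r, l in zip(a_right, b_left)) + padding


# Left sim is wide in the top band; right sim is wide in the bottom band.
a_right = [12.0, 4.0, 4.0]
b_left = [2.0, 2.0, 10.0]

box = box_separation(a_right, b_left)
skyline = skyline_separation(a_right, b_left)
```

With these numbers the box model needs 22 units between the two origins while the banded envelope needs 14, because the wide parts of the two sims sit at different heights and can interlock.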

jsutdolph commented 6 years ago

@joeberkovitz Interesting points. Sorry, mea culpa! For 20 read 25 in my post. It is the idea of an ideal absolute width which I don't agree with. Everything needs to be scaled from the minimum note value in the system. If, say, the system contains one measure with 8 whole notes then they would probably be spaced at 10 or 20 tenths, not 50 tenths as in your table.

@notator Thank you for the interesting layout ideas!
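
One way to picture "scaled from the minimum note value in the system" is a width that grows by a fixed ratio for each doubling of duration relative to the shortest note present. This is only an illustrative assumption; the formula, the function name `relative_width` and the default `ratio` of 1.5 are not taken from the slides or from anyone's implementation.

```python
import math


def relative_width(duration, shortest, base=1.0, ratio=1.5):
    """Width relative to the shortest note in the system (assumed formula).

    Each doubling of duration over `shortest` multiplies the width by `ratio`.
    """
    return base * ratio ** math.log2(duration / shortest)


# Durations are in quarter notes: a whole note is 4.0, an eighth note is 0.5.
whole_among_wholes = relative_width(4.0, shortest=4.0)
whole_among_eighths = relative_width(4.0, shortest=0.5)
```

Under these assumptions a whole note gets the base width in a system containing only whole notes, but a much larger width in a system that also contains eighth notes, so no single absolute "ideal" width applies to it.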

notator commented 6 years ago

Apropos Slide 9: A good reason for having the vertical line centred on the right-most notehead is that quarter-note noteheads are centred over synchronous whole-note noteheads. They are not left-aligned. So we need to use the x-centre of all notehead glyphs.

My justification algorithm first spaces the sims horizontally according to their temporal locations (space=time), then uses a recursive function to remove overlaps by taking space from the larger duration classes. There is no concept of an absolute width, or scaling from the minimum note value.
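
The following is one possible reading of that description, not the actual algorithm: widths start out proportional to duration (space=time), and any sim that ends up narrower than its own minimum width is pinned at that minimum, with the remaining space re-divided among the still-flexible (longer) sims, recursively. The function name `justify` and the `min_widths` input are assumptions made for the sketch.

```python
def justify(durations, min_widths, total_width, fixed=frozenset()):
    """Recursively resolve widths; `fixed` holds indices pinned at their minimum."""
    n = len(durations)
    free = [i for i in range(n) if i not in fixed]
    remaining = total_width - sum(min_widths[i] for i in fixed)
    if remaining < sum(min_widths[i] for i in free):
        raise ValueError("system is too narrow for its contents")
    free_time = sum(durations[i] for i in free)
    # Space = time for the free sims; pinned sims keep their minimum width.
    widths = [
        min_widths[i] if i in fixed else remaining * durations[i] / free_time
        for i in range(n)
    ]
    squeezed = {i for i in free if widths[i] < min_widths[i]}
    if not squeezed:
        return widths
    # Pin the squeezed sims and take the space from the longer durations.
    return justify(durations, min_widths, total_width, fixed | squeezed)


# Two eighths, a quarter and a half (durations in eighths), 120 units wide.
widths = justify([1, 1, 2, 4], [20.0, 20.0, 20.0, 20.0], 120.0)
```

In this example the two eighth-note sims would get only 15 units each under pure space=time, so they are pinned at 20 and the quarter and half share the remaining 80 units in a 1:2 ratio.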

dspreadbury commented 6 years ago

@notator, you say:

Apropos Slide 9: A good reason for having the vertical line centred on the right-most notehead is that quarter-note noteheads are centred over synchronous whole-note noteheads.

I can't agree on this point: the left-hand side of the front notehead (i.e. the notehead on the "correct" side of the stem, in the case of a stem that has noteheads a second apart, in which case there will also be at least one notehead on the "incorrect" side of the stem) should be used, as that is the point from which the rhythmic space occupied by the duration of the stem has to be measured. Using the horizontal centre of the notehead will require compensating for that half notehead's width in many other places.

notator commented 6 years ago

@dspreadbury Fortunately we don't have to agree on this. :-) We are just exploring the different approaches that applications take.

My own feeling is that it's more intuitive for event-symbols to have an x-alignment coordinate, and (in CWMN) to define that coordinate as the x-centre of the outermost notehead on the (maybe hypothetical) stem. (The outermost notehead is always on the correct side of the stem.)

I have no problem using a sim's x-alignment position in my justification algorithm to determine the sim's absolute position on the page, so there's no question of having to compensate for half-notehead widths anywhere. This is probably just a case of swings and roundabouts. Applications just have to define the origins they are going to use, and then take the consequences.

Quite apart from its use in aligning a simple quarter-note with a simple whole-note, an event-symbol's x-alignment coordinate -- which is also its containing sim's x-alignment coordinate -- is a cursor's x-coordinate when the event-symbol (sim) starts playing. So it's a useful value to have cached. A cursor is a list of [sim, instant] pairs (i.e. space-time correspondences).
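
A small sketch of such a cursor, assuming each sim is represented only by its x-alignment coordinate and each instant is a time in milliseconds. The linear interpolation between sims is an added assumption (the description above only fixes the cursor's x at the moment a sim starts playing), and `cursor_x` is a hypothetical name.

```python
from bisect import bisect_right


def cursor_x(cursor, t):
    """Return the cursor's x at time t.

    `cursor` is a list of (x, instant) pairs with strictly increasing instants.
    Between two sims the position is linearly interpolated (assumption).
    """
    instants = [instant for _, instant in cursor]
    i = bisect_right(instants, t) - 1
    if i < 0:
        return cursor[0][0]           # before the first sim
    if i >= len(cursor) - 1:
        return cursor[-1][0]          # at or after the last sim
    (x0, t0), (x1, t1) = cursor[i], cursor[i + 1]
    return x0 + (x1 - x0) * (t - t0) / (t1 - t0)


# Three sims at x = 10, 40, 100 starting at 0 ms, 500 ms and 1000 ms.
cursor = [(10.0, 0), (40.0, 500), (100.0, 1000)]
```

For example, `cursor_x(cursor, 500)` lands exactly on the second sim's alignment coordinate, and `cursor_x(cursor, 250)` falls halfway between the first two.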

joeberkovitz commented 6 years ago

@notator Noteheads with different widths are typically left-aligned, not center-aligned. There are many examples throughout classical music engraving, but to pick one take the first measure below (from Fuga IV of JS Bach's WTC, G. Henle Verlag):

[image: two measures from Fuga IV of JS Bach's WTC, G. Henle Verlag]

I'm not saying it can never happen, of course. But by no means is it the norm.

(For some fascinating real-world engraving, check out the second measure above... The top voice was shifted to the left and the other 4 voices to the right, to avoid a stem collision. So three things happened at a minimum: The first sim was moved, a note within the sim was displaced to its left, and the standard spacing from the initial barline was modified to accommodate the leftward shift.)

notator commented 6 years ago

@joeberkovitz Let's agree that the way quarter-notes and whole-notes align is an engraving style. Personally, I prefer them centre-aligned (especially in complex new music). That means that in the sim at the beginning of measure 2 above, I would (centrally) align all the event symbols except the quarter-note in the middle voice on the top staff, which I would shift to the right (as above re the top whole-note). It looks especially odd to me that the two synchronous whole-notes (the principal, outer voices) are not aligned when they could be, so I don't like the solution in the above example. The only change to the normal alignments inside this sim is that the top voice has been moved to the left. I may be wrong, but as far as I remember, classic engraving rules agree with me that it should have been the middle voice on the top staff that should deviate from its normal position.

joeberkovitz commented 6 years ago

For sure, departures from the norm should be achievable within any model we adopt. I think there does need to be a single geometric origin for the model, which ideally would be based on common practice. It would be useful to get a sense of the frequency of this center-alignment; I haven't been able to find an example of it even in modern repertoire. Perhaps I'm just looking in all the wrong places!

I agree with you that the treatment of the second bar is quite odd, but the engraver was certainly following some logic that we'd want to find a way to honor. Your center alignment of notes has its own musical logic, too. Capturing these kinds of human decisions is, for me, the core of what this issue is about.

notator commented 6 years ago

I'm not so sure about this engraver's logic. Editors and engravers are human, and can simply make mistakes. So:

  1. We shouldn't try to accommodate every exception to the rules in particular examples.
  2. Any standard engraving rules we adopt should be taken from standard texts on the subject.
  3. We should expect applications that deal with any but the simplest notation to provide the means for overriding the standards where necessary.

clnoel commented 6 years ago

I found a forum discussion about the same notehead-alignment topic for what it's worth, and they have a lot more examples and seem to go round-and-round about it:

http://notat.io/viewtopic.php?f=2&t=170

Personally, I think sometimes it looks better to have a whole note centered on a stemmed note, and sometimes left-aligned looks better, especially on the first beat of a measure. And, of course, there are some fonts where they are the same size, so it doesn't make a difference! Also, whichever way you do the whole note alignment, the double-whole note should align its notehead with a whole note and have the bars stick out to the left and right from there.

Here's my suggestion to wrap this up:

  1. We specify an attribute somewhere (score, style, global, print-info?) for "left-aligned" or "center-aligned" and make it required. Then stick to it for the entirety of the piece!
  2. Good consumers will be able to handle both cases!
  3. Consumers who wish to left-align a center-aligned piece will offset all their sim starts to the left by half their stemmed-notehead widths, and use that as the alignment location, which will nudge all the whole notes to the right.
  4. Consumers who wish to center-align a left-aligned piece will, similarly, offset their sim starts to the right by half their stemmed-notehead widths, and use that alignment location, which will nudge all the whole notes to the left.
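
The offsets in points 3 and 4 can be checked with a little arithmetic. This is a sketch under assumed numbers (a stemmed notehead 12 units wide, a whole note 16 units wide); the function names and the `"left"` / `"center"` strings are hypothetical and do not correspond to any defined MNX attribute.

```python
def convert_origin(x, stemmed_width, source, target):
    """Move a sim's alignment x between left- and center-aligned conventions."""
    half = stemmed_width / 2
    if source == target:
        return x
    if source == "center" and target == "left":
        return x - half
    if source == "left" and target == "center":
        return x + half
    raise ValueError(f"unknown alignment: {source!r} -> {target!r}")


def left_edge(origin_x, notehead_width, alignment):
    """Left edge of a notehead drawn at origin_x under the given convention."""
    if alignment == "left":
        return origin_x
    return origin_x - notehead_width / 2


STEMMED, WHOLE = 12.0, 16.0  # assumed notehead widths

# Center-aligned piece consumed as left-aligned (point 3).
x_left = convert_origin(100.0, STEMMED, "center", "left")
whole_before_3 = left_edge(100.0, WHOLE, "center")
whole_after_3 = left_edge(x_left, WHOLE, "left")

# Left-aligned piece consumed as center-aligned (point 4).
x_center = convert_origin(100.0, STEMMED, "left", "center")
whole_before_4 = left_edge(100.0, WHOLE, "left")
whole_after_4 = left_edge(x_center, WHOLE, "center")
```

In both conversions the stemmed notehead stays where it was; with these widths the whole note's left edge moves from 92 to 94 in the first case and from 100 to 98 in the second.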

Alternatively, we could shove this off as a font choice instead. In center-alignments, the glyph origin is in the middle of the notehead, and in left-alignments the glyph origin is on the left. That does make some difference in spacing from barlines to notes, but should make little note-to-note differences.

notator commented 6 years ago

@clnoel said

Alternatively, we could shove this off as a font choice instead.

That's a very interesting suggestion. If that could be made to work, it would be the easiest way to go. SMuFL glyphsWithAnchors has a "noteheadOrigin" keyName that could very well do the trick.

However: I don't know enough about SMuFL to know how to provide the necessary metadata for my font, and I don't know how to make my application SMuFL-compliant (It's currently hardwired to my font's metrics).

@dspreadbury Is this going to work?

bhamblok commented 6 years ago

Since the meeting yesterday, I've been thinking about this subject in relation to the W3C CSS Grid specification. More specifically to the "minmax() explicit track sizing properties" which are coming close to visualising content in a similar way related to this issue. The "Blocking width", and "Ideal width" can be seen as the "Minimum width" (in absolute units) and a "Maximum width" (in relative units, like the css grid flexible length (fraction, fr) unit). This CSS Grid spec could be a great inspiration point to resolve our issue over here ;-)

On a side note, and despite the great and valuable paper of John Gourlay, I also prefer the vocabulary "Fractions" instead of "Sims" and "Minimum width" instead of "Blocking width"... that's just me...

Sorry I kept my mouth shut yesterday, but I definitely think there is a great need for this "issue" to actually become a standard into this specification.

In comparison with the CSS Grid spec, which conducts browser-vendors to build their (proprietary) algorithms in such a way that a CSS-grid implementation for any website looks exactly the same in any browser, this specification should have a similar goal to allow us to finally port (mnx-) content from one application to another, having "exactly" the same visualisation.

However, IMHO this issue is about "styling", and it should be used "optionally" on top of the semantic content of MNX-Common. Just as css is "optional" for the content of a website... (If you don't add css to a website, it looks horrible, but at least it shows up)

I have a similar opinion about "offsetting" objects... Let us define this, just like css gives us the possibilities to build a website, in a static or responsive way, in absolute or relative units. I think it's up to the engraver to decide to "build/engrave" a score in a static way, or in a reflowable way. If the engraver omits the reflowing styles, the score can have at least a graceful fallback (a static representation).

This of course can only be done after an engraver has been given the right toolsets ... AFTER we have defined the specs. We are working on the future here, not the past ;-)

clnoel commented 6 years ago

@bhamblok, you are right that sims, ideal-width, blocking-width, etc. are all non-semantic information. Also, they can vary widely based on the font, font size, system width, etc. I'm wondering if we should be moving all such layout instructions into a separate element (probably the <score-layout> element being evolved in #57), rather than trying to list them in the semantic events. This might get unwieldy, since we would have to then tie these layout instructions back to the events/notes with ids. On the plus side, it would allow the different layouts required for different widths (or page-sizes) to all be described in the same document, and would make the layouts easy to ignore if you are not going to try to support someone else's layout strategy.

joeberkovitz commented 6 years ago

The co-chairs discussed this issue today.

We feel that in the long run it's important to do everything we can to better standardize the presentation of CWMN. At the same time it's obviously a thorny problem and focusing on it could delay work on the semantic underpinnings. We also need time to arrive at definitions of presentation that are both flexible enough to accommodate differing layout approaches, yet powerful enough to satisfy creators and publishers.

So our suggestion is that we make slow, steady progress on understanding layout standardization better, while leaving it for the moment at a lower priority than filling out the semantic core of MNX-Common.

We also observed that while the big picture of layout is very complex, there are plenty of areas to focus on that can supply real value that are quite practical to nail down. For example, we'll need more clearly defined ways to apply spatial offsets to objects, and these offsets will need to relate to well-defined origins like beatlines and/or notehead centers. This is kind of the least we can get away with, even if it falls well short of some of the goals we set out for standardizing presentation. @dspreadbury has offered to take a crack at a proposal for this.

greenSnot commented 5 years ago

Would MNX address the semantic consistency of rendering that makes OMR possible? OMR needs unambiguous information in order to scan a music sheet, recognize it, and convert it back to MNX. There are many cases that could introduce ambiguity, like:

Though this would have to result in breaking changes to WMN, I am looking forward to it.