Separate out concepts of voice for layout and voice for analysis

mdgood commented 9 years ago

MusicXML 3.0's treatment of the voice element is confusing. In its MuseData origin it was associated with editorial information and polyphonic musical analysis. In the majority of its implementations as an interchange format, it represents visual rather than analytical polyphony within a part.

Joe Berkovitz (@joeberkovitz) summarized the issue well in the MusicXML forum at http://forums.makemusic.com/viewtopic.php?f=12&t=2392#p6465:

The de facto use of the <voice> tag in MusicXML has always been to define explicit, visual polyphony within a part, rather than some abstract "analytical voice" concept. Likewise, the use of <chord> has been restricted to notes within a specific voice that are rendered as a coherent visual unit with shared alignment, stemming, beaming -- which is also the usual engraver's notion of "chord". Even though Sibelius happens to apply <voice> only to the first element of a chord, as Myke says, a <chord> (as understood by Dolet, Sibelius, Finale, Noteflight and many other programs) has always been a unitary grouping within a voice, not across voices. For the latter, as Peter pointed out, we already have the mechanism of <backup> and <voice>.

Are we now changing this interpretation? If so, this makes it difficult or impossible for MusicXML importers to determine which simultaneous notes in a <chord> should share stems, be horizontally aligned, etc. To clarify this point: if notes in a chord can have different voices, would you also say that they can have different stems? Different beams? If so, how would one determine which stem or beam applied to any given note? Is one supposed to fragment a chord into subsets that share the same <voice> (assuming that the <voice> tag is even used, since it is not required)? That would be big news to vendors, since (as with <voice>) the <stem> and <beam> elements have hitherto been associated with exactly one note in a <chord> grouping, and typically the first.

I dislike using the phrase "de facto" in a dialogue like this, but given the lack of a spec, it's the only recourse. I don't think the industry has the luxury of upending a strong de facto understanding of MusicXML's <voice> and <chord> elements. We already have <voice>/<backup> to express simultaneity across voices, and <chord> to express simultaneity within a voice. Allowing any combination of voices within a <chord> runs counter to this understanding, makes the <chord> tag essentially meaningless, and will create even more confusion in what is already a foggy semantic landscape.

MusicXML would be clarified by separating out the concepts of using voice for visual polyphony and layout vs using voice for analysis of polyphony. This could be done with a separate element for the analytic voice concept, perhaps using a related term such as "line". However, Bob Hamblok (@bhamblok) pointed out later in the same thread that this could lose a meaningful and valuable name from our markup. An alternative could be to expand the voice element with attributes or child elements:

In reply to Michael's last post below about the voice-element... Isn't it confusing to invent new elements with other naming-conventions than our music-theory-history has taught us for years?

Is there a possibility to consider creating new attributes in the voice-element for analytical purposes? (Or maybe even creating new children within the voice-element. Namely: "layout", "analytical"... ?)

My personal opinion is that it is not logical to use an ancient naming-convention for other aims because of practical issues and inconveniences in existing software. I think, in the far future, it will serve us all when MusicXML matches music-theory as much as possible... Layout is just an inheritance of this theory.

When resolving this issue it will indeed be important to keep the design independent of the limitations of much of today's music software, keeping the concepts faithful to what we see in music notation produced both by computer programs and from technology predating the computer.

This is related to issue w3c/musicxml#33 regarding documenting the interaction between the voice element and other elements like the beam element. As indicated by Joe's comments above, it may also interact with issue w3c/musicxml#35 on clarifying the chord element.

mscuthbert commented 9 years ago

For this and other analytical questions, I think that we can learn from MEI which has better support for analytical labelings in general than MusicXML. I'm not advocating adopting their whole model, but some of what they've done is well thought through -- in MusicXML a lot of their attributes should be elements, but some of them are quite useful. See: https://github.com/music-encoding/music-encoding/releases/download/MEI2013_v2.1.1/MEI_Guidelines_2013_v2.1.1.pdf Chapter 7.

dotmonkey commented 8 years ago

It's very common for two or more voices (singers) to share the same staff in SATB scores. My opinion is to use to group notes that share beam, stem, and time information. In the meantime, we can specify playback information for voices independently just by adding one element per voice under .

webern commented 7 years ago

I have encountered a case where widely used applications are relying heavily on the <voice> element, in contradiction to other widely used applications.

In my mind, and in Komp, our idea of a 'voice' is such that, in a given staff, if you wanted to have two voices, i.e. stems up and stems down, these would be in separate 'voices'. In Komp, if we have, say a Piano with two staves, I would write notes into the upper staff as 'voice 1' and I would also enter notes into the bottom staff as 'voice 1'. I emit MusicXML consistent with this worldview by writing first the upper staff, all tagged with 'voice 1', then backup, then write the lower staff, all tagged with 'voice 1' (see attached MusicXML file).
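The emission order described above can be sketched in Python as follows. This is an illustration only, not Komp's actual code: the helper names `add_note` and `grand_staff_measure` are hypothetical, durations are assumed to be integer divisions, and optional children such as `<type>` are omitted.

```python
import xml.etree.ElementTree as ET


def add_note(measure, step, octave, duration, voice, staff):
    """Append a pitched <note>; children follow MusicXML schema order."""
    note = ET.SubElement(measure, "note")
    pitch = ET.SubElement(note, "pitch")
    ET.SubElement(pitch, "step").text = step
    ET.SubElement(pitch, "octave").text = str(octave)
    ET.SubElement(note, "duration").text = str(duration)
    ET.SubElement(note, "voice").text = str(voice)
    ET.SubElement(note, "staff").text = str(staff)
    return note


def grand_staff_measure(upper, lower):
    """Write the upper staff, back up, then write the lower staff.

    `upper` and `lower` are lists of (step, octave, duration) tuples.
    Both staves are tagged voice 1, which is the per-staff numbering
    described in this comment (hypothetical reconstruction).
    """
    measure = ET.Element("measure", number="1")
    for step, octave, duration in upper:
        add_note(measure, step, octave, duration, voice=1, staff=1)
    # Rewind the time cursor by the total length of the upper staff.
    backup = ET.SubElement(measure, "backup")
    ET.SubElement(backup, "duration").text = str(sum(d for _, _, d in upper))
    for step, octave, duration in lower:
        add_note(measure, step, octave, duration, voice=1, staff=2)
    return measure
```

Calling `ET.tostring(grand_staff_measure([("C", 5, 4)], [("C", 3, 4)]), encoding="unicode")` yields a measure whose children are a note, a backup, and a second note, with both notes carrying `<voice>1</voice>`.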

Finale and MuseScore both interpret my MusicXML as I expect, but both Dorico and Sibelius provide undesirable results when importing my MusicXML (see attached images). In fact, it seems Dorico actually requires that 'voice 2' be specified in the lower staff (i.e. if I just omit the voice tags, which should be valid MusicXML, Dorico misbehaves).

I thought I would provide this example of confusion between applications regarding the use of the 'voice' tag.

Thanks. Matt

Attachment: Komp grand staff trial.musicxml.zip

Attached images: the grand staff trial as imported by Dorico, Finale, MuseScore, and Sibelius.

jsawruk commented 7 years ago

@Webern: I have also encountered your issue, and I believe that the voice counter should reset for each staff, rather than for each part. In this case, I agree with your Komp output and the interpretations provided by Finale and MuseScore. However, I'm not sure I understand how this relates to the issue of encoding analytical and layout voices differently.

I think that it would be good to separate the two concepts. I like @mscuthbert's suggestion of using the attributes from MEI, as this would not be a breaking change (though it wouldn't resolve the semantic issue in the Komp examples).

webern commented 7 years ago

Perhaps I didn't find the best thread, but I didn't think it was a new issue.

mdgood commented 7 years ago

Different voices should be unique on a per-part basis, not per-staff. Otherwise you would not be able to represent cross-staff notation correctly. This isn't well-documented in the schema, though it is mentioned in the tutorial at http://www.musicxml.com/tutorial/notation-basics/multi-part-music-2/. Perhaps we should have a separate issue for clarifying this in the schema documentation.

We are currently saving this particular issue for MNX, where sequences represent what is described here as voices for layout.

webern commented 7 years ago

OK, in that case I think my Komp output is essentially "undefined behavior" since it places notes in the same voice, at the same time, but does so without using the <chord> tag.
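One way to detect this situation is to walk a measure while tracking the time cursor and flag any voice in which two non-chord notes start at the same position. The sketch below is a rough illustration under stated assumptions: `same_voice_collisions` is a hypothetical name, durations are assumed to be integers, grace notes are skipped, and notes with no `<voice>` are grouped under the empty string.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# A measure shaped like the output described above: both staves use voice 1.
KOMP_STYLE = """<measure number="1">
  <note><pitch><step>C</step><octave>5</octave></pitch>
    <duration>4</duration><voice>1</voice><staff>1</staff></note>
  <backup><duration>4</duration></backup>
  <note><pitch><step>C</step><octave>3</octave></pitch>
    <duration>4</duration><voice>1</voice><staff>2</staff></note>
</measure>"""


def same_voice_collisions(measure):
    """Return sorted (voice, onset) pairs where 2+ non-chord notes start."""
    cursor = 0
    starts = Counter()
    for child in measure:
        duration = int(child.findtext("duration", default="0"))
        if child.tag == "backup":
            cursor -= duration
        elif child.tag == "forward":
            cursor += duration
        elif child.tag == "note":
            if child.find("grace") is not None:
                continue  # grace notes take no time; ignored in this sketch
            if child.find("chord") is not None:
                continue  # shares the previous note's onset by design
            voice = child.findtext("voice", default="")
            starts[(voice, cursor)] += 1
            cursor += duration
    return sorted(key for key, count in starts.items() if count > 1)


collisions = same_voice_collisions(ET.fromstring(KOMP_STYLE))
```

For `KOMP_STYLE`, `collisions` contains the single pair `("1", 0)`. Changing the lower staff to `<voice>2</voice>` makes the list empty.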

Unfortunately it seems that specifying the concept of 'voice' as Komp understands it is not possible with MusicXML. As a workaround I will, on import, check to see if all the notes in the second staff have an identical voice value, and send them all to my 'voice 1' if they do (regardless of the actual value).
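A minimal sketch of that import workaround, assuming the importer keeps its own per-staff voice labels separate from the MusicXML values. The function name `internal_voices` and the choice to treat "every note omits `<voice>`" as an identical value are assumptions made for illustration, not Komp's actual behavior.

```python
import xml.etree.ElementTree as ET


def internal_voices(part, staff="2"):
    """Map each note on `staff` to an importer-side voice label.

    If every note on the staff carries the same <voice> value (or all
    omit it), all of them are sent to internal voice "1" regardless of
    the value in the file. Otherwise the file's values are kept as-is.
    """
    notes = [
        n for n in part.iter("note")
        # A missing <staff> element means staff 1 in MusicXML.
        if n.findtext("staff", default="1") == staff
    ]
    voices = [n.findtext("voice") for n in notes]
    if notes and len(set(voices)) == 1:
        return ["1"] * len(notes)
    return voices
```

With this mapping, a lower staff written entirely as voice 2 (as Dorico appears to expect) and one written entirely as voice 1 both land in the importer's voice 1, while a lower staff that genuinely mixes voices keeps the file's numbering.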

I will also output voice numbers that comply with the spec.

Thanks.

mdgood commented 4 years ago

I am moving this to the MNX repository for further consideration in MNX-Common. MNX-Common sequences are already doing part of this work. However, I don't think we yet have a complete solution for the issues of voices for analysis and other uses raised here.

Issue w3c/musicxml#294 addresses a related issue of being able to separate out players on a shared part, like the SATB that @dotmonkey mentioned in this discussion.

w3c/mnx#183: Separate out concepts of voice for layout and voice for analysis