rism-digital / verovio

🎵 Music notation engraving library for MEI with MusicXML and Humdrum support and various toolkits (JavaScript, Python)
https://www.verovio.org
GNU Lesser General Public License v3.0

Add timestamp attribute for each note in svg #379

Closed k-ljung closed 7 years ago

k-ljung commented 7 years ago

Is it a bad idea to generate a timestamp for each note in the SVG file?

I want to generate the SVG file in a server application and use it on a client, and be able to select the active note(s) at a specific time, just as getElementsAtTime (milliseconds) works, without having to load the MEI file.

craigsapp commented 7 years ago

It depends on what you mean. If you mean a physical timestamp such as seconds or milliseconds, I would say that is bad: the score can represent multiple performances, each potentially with its own timings. Likewise, if the tempo of the MIDI conversion from MEI changes, you would potentially have to change all of the timings in the SVG (but this is not much of a problem, as you could regenerate the SVG images at the same time as the new MIDI file). But if you are dynamically generating new timestamps, then that means that you also have access to the MEI data, which defeats the purpose of your question.

Instead, a score-based timestamp system would definitely be of general use embedded in SVG output from verovio. This could be used to look up the physical time at which that location in the score should be played. Note that a single point in the score can have multiple real-time positions in seconds due to repeats (this is another reason not to use physical timestamps, since a note would require multiple labels, one for each repeat in the score). Unlike physical timestamps in seconds, score-based timestamps would be immutable: they specify a generalized time for the note, not the actual time in seconds. The timemaps could be embedded in the original MEI data, extracted from the MEI data, or stored separately from the MEI data (as I do it).

I use quarter-note timestamps ("qstamps"), which are durations from the start of the musical score to the current note in terms of quarter notes. Here is a demo page: http://www.humdrum.org/vhv-demos/recordings There is a menu in the top right where you can select a particular recording. Then press the space bar or the play button at the bottom left to start playing. You can also click on any note to start playing the recording at that note. Note that you must stop a recording before switching to a new one, otherwise both will play, and I do not enable automatic scrolling when the music goes off the page (this is only a demo, so no bother with refinements). Also note that there is no server involved in this case (the recordings are just random URLs from the internet).

For this demo, I insert qstamps into the SVG @class, which you can see if you inspect the SVG in a browser developer tools window:

[screenshot: qstamp classes on a note's <g> element, as seen in the browser developer tools]

Verovio inserts a <g> element for each note that wraps the graphics for drawing that note. From this element I extract the xml:id for the note via JavaScript, then refer back to the original score for the qstamps of the start and stop of the note, which get inserted into the @class attribute of the <g> element. I do this dynamically, but it could also be done statically to prepare an SVG score.

Specifically, I extracted the id id="note-L40F3" from the SVG note, then went back to the original score and identified when the note with the same xml:id should start and stop in terms of quarter notes since the beginning of the piece, then I inserted the class tokens class="noteon-3d5 noteoff-4", which mean the note should be turned on at quarter note 3.5 and turned off at quarter note 4. Note that I encode the decimal point as "d", since class names are not allowed to contain "." characters. Also, class names cannot start with digits.

I do all of this outside of verovio (in JavaScript), but it would be handy if verovio added these qstamps to the SVG as verovio creates it.
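
For illustration, here is how a client script could recover those qstamps from the class tokens (a minimal sketch; the function name is mine, not part of the demo):

    // Parse class tokens such as "noteon-3d5 noteoff-4" back into numeric
    // qstamps; "d" stands in for the decimal point, since class names
    // cannot contain ".".
    function qstampsFromClass(gElement) {
        var parse = function(s) { return parseFloat(s.replace("d", ".")); };
        var on = null, off = null;
        gElement.classList.forEach(function(cls) {
            if (cls.indexOf("noteon-") === 0) { on = parse(cls.slice(7)); }
            if (cls.indexOf("noteoff-") === 0) { off = parse(cls.slice(8)); }
        });
        return { on: on, off: off }; // e.g. { on: 3.5, off: 4 }
    }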


Then to assign a physical time, I have a lookup table from "qstamp" (quarter-note timestamp) to "tstamp" (physical timestamp in seconds):


{
    "work-title":   "Chopin, Etude in F minor, Op. 10, No. 9",
    "performer":    "Mehmet K. Okonsar",
    "audio":    { "file":"http://imslp.org/images/7/73/PMLP01969-Complete-Opus10_128Kbps.mp3", "type":"audio/mpeg" },
    "timemap": [
    {"qstamp":0,    "tstamp":1215.498},
    {"qstamp":1.5,  "tstamp":1216.505},
    {"qstamp":3,    "tstamp":1217.303},
    {"qstamp":4.5,  "tstamp":1218.102},
    {"qstamp":6,    "tstamp":1218.873},
    {"qstamp":7.5,  "tstamp":1219.617},
    {"qstamp":9,    "tstamp":1220.486},
    {"qstamp":10.5, "tstamp":1221.564},
    {"qstamp":12,   "tstamp":1222.628},
    {"qstamp":13.5, "tstamp":1223.523},
    {"qstamp":15,   "tstamp":1224.307},
    {"qstamp":16.5, "tstamp":1225.127},
    {"qstamp":18,   "tstamp":1225.919},
    {"qstamp":19.5, "tstamp":1226.646},
    {"qstamp":21,   "tstamp":1227.503},
    {"qstamp":22.5, "tstamp":1228.533},
    {"qstamp":24,   "tstamp":1229.863},
    {"qstamp":25.5, "tstamp":1230.702},
    {"qstamp":27,   "tstamp":1231.433},
    {"qstamp":28.5, "tstamp":1232.155},
    {"qstamp":30,   "tstamp":1232.907},
    {"qstamp":31.5, "tstamp":1233.636},
    {"qstamp":33,   "tstamp":1234.564},
    {"qstamp":34.5, "tstamp":1235.524},
    {"qstamp":36,   "tstamp":1236.461},
    {"qstamp":37.5, "tstamp":1237.3},
    {"qstamp":39,   "tstamp":1238.059},
    {"qstamp":40.5, "tstamp":1238.799},
    {"qstamp":42,   "tstamp":1239.536},
    {"qstamp":43.5, "tstamp":1240.295},
    {"qstamp":45,   "tstamp":1241.333},
    {"qstamp":46.5, "tstamp":1242.396},
    {"qstamp":48,   "tstamp":1243.765},
    {"qstamp":49.5, "tstamp":1244.974},
    {"qstamp":51,   "tstamp":1245.994},
    {"qstamp":52.5, "tstamp":1246.92},
    {"qstamp":54,   "tstamp":1247.781},
    {"qstamp":55.5, "tstamp":1248.625},
    {"qstamp":57,   "tstamp":1249.495},
    {"qstamp":58.5, "tstamp":1250.429},
    {"qstamp":60,   "tstamp":1251.593},
    {"qstamp":61.5, "tstamp":1252.603},
    {"qstamp":63,   "tstamp":1253.508},
    {"qstamp":64.5, "tstamp":1254.416},
    {"qstamp":66,   "tstamp":1255.301},
    {"qstamp":67.5, "tstamp":1256.136},
    {"qstamp":69,   "tstamp":1256.972},
    {"qstamp":70.5, "tstamp":1257.837},
    {"qstamp":72,   "tstamp":1258.691},
    {"qstamp":73.5, "tstamp":1259.603},
    {"qstamp":75,   "tstamp":1260.368},
    {"qstamp":76.5, "tstamp":1261.142},
    {"qstamp":78,   "tstamp":1261.977},
    {"qstamp":79.5, "tstamp":1262.707},
    {"qstamp":81,   "tstamp":1263.396},
    {"qstamp":82.5, "tstamp":1263.877},
    {"qstamp":84,   "tstamp":1266.069},
    {"qstamp":85.5, "tstamp":1267.023},
    {"qstamp":87,   "tstamp":1268.589},
    {"qstamp":88.5, "tstamp":1269.802},
    {"qstamp":90,   "tstamp":1271.333},
    {"qstamp":91.5, "tstamp":1272.284},
    {"qstamp":93,   "tstamp":1273.699},
    {"qstamp":94.5, "tstamp":1274.967},
    {"qstamp":96,   "tstamp":1276.569},
    {"qstamp":97.5, "tstamp":1277.629},
    {"qstamp":99,   "tstamp":1278.874},
    {"qstamp":100.5,    "tstamp":1280.353},
    {"qstamp":102,  "tstamp":1281.831},
    {"qstamp":103.5,    "tstamp":1282.842},
    {"qstamp":105,  "tstamp":1284.349},
    {"qstamp":106.5,    "tstamp":1285.863},
    {"qstamp":108,  "tstamp":1288.178},
    {"qstamp":109.5,    "tstamp":1289.141},
    {"qstamp":111,  "tstamp":1289.85},
    {"qstamp":112.5,    "tstamp":1290.653},
    {"qstamp":114,  "tstamp":1291.359},
    {"qstamp":115.5,    "tstamp":1292.143},
    {"qstamp":117,  "tstamp":1292.941},
    {"qstamp":118.5,    "tstamp":1293.768},
    {"qstamp":120,  "tstamp":1294.739},
    {"qstamp":121.5,    "tstamp":1295.536},
    {"qstamp":123,  "tstamp":1296.295},
    {"qstamp":124.5,    "tstamp":1297.029},
    {"qstamp":126,  "tstamp":1297.813},
    {"qstamp":127.5,    "tstamp":1298.58},
    {"qstamp":129,  "tstamp":1299.384},
    {"qstamp":130.5,    "tstamp":1300.38},
    {"qstamp":132,  "tstamp":1301.624},
    {"qstamp":133.5,    "tstamp":1302.541},
    {"qstamp":135,  "tstamp":1303.274},
    {"qstamp":136.5,    "tstamp":1304.073},
    {"qstamp":138,  "tstamp":1304.803},
    {"qstamp":139.5,    "tstamp":1305.532},
    {"qstamp":141,  "tstamp":1306.366},
    {"qstamp":142.5,    "tstamp":1307.182},
    {"qstamp":144,  "tstamp":1308.132},
    {"qstamp":145.5,    "tstamp":1309.005},
    {"qstamp":147,  "tstamp":1309.813},
    {"qstamp":148.5,    "tstamp":1310.621},
    {"qstamp":150,  "tstamp":1311.597},
    {"qstamp":151.5,    "tstamp":1312.446},
    {"qstamp":153,  "tstamp":1313.151},
    {"qstamp":154.5,    "tstamp":1313.943},
    {"qstamp":156,  "tstamp":1315.363},
    {"qstamp":157.5,    "tstamp":1316.489},
    {"qstamp":159,  "tstamp":1317.274},
    {"qstamp":160.5,    "tstamp":1317.985},
    {"qstamp":162,  "tstamp":1318.675},
    {"qstamp":163.5,    "tstamp":1319.306},
    {"qstamp":165,  "tstamp":1320.014},
    {"qstamp":166.5,    "tstamp":1320.887},
    {"qstamp":168,  "tstamp":1322.701},
    {"qstamp":169.5,    "tstamp":1324.083},
    {"qstamp":171,  "tstamp":1325.399},
    {"qstamp":172.5,    "tstamp":1326.76},
    {"qstamp":174,  "tstamp":1327.988},
    {"qstamp":175.5,    "tstamp":1329.013},
    {"qstamp":177,  "tstamp":1330.292},
    {"qstamp":178.5,    "tstamp":1331.46},
    {"qstamp":180,  "tstamp":1332.857},
    {"qstamp":181.5,    "tstamp":1334.011},
    {"qstamp":183,  "tstamp":1335.371},
    {"qstamp":184.5,    "tstamp":1336.935},
    {"qstamp":186,  "tstamp":1338.256},
    {"qstamp":187.5,    "tstamp":1339.38},
    {"qstamp":189,  "tstamp":1340.902},
    {"qstamp":190.5,    "tstamp":1342.364},
    {"qstamp":192,  "tstamp":1343.849},
    {"qstamp":193.5,    "tstamp":1344.737},
    {"qstamp":195,  "tstamp":1345.389},
    {"qstamp":196.5,    "tstamp":1346.043},
    {"qstamp":198,  "tstamp":1347.367},
    {"qstamp":199.5,    "tstamp":1349.34}
    ]
},

In this case the performance of the etude is embedded in a longer recording, so the timestamps may seem large, since I am skipping over the previous etudes in the opus on the recording.

If a qstamp is not found in the list, I interpolate based on adjacent qstamps. If you look at the source code, there are more parameters in the lookup table, but these are only for future display of the current measure and beat in the score and are not used in the actual score/audio alignment.
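
A minimal sketch of that lookup (assuming, as in the timemap above, that the entries are sorted by qstamp and there are no repeats):

    // Map a qstamp to a tstamp, interpolating linearly between adjacent
    // timemap entries when the exact qstamp is not listed.
    function qstampToTstamp(timemap, q) {
        var i = 0;
        while (i < timemap.length - 1 && timemap[i + 1].qstamp <= q) { i++; }
        var a = timemap[i];
        var b = timemap[i + 1];
        if (!b || q <= a.qstamp) { return a.tstamp; }
        var frac = (q - a.qstamp) / (b.qstamp - a.qstamp);
        return a.tstamp + frac * (b.tstamp - a.tstamp);
    }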

Once an SVG score is marked up with qstamp on/off times, I can do audio/score alignment with output from any notation software. For example here is the same system being used on graphic notation from another program: www.ccarh.org/haydn/op20n5/mvmt2

For this example, notice that there are repeats. You can see in the embedded lookup table that the tstamps are unique, but the qstamps repeat.

For this example, if you type "1" before clicking on a note, that will play the first repeat version. If you type "2" and click, then the second repeat will be played. For the minuet, there is also a "3" version which is the da capo repeat.

craigsapp commented 7 years ago

Related to https://github.com/humdrum-tools/verovio-humdrum-viewer/issues/14

@lpugin: How can a webpage be aware of the current playback position of the audio so as to align the visual playback when the tempo of the MIDI can change?

rettinghaus commented 7 years ago

We need something like m_currentBpm in the toolkit for tempo changes. Ideally that should come from iomidi. For now we could add ScoreDef m_scoreDef; to the toolkit to get the stored midi.bpm.

craigsapp commented 7 years ago

m_currentBpm will not work in the general sense. You would need to know the history of all previous tempo changes to any given point in the score in order to calculate the current physical time position; otherwise, only a single tempo setting at the start of the music will work.

The best system would be to query the MIDI playback device as to what time it is currently playing at in the file. This would then be sent to verovio, which would reply with the active elements' IDs.

Something similar to the HTML5 <audio> and <video> .currentTime property for MIDI playback:

http://www.w3schools.com/tags/av_prop_currenttime.asp

This is what I use in the qstamp method described above, since I am using <audio> to play back the recordings.

A hack would be to pre-calculate the audio (in the webpage) and then use the <audio> element for playback instead of a real-time MIDI playback system. But that is not optimal, particularly for long works.

What is the state of webmidi? It is probably not available outside of development versions of browsers, and it mostly seems to be aimed at hardware synthesizers on a local computer being accessed via a browser (which would result in a non-standard interface to MIDI across different computer types).

rettinghaus commented 7 years ago

So as long as tempo changes aren't integrated, we can rely on midi.bpm from scoreDef.

While generating the MIDI output we always know the exact moment of time. I don't have a clue, but would it be possible to stream it directly to the player?

craigsapp commented 7 years ago

So as long as tempo changes aren't integrated, we can rely on midi.bpm from scoreDef.

Yes -- as long as the tempo change is at the start of the file (120 bpm would be presumed before any non-initial tempo change). But see my change of view below: I now think the problem is only inside of verovio and not in the MIDI player (if so, a constant tempo change might not work without some small extra code changes in verovio).

While generating the MIDI output we always know the exact moment of time. I don't have a clue, but would it be possible to stream it directly to the player?

You will have to explain your Germanglish :-)

The problem seems to be that verovio does not contain a timemap (or the timemap is set to a constant 120 bpm). When the MIDI file is generated from verovio, you could fill in this timemap. It is not necessary for the MIDI conversion, but it is a reasonable place to put it. Then .getElementsAtTime() should return the correct elements, since it looks like the MIDI player is already set up to handle tempo changes:

    var midiUpdate = function(time) {
        var vrvTime = Math.max(0, 2 * time - 800);
        var elementsattime = vrvToolkit.getElementsAtTime(vrvTime);
        if (elementsattime.page > 0) {
            if (elementsattime.page != page) {
                page = elementsattime.page;
                load_page();
            }
            // ... the excerpt continues with highlighting of
            // elementsattime.notes ...
        }
    };

In other words, it looks like the MIDI player calls midiUpdate with the time that it knows it is currently playing at, and then verovio is asked what elements to highlight in the notation based on that time. So the problem is probably that verovio has a hard-wired timemap. Generating the time-in-seconds values from the MIDI file will be easy: I wrote the underlying midifile library, and I know there is already a time-in-seconds calculator from the tempo messages in it. You (@lpugin) could tell me where to store the timing data. To do this I would need to know how score time is stored, or something like that. And as I want to hear music at the correct tempo as well, I will get to work on it immediately :-)

rettinghaus commented 7 years ago

In LayerElement, m_totalTime is used to store the total time (in MIDI ticks). (Forget my earlier gibberish, had a long day :-))

notator commented 7 years ago

Hi, I'm new here, but not to the subject matter. Brief introduction: I was Karlheinz Stockhausen's copyist 1974-2000, took part in the discussions during the early stages of developing the Web MIDI API, and have written my own SVG score generator [1] and player [2] (the player uses the Web MIDI API). Also, I've been corresponding privately with @lpugin and @ahankinson, and have been watching this repository for some days now. I've understood the role of .mei files in the code at [5], but otherwise MEI and Verovio are pretty much black boxes to me. See also my web site [3] and GitHub [4].

I want to keep this posting as short as possible while getting a few general concepts across. Once the principles are clear, you can take it (or leave it) from there.


I think that when Verovio exports a MIDI file, it should export a parallel, synchronized SVG file at the same time (using the same command). It should also be possible (using a different command) to add the synchronization data to an SVG file that is already synchronized with another file.

I'm assuming that Verovio will continue to write all its @class attributes (except, maybe, one) as at present. Writing the SVG in the following way does not affect the way the graphics can be edited.

The proposed SVG structure is as follows:

score:synchronizedFilesList="URL.mid, URL.mp3": This <svg> attribute will be ignored by applications (e.g. browsers) that are only interested in the graphics. Its value is a list of audio and MIDI files that are synchronized with the graphics in this file. These URLs can be used to actually load the files at runtime. Each file has a corresponding <score:msTimestampList ... /> inside each <g class="eventSymbol" ...>.

<g class="eventSymbol" score:xalignment="1234.5678" ...>: An eventSymbol is a layer component that is an arbitrarily complex graphic element associated with one or more arbitrarily complex temporal events. The simplest graphic would be a single character or path. The simplest temporal (MIDI) event would be a single note. So an eventSymbol's graphics could consist of a single notehead character. But the graphics could also be an arbitrarily complex chord symbol, some mediaeval neume or ligature, a tablature of some kind, etc. The temporal definition of an eventSymbol can also be a sequence of MIDI chords (i.e. a complex ornament), regardless of its graphic representation. The connection between the graphic and temporal definitions of a symbol is determined by the application that creates the file, not by the SVG format. That's important, because it allows new applications to be written that define new notations, or new applications to be written for rare, old notations. And all such notations will be performable by the same performance software, provided the SVG obeys the same contract.

@score:xalignment is the x-coordinate of the eventSymbol's alignment point (the coordinate that is shared by all synchronous eventSymbols). The existence of this attribute means that the performing software does not have to look at the details of the eventSymbol's graphics, and does not have to know which particular notation is in use. The value of score:xalignment is known by the creating software at creation time. It is not clear to me whether the eventSymbol is always a note in MEI/Verovio, or whether it is sometimes a chord. Either way, I think it should be called an eventSymbol (or something similar/shorter) in the exported SVG.

<score:msTimestampList ... />: Each file in the score:synchronizedFilesList has a corresponding <score:msTimestampList ... /> inside each eventSymbol. These elements could be structured differently; it's just a question of style. The individual values are milliseconds here. That's because 1) this file is notation agnostic (qstamps don't exist because there are not necessarily any quarter notes) and 2) the Web MIDI API only uses milliseconds, so it's as well to be consistent with the web. The Web MIDI API also dropped MIDI ticks and all references to tempo. It's up to the performing software to manage performed durations.

Note that it is quite straightforward for JavaScript to create a (timestamp --> eventSymbolElements) dictionary when the above file has loaded.
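
A sketch of that dictionary construction (the msTimestampList structure is deliberately unspecified above, so the "ms" attribute holding a comma-separated value list is purely hypothetical):

    // Build a (ms timestamp -> eventSymbol elements) dictionary from the
    // proposed markup.
    function buildDictionary(svgRoot) {
        var dict = {};
        svgRoot.querySelectorAll("g.eventSymbol").forEach(function(g) {
            var lists = g.getElementsByTagName("score:msTimestampList");
            for (var i = 0; i < lists.length; i++) {
                (lists[i].getAttribute("ms") || "").split(",").forEach(function(ms) {
                    var key = parseInt(ms, 10);
                    (dict[key] = dict[key] || []).push(g);
                });
            }
        });
        return dict;
    }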

I'll stop there. Further discussion would probably be useful. :-)

All the best, James

[1] https://github.com/notator/Moritz
[2] https://github.com/notator/assistant-performer
[3] http://james-ingram-act-two.de
[4] https://github.com/notator
[5] https://github.com/rism-ch/verovio-tutorial/blob/gh-pages/topic09-midi.html#L101-L148
[6] https://www.w3.org/TR/SVG11/extend.html
[7] https://www.w3.org/TR/SVG11/extend.html#PrivateElementsAndAttribute

lpugin commented 7 years ago

My feeling is that you are trying to achieve too much in one go, and this always becomes problematic. That is, to have in one place a graphical representation and synchronisation with an unlimited number of audio files. Also, you seem to want something that is notation agnostic, but at the same time you rely on a /system/staff/layer/event organisation, which makes a strong assumption about the structure of the notation. For me there is an obvious contradiction here.

So I would recommend stepping back and thinking of your proposal as something independent that does one thing: the link between an audio representation and a graphical one. You can certainly use the Verovio output for this (I assume you would refer to the note or chord ids). This would make it much clearer and completely independent from any current implementation, plus it would remove the assumption that the notation has to have a /system/staff/layer structure, since you would be able to point potentially to any graphic in any SVG.

notator commented 7 years ago

@lpugin

My feeling is that you are trying to achieve too much in one go, and this always becomes problematic. That is, to have in one place a graphical representation and synchronisation with an unlimited number of audio files.

I wanted to show that all the problems raised earlier in this thread are, in principle, solvable. If you think it would be a good idea to have a simpler solution that can only synchronize one audio file, then that would be easy enough to arrange.

Also, you seem to want something that is notation agnostic, but at the same time you rely on a /system/staff/layer/event organisation, which makes a strong assumption about the structure of the notation. For me there is an obvious contradiction here.

The /system/staff/layer/eventSymbol organisation is not as strong an assumption as it might look. It has to do with the way brains use hierarchic chunking to extract meaning from the marks on a 2-dimensional, page-sized surface. All music notations have eventSymbols (just as all writing systems have words) at the bottom of a hierarchy of containers. Some notations (e.g. for solo instruments) use fewer containers, but these can be fitted into the /system/staff/layer/eventSymbol scheme by assuming one layer per staff and one staff per system. I don't know of any music notation that needs more than three containers on a page. One could, of course, define different contracts for solo and ensemble scores, but I think the advantage of being able to use the same performance software on both outweighs the disadvantage of having redundant containers.

I don't really understand your second paragraph. We seem to agree that my proposal links one or more audio representations with the graphical one, and that Verovio can be used to create such an SVG file, but you lost me on the ids. I didn't use ids in the above SVG code because they are not needed by a performing application (the @score:xalignment values can be used instead, in conjunction with the position in the container hierarchy), but I assume Verovio would still write them, because they would be used by graphics applications (for connecting slurs to noteheads etc.).

lpugin commented 7 years ago

We seem to agree that my proposal links one or more audio representations with the graphical one, and that Verovio can be used to create such an SVG file, but you lost me on the ids.

Yes, your proposal is about the link between an audio representation and a graphical one. What I am suggesting is not to put this in the SVG (Verovio or other), because it does not have to be there. You can very well have this as a separate file. Then you do need the ids to make the link. On that point, there is no plan to remove ids from the SVG output of Verovio, because they are at the core of its concept.

This is how I envisaged solving the issue raised in this thread: add an option to output a JSON file with, for each note id, the note-on and note-off timestamps. Something as simple as:

{
"note1" :[{"on": "000.0"}, {"off": "010.0"}],
"note2" :[{"on": "000.0"}, {"off": "020.0"}]
}

This will solve the issue raised by @k-ljung and also address the fact that currently the MIDI playback requires many calls to getElementsAtTime().

lpugin commented 7 years ago

Since it will mostly be used to highlight notes, it is probably more useful to have it flipped:

{
    "on": {
        "000.0": ["note1", "note2"]
    },
    "off": {
        "010.0": ["note1"],
        "020.0": ["note2"]
    }
}
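
A sketch of how a client might consume such a map on each playback update (the "playing" class name is just an example; a real implementation would index the entries by time rather than scanning all of them on every update):

    // Toggle highlighting from the flipped timemap. `map` is the JSON
    // above; `time` is the current playback time in the same units.
    function updateHighlights(map, time) {
        Object.keys(map.on).forEach(function(t) {
            if (parseFloat(t) > time) { return; }
            map.on[t].forEach(function(id) {
                var el = document.getElementById(id);
                if (el) { el.classList.add("playing"); }
            });
        });
        Object.keys(map.off).forEach(function(t) {
            if (parseFloat(t) > time) { return; }
            map.off[t].forEach(function(id) {
                var el = document.getElementById(id);
                if (el) { el.classList.remove("playing"); }
            });
        });
    }
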
notator commented 7 years ago

Understood. You could also export different json files to link to different audio files. :-)

Which units are you using for the numbers? I'd prefer integer milliseconds on the web, and for linking to audio recordings. As I said, the Web MIDI API only uses milliseconds. It has dropped the concepts of "quarter-note", "beat", "tempo" and "MIDI-tick", so asking your consumers on the web to use qstamps is a bit of a problem.

Your solution is fine if you just want to highlight the performing notes. A more powerful user interface would allow users to set performance-start and performance-end markers so as to study a particular section of the score. That's something that can't be implemented if the performing app's programmers have no access to each eventSymbol's @score:xalignment position. A performing app should not have to access the eventSymbol's graphics at all, but even if it did (and we restricted this to CWMN), the alignment can't be extracted reliably from the graphics. So Verovio should simply save the value as an eventSymbol attribute. Small effort, big effect. BTW: <g class="rest" ...> also belongs in my proposal with a @score:xalignment attribute. The only difference between a rest and an eventSymbol (according to my definitions above) is that a rest points at silence. If the @score:xalignment coordinate is available, a running cursor can also be implemented. Users are used to seeing such running cursors in music editors and DAWs, and it would be nice to keep them happy.

Edit: Could you export the x-alignment coordinate of each note (or chord or rest) in your json file?

ahankinson commented 7 years ago

I know I'm re-hashing our offline conversations, but I'll bring it up again: Why not use MEI?

notator commented 7 years ago

I think it's possible for Verovio to define one (or more) synchronised SVG format(s) that could be created by any program (for example, any music notation editor) that might want to do the same thing, and I think that would be well worth doing. Then the apps that take such SVG files as input would be universally interoperable (to use some nice jargon). I'm talking about defining one (or more) standard interface(s) between SVG producers and consumers, not about the capabilities of any one SVG producer. Incidentally, I think such standardized SVG formats would be especially useful for archiving scores in an application-independent way. The formats I have in mind only use universally accepted standards. That ought to be of interest to lots of MEI users...

lpugin commented 7 years ago

One question: what applications currently use this kind of synchronised SVG that would benefit from a standardisation effort?

lpugin commented 7 years ago

For archiving, you had better use the MEI itself rather than a graphical representation of it.

notator commented 7 years ago

Other applications: All music notation editors can, in principle, export both SVG and MIDI. The most important ones (Finale, Sibelius, Dorico, MuseScore) already do, but they don't synchronize the graphics with the MIDI. Neither do they add class information to the SVG. As far as I know, Verovio is the only program that does (apart from my Moritz, which we can ignore here). Adding music-notation-specific class info to SVG files is so useful for client applications that all music-score creation programs ought to do it (in a standard way).

I know that more than one of the above applications uses the Qt SVG exporter. If there were standard ways to export SVG files representing music scores (synchronized or not), then it would also be possible to write specialized software modules that such programs could use instead of the Qt SVG exporter.

Archiving: You're right. MEI and Verovio's SVG output shouldn't be confused. Converting a strongly typed SVG back to MEI can probably be done, but I imagine that the resulting MEI file would not contain all the info contained in the original MEI file. For example, the MEI customisation info gets lost when creating the SVG. Metadata: SVG files have quite powerful metadata capabilities, and Verovio could probably add all of MEI's metadata to the SVGs it creates, but I doubt that all music notation creating apps can be relied on to fill in all the fields correctly and in such detail! The answer might look something like this:

lpugin commented 7 years ago

So basically only your tool is meant to read this. The tools you are listing (Finale, Sibelius, Dorico, etc.) all export SVG as Verovio does, namely as graphical output. Indeed, Verovio distinguishes itself by preserving a semantic structure. It is quite different in that regard, but this structure is designed as an interaction layer, and nothing more. By no means is it meant to become an archival format, because it will never be semantically rich enough. I also do not expect any music notation software to eventually read or import it. That would be completely wrong to me, because they should import MEI instead. Reading the SVG output of Verovio would basically be the equivalent of an optical music recognition process with pre-extracted and pre-labelled graphic primitives, or, if you prefer, the equivalent of importing the Score format. This is not trivial.

I am all in favour of standards, but for a standard to be developed there has to be a need for standardisation. It is not obvious there is one beyond your own application. Also, if you are looking at developing a standard, it would be wrong to proceed by starting with any sort of implementation, be it with Verovio or anything else. This would heavily bias it, even more so with Verovio, because it is directly bound to MEI. Your standard would indirectly be influenced by the development of MEI, which would be quite bad, because it seems you have different needs than what MEI offers.

Now if you are looking at implementing a tool, Verovio is all there for you to do it, including being modified as you need -- you can fork it if you want to make your changes available to others. This is the nice side of open-source and community-based development ;-)

craigsapp commented 7 years ago

The Web MIDI API also dropped MIDI ticks and all references to tempo. It's up to the performing software to manage performed durations.

That is unfortunate for interfacing with musical scores and for score/audio alignment. Real-time is the representation of time in MIDI rendering, and the implementers/users understandably want the interface to match other real-time systems such as the <audio> and <video> elements. However, MIDI files are encoded in score-time (ticks), so they are internally using score-time, and the Web MIDI API has knowledge of this information regardless of whether it is made accessible or not. Of course, MIDI files can represent real-time directly (SMPTE), and there is no requirement that the ticks be used as score-time, such as when a musician records notes as they perform on a MIDI keyboard. In that case "real-time" and "tick-time" are treated as equivalent.

Why should verovio be responsible for managing real-time events if a system such as Web MIDI does not handle the converse case of score-time events?

For audio-recording-to-score alignment, your system seems good for cases where the real-timestamps are static, and I cannot think of any potential problems (other than mostly trivial complexities due to repeats). This would make it suitable for embedding alignments to audio/video recordings.

But MIDI real-times are not static, as they are influenced by tempo-change meta messages. When changing a tempo message in a MIDI file, nothing else in the MIDI file changes other than the insertion of a tempo change event (likewise in a graphical musical score). However, the real-time positions of all events after the tempo change will be altered.

If verovio were to manage real-time positions of MIDI files, then any external change of tempo in the MIDI file would invalidate the real-timestamps in the SVG output from verovio. Linking between the MIDI file and the SVG would then be fragile, and therefore dependent on verovio and the underlying MEI data remaining accessible for recalculating updated real-timestamps. This would also require the SVG to be re-inserted into the DOM, which would mess up the optimization that I do in JavaScript, where I store references to the SVG elements in the timemap rather than their xml:ids.

A general concern that should be addressed: suppose you want to create a web interface where the user can control the tempo (or more precisely a tempo scaling) of the MIDI rendering? There is no practical way of implementing that within verovio or the SVG output. A complicated hack could be done, but the obvious thing would be to send the tempo scaling information to the Web MIDI API. But any change in tempo scaling is equivalent to inserting a tempo change at the current point in the musical-score/MIDI file, and thus all real-time positions in the SVG would be invalidated. Any solution on the verovio side would be incredibly inefficient and noticeable if the user were to continuously move a tempo-scaling slider on the webpage.

An interesting solution would be if the Web MIDI API were allowed to import SVG files of graphical music notation which contain the equivalent of a MIDI file. There should also be a real-time tempo scaling interface built into Web MIDI API, as this would not be practical to implement in MEI, verovio or SVGs output from verovio.

Semi-related: there is a callback system in Web MIDI which triggers whenever a note-on or note-off occurs in the MIDI file. This would be the proper way to do score/MIDI alignment, and it works similarly to the Wild Web MIDI interface that is currently being used with verovio data. The <audio> and <video> HTML elements do not have a good callback system for updates related to highlighting notes, with the update callback not guaranteed or known to run at a specific rate. The update interval is guaranteed to be no more than 250 ms, and web browsers all implement it at this upper limit, which is insufficient for score/audio alignment.


That's because 1) this file is notation agnostic (qstamps don't exist because there are not necessarily any quarter notes) and 2) the Web MIDI API only uses milliseconds, so it's as well to be consistent with the web.

Obviously you come from an electronic-music point of view. qstamps or ticks (score-time) are the more generalized form of representing time in musical scores, not the other way around with real-time. Having a score-time based system implement a real-time API, just because systems designed for real-time implement one, is not really a consistency argument. It can be done, but there are limitations in doing so. Implementing something with limitations is fine as long as those limitations are known beforehand, but don't be surprised when an implementation cannot handle cases outside of those limitations. Making a system containing limitations a "standard" is a bad thing in general. This will result in the need for another "standard" which can handle those limitations, or which tries to be more generalized to handle all the limitations in another standard. Hence the existence of thousands of musical score representation systems: https://xkcd.com/927

Note that the qstamps I describe above in this thread are floating-point numbers. You could use real-time stamps there if you want; there is no particular need for the score-time to be described in quarter-note units. The qstamps are analogous to IDs into the score, and the tstamps are analogous to IDs into the audio. So the fact that I call them qstamps is beside the point, and they do not need to be in units of quarter-note durations. If the score does not have repeats, substituting tstamps for qstamps does not create problems. But when there are repeats, the tstamps cannot represent IDs in the score, since multiple tstamps refer to one qstamp.

So, a problem when interfacing scores and audio is dealing with repeats. Any solution done within verovio and the SVG output must be able to handle repeats. In my alignment implementation, "qstamps" are a monotonic sequence of event times in the score, and "tstamps" are monotonic sequences of the events in an audio rendering of the score. If there are no repeats, then there are no problems, as both can be described monotonically in a timemap. But when there are repeats, the qstamps are no longer monotonic against the tstamps. My timemaps are indexed by real-timestamps, since those must always be monotonically increasing, while the qstamps are allowed to jump around to any value. This allows for efficient lookup of qstamps given a particular tstamp, and makes the timemap agnostic of the repeat structure in the MEI/SVG data. Reversing the indexing to qstamps works, except that it would have to be reversed back into indexing by tstamps in order to look up the score-time position at a given real time. And this reversal is more complicated, since there is not a one-to-one mapping between qstamps and tstamps due to repeats in the score.
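
A sketch of that lookup direction (binary search on the monotonic tstamps; the qstamps are free to jump backwards at repeats):

    // Find the qstamp in effect at real time t. Assumes the timemap
    // entries are sorted by tstamp (always monotonic), not by qstamp.
    function qstampAtTime(timemap, t) {
        var lo = 0, hi = timemap.length - 1;
        while (lo < hi) { // find the last entry with tstamp <= t
            var mid = Math.ceil((lo + hi) / 2);
            if (timemap[mid].tstamp <= t) { lo = mid; } else { hi = mid - 1; }
        }
        return timemap[lo].qstamp;
    }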

Also, "score-time" can go back in time (such as to a previous page when repeating), but "real-time" cannot. How should repeats be handled when an SVG element is turned on on one page and turned off on another? In other words, an SVG file with repeats can refer to an element in another SVG file, which could be on any previous page in the score. For non-repeat cases there is still a problem, but at least the next page is obviously where the element should be found.

It is important to conceptualize that there are two ways of describing time. "real-time" is expressed in seconds/milliseconds. "score-time" is described in virtual units, such as quarter notes, or ticks in MIDI files. Score-time can be represented in real-time in certain cases, but there are limitations in doing that for general cases, which in this case involves MIDI.

An analogy between score and real times can be made in programming: "Real-time" is like a variable, and a "score-time" would be like a pointer to that variable. When you change the tempo, the "score-time" pointers would remain unchanged, but the actual "real-time" variables would change.

notator commented 7 years ago

@lpugin

So basically only your tool is meant to read this.

No. The above proposal shows how I would write SVG that can be synchronized at runtime with one or more external MIDI or audio files. However, following @craigsapp's comments, I'm no longer sure that Verovio's midiplayer and vrvToolkit have really solved the underlying problem, so that approach may not be viable after all. My own producing and consuming apps use a completely different approach. We'll just have to wait and see where further discussion leads. See my answer to @craigsapp below.

The tools you are listing (Finale, Sibelius, Dorico, etc.) all export SVG as Verovio does, namely as graphical output. Indeed, Verovio distinguishes itself by preserving a semantic structure. It is quite different in that regard, but this structure is designed as an interaction layer, and nothing more.

The semantics are vitally important when it comes to programming rich client applications. That's true whoever wrote the SVG.

I am all in favour of standards, but for a standard to be developed there has to be a need for standardisation.

There are different kinds of standards. I think Verovio is free to define contract(s) between SVG producers and consumers that other applications can use if they want to. Such contracts can become de-facto standards if other apps adopt them. In other words, Verovio does not need to wait for proof of a "need for standardisation", before proceeding to develop contracts that it finds useful for itself. Such contracts can change/improve over time, if and when other apps become involved.

Now if you are looking at implementing a tool, Verovio is all there for you to do it, including being modified as you need -- you can fork it if you want to make your changes available to others. This is the nice side of open-source and community-based development ;-)

I'd like to help you develop contracts that other applications can use, but I currently think that would best be done by staying on the client side of the interface. There are other complications that get in the way of my working on a Verovio fork: tooling (I haven't programmed in C++ for quite a while), other priorities (I'm well into a new demo composition for my Assistant Performer and need to finish it), my age... But I wouldn't rule the idea out completely. Maybe next year some time...


@craigsapp

First, here's a brief summary of the Web MIDI API. It's really very simple.
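
In code, using it looks roughly like this (a minimal sketch; the calls are the real Web MIDI entry points, with error handling omitted):

    // Request access, take the first available output port, and send raw
    // MIDI bytes with DOMHighResTimeStamps in milliseconds.
    navigator.requestMIDIAccess().then(function(access) {
        var output = access.outputs.values().next().value;
        var now = performance.now();
        output.send([0x90, 0x3C, 0x40], now);        // note-on, middle C
        output.send([0x80, 0x3C, 0x40], now + 1000); // note-off, 1 s later
    });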

That's about it. There is no way that the Web MIDI API is going to change. Browsers have to be very economical with their code. It's just not designed for playing Standard MIDI Files.

......................... Synchronizing with audio/video

For audio-recording-to-score alignment, your system seems good for cases where the real-timestamps are static, and I cannot think of any potential problems (other than mostly trivial complexities due to repeats). This would make it suitable for embedding alignments to audio/video recordings.

Maybe there should be a separate, simple contract for synchronizing with audio/video files?

......................... Synchronizing using XML embedded in the SVG

An interesting solution would be if the Web MIDI API were allowed to import SVG files of graphical music notation which contain the equivalent of a MIDI file.

"Graphical music notation" includes all music notations that have eventSymbols. That includes CWMN. :-)

There should also be a real-time tempo scaling interface built into Web MIDI API, as this would not be practical to implement in MEI, verovio or SVGs output from verovio.

The Web MIDI API is cast in stone! Speed changes can be done in Javascript (see below). :-)

My SVG scores are equivalent to MIDI files, except that the "tempo" is a constant 1000 Hz, and there is no such thing as beat subdivision. Durations are integer milliseconds. So I'd like to go into a bit of detail here. I'm not trying to sell you a finished product, but I think it would be helpful if you understood the principles.

The eventSymbols contain XML defining MIDI events, so I can create MIDI messages that are directly connected to the graphics of the eventSymbols (which happen to look like CWMN chord symbols, and can be beamed). In my (experimental) format, duration symbols (rests and chords) have an msDuration (not an msPosition relative to the start of a performance). The msDurations are actually only the default durations: they are the durations which happen when the score is played as written.

suppose you want to create a web interface where the user can control the tempo (or more precisely a tempo scaling) of the MIDI rendering? There is no practical way of implementing that within verovio or the SVG output.

I prefer the word speed rather than tempo here. The durations can, of course, be changed in JavaScript to change the speed at which the score plays back. In my Assistant Performer there is only a control for changing the performance speed globally, before the performance begins, but there's nothing to stop me implementing a speed control that the user could vary during a performance.

Any solution [...] must be able to handle repeats.

Repeats are not currently implemented in my solution, but it could be done by having a list of msPositions in each eventSymbol (as in my original proposal, above in this thread) rather than each symbol just having a simple msDuration. The [timestamp->eventSymbol] dictionary (created when the score loads) could then ensure that the performing eventSymbol is always on screen. Automatic scrolling is already implemented to keep the running cursor on the page, and it makes no real difference whether the page has to be scrolled up or down.

How should repeats be handled when an SVG element is turned on on one page and turned off on another?

My Assistant Performer scrolls automatically perfectly well, even if the elements are on different pages. It's just a question of calculating their vertical position with respect to the entire list of pages when the pages are placed vertically end to end. If we are about to scroll to a previous position, then a check would have to be made to see which elements first have to be turned off. Doing prima volta / seconda volta differences would get a bit fiddly, but I'm sure it's solvable -- even if it means using specialised attributes and repeat counting.

The main disadvantage of this approach is that it depends on defining an XML for the MIDI info.

......................... Synchronizing with raw MIDI info.

The <audio> and <video> HTML elements do not have a good callback system for updates related to highlighting notes, with the update callback not guaranteed or known to run at a specific rate. The update interval is guaranteed to be no more than 250 ms, and web browsers all implement it at this upper limit, which is insufficient for score/audio alignment.

As I said above to @lpugin, I'm not sure if it's really possible to synchronize an SVG score with an external Standard MIDI File being played in HTML.

Possibly, a solution like mine could be used (see above), but with the MIDI messages included in numeric form in the eventSymbols rather than being the result of parsing XML. Each eventSymbol would simply be given a sequence of midiMoments like this:

    <g class="eventSymbol" score:xalignment="1234.5678" ...>
        <score:midiMoments>
            <midiMoment msDuration="500">
                <midiMessage status="0x90" data1="0x3C" data2="0x40" />
                <midiMessage status="0x90" data1="0x40" data2="0x40" />
                <midiMessage status="0x90" data1="0x43" data2="0x40" />
                <!-- more midiMessages can go here -->
            </midiMoment>
            <!-- more midiMoments can go here -->
        </score:midiMoments>

        <!-- the eventSymbol's graphics go here -->
    </g> <!-- end of eventSymbol --> 

Where msDuration is the number of milliseconds that should elapse until the following midiMoment. The midiMessage attributes can be read into a Uint8Array and sent directly to the output device. It would, of course, be possible to express this XML more succinctly, and in a form that would be quicker to parse. This example is just to get the idea across.
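
A sketch of that reading step (the element and attribute names come from the proposal above, not from an existing format):

    // Read one midiMoment's messages into Uint8Arrays and send them to a
    // Web MIDI output port at time `when` (in milliseconds).
    function sendMidiMoment(momentElement, output, when) {
        var messages = momentElement.getElementsByTagName("midiMessage");
        for (var i = 0; i < messages.length; i++) {
            var m = messages[i];
            var bytes = new Uint8Array([
                parseInt(m.getAttribute("status"), 16),
                parseInt(m.getAttribute("data1"), 16),
                parseInt(m.getAttribute("data2"), 16)
            ]);
            output.send(bytes, when);
        }
    }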

The advantage of starting with numeric MIDI info, rather than an XML, is of course that the standard already exists. The disadvantage is, as always when using binary or numeric info, that apps get more difficult to debug...

I see a possible problem with file-bloat where Continuous Controller info is involved. Maybe some optimisation could be done there...

Note that these solutions depend both on the Web MIDI API and on having an output device. The Web MIDI API is implemented natively in Chrome, Opera and Android, and seems to be fairly close to completion in Firefox. Software output devices are a bit thin on the ground, since the standard interface for them still has to be discussed. That's coming in v2 of the Web MIDI API. Nevertheless, I've made one of my own to be getting on with. Could the Verovio midiplayer be converted into such a MIDI output device? I don't know how powerful it is. Can it load soundfonts? Can it sound like anything other than a piano? Play more than one sound at a time?

BTW: If you want to test the Assistant Performer by playing the Pianola Music score, please excuse all the ledger lines. I could have put lots of clef changes in, but decided against it. The speed can be changed (globally) on the app's first page. :-) It's a real pity that Study 2 currently needs a plugin output device...

serejahh commented 7 years ago

hi @craigsapp. Could you tell me how I could get the qstamp and tstamp from MusicXML?

craigsapp commented 7 years ago

Do you mean how to extract them from a MusicXML file directly, or via verovio? I will assume the former for now. Currently it is not easy to export into MEI, but when the mapping of @type in MEI to @class in SVG is implemented, they can be passed through as class data (and/or the score: system that @notator describes above could also be implemented to pass this data to the SVG image).

MusicXML files have a double description of rhythmic values, and one of these descriptions behaves in a similar manner to MIDI ticks. From this system you can calculate qstamps moderately easily.

Firstly, there is an element in each MusicXML part called <divisions>. This element is usually inside the first <attributes> of a part. The divisions element contains an integer which indicates how many ticks (duration units) occur per quarter note.

Here is an example case where the divisions value (ticks-per-quarter-note) is set to 4:

https://github.com/craigsapp/musicxml2hum/blob/3bee5ecd03949674292768ec2130d2910a1d9247/tests/sixteenths2.xml#L81

<attributes>
        <divisions>4</divisions>
        <key>
          <fifths>2</fifths>
          </key>
        <time symbol="common">
          <beats>4</beats>
          <beat-type>4</beat-type>
          </time>
        <clef>
          <sign>G</sign>
          <line>2</line>
          </clef>
</attributes>

Items that contain <duration> elements use that divisions value to describe the duration of the element in terms of quarter notes. Here is an example note:

https://github.com/craigsapp/musicxml2hum/blob/3bee5ecd03949674292768ec2130d2910a1d9247/tests/sixteenths2.xml#L106-L116

      <note default-x="99.59" default-y="-15.00">
        <pitch>
          <step>C</step>
          <alter>1</alter>
          <octave>5</octave>
          </pitch>
        <duration>4</duration>
        <voice>1</voice>
        <type>quarter</type>
        <stem>down</stem>
        </note>

The line <duration>4</duration> gives the duration of the note in "division" units. In other words, the duration is 4 and the ticks-per-quarter (divisions) is 4, so this note is one quarter note: 4/4 = 1 quarter note. (You can also see here the second duration system, which is visually based, saying in the <type> element that this note should look like a quarter note. In theory these two systems are equivalent, but Sibelius MusicXML export adds round-off error to the tick system.)

To generate qstamps (quarter-note units since the start of the music), you read through the part, keeping track of the duration of each note or other duration-containing element. If you do not divide by the divisions, the result could be called a "tickstamp" (similar to the tick values used in MIDI files). In theory, the <divisions> value can change within a part, but I have never seen a MusicXML file that does that. If divisions ever does change, then the qstamp is more important than the tickstamp; otherwise they are equivalent, with a factor of "divisions" scaling between them.

As you read through the part, there are two main complications:

(1) Skip the <duration> data for any note containing <chord/>. These are secondary chord notes and do not progress the time pointer (only the first note in a chord does).

(2) Particularly in piano music, there are <backup> and <forward> command elements which move the time pointer backwards and forwards in time. You also have to apply the duration of these elements to the running total.

Example of a backup:

https://github.com/craigsapp/musicxml2hum/blob/3bee5ecd03949674292768ec2130d2910a1d9247/tests/sixteenths.xml#L146-L148

      <backup>
        <duration>16</duration>
        </backup>

In this case you would have to subtract 16 ticks, or 4 quarter notes, from the running total, as the backup moves the time pointer back in time so that another voice can be written. To be robust, you should check that the time pointer points to the expected end of the measure when there are multiple voices in a measure, to avoid problems with under/overfull measures (but don't worry about fixing this until you need to).

One caveat is that Sibelius MusicXML files always have a power-of-two divisions value, even when there are tuplets. This will cause minor round-off errors in the tick/qstamp positions of notes, which is usually not important.
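
Putting the above together, here is a rough sketch of the traversal (JavaScript against a DOM-parsed MusicXML part; the function is mine, just to show the shape of the algorithm):

    // Compute a qstamp for each <note> in one <part>, honouring
    // <divisions>, <chord/>, <backup> and <forward>.
    function qstampsForPart(part) {
        var divisions = 1;  // ticks per quarter note
        var tick = 0;       // running time pointer in ticks
        var lastOnset = 0;  // onset of the previous non-chord note
        var result = [];
        part.querySelectorAll("measure > *").forEach(function(el) {
            var durEl = el.querySelector(":scope > duration");
            var dur = durEl ? parseInt(durEl.textContent, 10) : 0;
            if (el.tagName === "attributes") {
                var div = el.querySelector("divisions");
                if (div) { divisions = parseInt(div.textContent, 10); }
            } else if (el.tagName === "note") {
                if (el.querySelector(":scope > chord")) {
                    // secondary chord note: shares the previous onset
                    result.push({ note: el, qstamp: lastOnset / divisions });
                } else {
                    lastOnset = tick;
                    result.push({ note: el, qstamp: tick / divisions });
                    tick += dur; // grace notes have no <duration>, so dur = 0
                }
            } else if (el.tagName === "backup") {
                tick -= dur;
            } else if (el.tagName === "forward") {
                tick += dur;
            }
        });
        return result;
    }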


To calculate tstamps (physical times in seconds or milliseconds for each qstamp or tickstamp), you would also keep track of all <sound> elements in the part, and note the qstamp/tickstamp at which each one occurs.

Example tempo marking:

https://github.com/craigsapp/musicxml2hum/blob/3bee5ecd03949674292768ec2130d2910a1d9247/tests/sixteenths.xml#L99

<sound tempo="112.5"/>

This means 112.5 quarter notes per minute, so starting at the current qstamp time, one quarter note has the duration of 60/112.5 = 0.533333 seconds, or 533 milliseconds.

Here is the code that I use for calculating tstamps from tickstamps in MIDI files:

https://github.com/craigsapp/midifile/blob/master/src-library/MidiFile.cpp#L2404-L2482

The process would be very analogous to calculating from MusicXML durations/tempo data.

This is the main part of the calculation:

 cursec = lastsec + (curtick - lasttick) * secondsPerTick;

Meaning: The tstamp for the current time is equal to the tstamp of the previous entry plus the duration in ticks from the previous entry times the number of seconds representing 1 tick. The secondsPerTick will be updated each time a new tempo marking is found.

The secondsPerTick is calculated from the tempo marking:

 double secondsPerTick = 60.0 / (tempo * tpq);

Meaning: the number of seconds in one tick is 60 divided by the tempo and also divided by the ticks-per-quarter-note. If the tempo is 112.5 and the divisions (tpq) is 4, then secondsPerTick will be 60 / 112.5 / 4 = 0.133333 seconds, or 133 milliseconds.

You would probably start with the assumption that the tempo is 120 quarter notes per minute as a default if there is no tempo marking at the start of the file. Also, if different parts have different tempo values, that is not possible to process properly (so probably ignore any tempo that conflicts with the tempo in part P1).
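
A sketch of that loop, with the tempo changes gathered from the <sound> elements (same logic as the C++ above; tempoChanges is assumed to be a tick-sorted list of {tick, bpm} objects):

    // Convert a sorted list of tickstamps into tstamps (seconds),
    // starting from the default of 120 quarter notes per minute.
    function ticksToSeconds(ticks, tempoChanges, tpq) {
        var lastTick = 0, lastSec = 0;
        var secondsPerTick = 60.0 / (120 * tpq);
        var i = 0;
        return ticks.map(function(tick) {
            while (i < tempoChanges.length && tempoChanges[i].tick <= tick) {
                lastSec += (tempoChanges[i].tick - lastTick) * secondsPerTick;
                lastTick = tempoChanges[i].tick;
                secondsPerTick = 60.0 / (tempoChanges[i].bpm * tpq);
                i++;
            }
            return lastSec + (tick - lastTick) * secondsPerTick;
        });
    }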


tstamps can be calculated at the same time as qstamps/tickstamps, so you do not need to store qstamps if you do not want to.

serejahh commented 7 years ago

Thanks a lot for such a descriptive response!

pe-ro commented 7 years ago

Currently it is not easy to export into MEI

An MEI encoding can capture the same note-level info as MusicXML, albeit with different names, and with attributes instead of elements --

<note pname="c" oct="5" accid="s" dur="4" dur.ges="4p" stem.dir="down"/>

@dur.ges records the duration in terms of "pulses-per-quarter" instead of MusicXML's "divisions", but the meaning is exactly the same. The number of pulses-per-quarter is variable and can be recorded using the @ppq attribute on a <scoreDef> or <staffDef> ancestor.

@tstamp.ges holds the results of the calculation of the onset time --

<note pname="c" oct="5" accid="s" dur="4" dur.ges="4p" stem.dir="down" tstamp.ges="0.533333s"/>

If there's any difficulty in this, then it lies in MusicXML's use of <forward> and <backup>, not in MEI.

craigsapp commented 7 years ago

Currently it is not easy to export into MEI

OK, it should be easy. I was thinking more about how to get it into the SVG output. Calculating absolute timestamps (cumulative durations from the start of the music) will be more difficult in MEI, but not too much more. In MusicXML (and in MuseData) there is a single time pointer, so calculating the time from the start of the file is stateless (the time of the current element is always calculable from the time of the previous item in the part). MEI has an independent time pointer for each <layer> element.

If there's any difficulty in this, then it lies in MusicXML's use of <forward> and <backup>, not in MEI.

That is not a problem in this case. But the forward/backup system, particularly for piano music, is difficult to translate from MusicXML's staff/voice attribute system into the staff/layer element system in MEI. @rettinghaus: that is one reason I have my own MusicXML to Humdrum converter, and I cannot imagine that it is handled correctly in the direct MusicXML to MEI importer. I just checked, and I see the direct importer indeed cannot handle backup/forward content too well:

[screenshot: output of the direct MusicXML importer]

Compare to the indirect importer via -f musicxml-hum from the same MusicXML input:

[screenshot: output of the indirect importer via -f musicxml-hum]

Intended notation in source:

[screenshot: intended notation in the source]

Being able to handle <backup>/<forward> in my MusicXML to Humdrum converter doubled the complexity of the program...

pe-ro commented 7 years ago

@craigsapp,

Being able to handle <backup>/<forward> in my MusicXML to Humdrum converter doubled the complexity of the program...

Only doubled? 😀

Can you provide the MusicXML source for this example? I'd like to check how the musicxml2mei XSLT deals with it.

craigsapp commented 7 years ago

Here it is:

mazurka17-4.zip

generated from the bitmap score:

mazurka17-4.pdf

pe-ro commented 7 years ago

Thanks.

Here's the markup of m. 43 produced by the musicxml2mei stylesheet --

<measure n="43" xml:id="d1e10658" xmlns="http://www.music-encoding.org/ns/mei"
  xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:svg="http://www.w3.org/2000/svg"
  xmlns:xlink="http://www.w3.org/1999/xlink">
  <staff n="1">
    <layer n="1">
      <beam>
        <note xml:id="d1e10668" pname="b" accid="f" oct="4" dur="8" stem.dir="down">
          <accid accid="f"/>
        </note>
        <note xml:id="d1e10777" pname="a" oct="4" dur="8" stem.dir="down"/>
      </beam>
      <rest xml:id="d1e10796" dur="2" ploc="c" oloc="4"/>
    </layer>
    <layer n="2">
      <beam>
        <note xml:id="d1e10695" pname="b" accid="f" oct="4" dur="8" stem.dir="up">
          <accid accid="f"/>
        </note>
        <note xml:id="d1e10724" pname="e" oct="5" dur="8" stem.dir="up"/>
        <note xml:id="d1e10748" pname="d" oct="5" dur="8" stem.dir="up"/>
      </beam>
      <chord xml:id="d1790e1" dur="4" stem.dir="up" artic="stacc">
        <artic artic="stacc"/>
        <note xml:id="d1e10814" pname="b" accid="n" oct="4">
          <accid accid="n"/>
        </note>
        <note xml:id="d1e10836" pname="g" accid="s" oct="4">
          <accid accid="s"/>
        </note>
      </chord>
      <chord xml:id="d1796e1" dur="4" stem.dir="up" artic="stacc">
        <artic artic="stacc"/>
        <note xml:id="d1e10864" pname="b" oct="4" accid.ges="n"/>
        <note xml:id="d1e10883" pname="g" oct="4" accid.ges="s"/>
      </chord>
    </layer>
  </staff>
  <staff n="2">
    <layer n="1">
      <beam>
        <note xml:id="d1e10907" pname="g" oct="4" dur="8" stem.dir="up" accid.ges="s"/>
        <note xml:id="d1e10933" pname="f" accid="s" oct="4" dur="8" stem.dir="up">
          <accid accid="s"/>
        </note>
        <note xml:id="d1e10961" pname="f" accid="n" oct="4" dur="8" stem.dir="up">
          <accid accid="n"/>
        </note>
      </beam>
      <note xml:id="d1e10989" pname="e" oct="4" dur="4" artic="stacc" stem.dir="up">
        <artic artic="stacc"/>
      </note>
      <note xml:id="d1e11009" pname="e" oct="4" dur="4" artic="stacc" stem.dir="up">
        <artic artic="stacc"/>
      </note>
    </layer>
  </staff>
  <tupletSpan tstamp="1" startid="#d1e10907" num="3" numbase="2" num.format="count"
    tstamp2="0m+1.6667" endid="#d1e10961" staff="1 2"/>
  <beamSpan tstamp="1" startid="#d1e10668" tstamp2="0m+1.6667" endid="#d1e10748" staff="1"/>
  <!--@plist couldn't be added automatically-->
  <dynam label="direction" tstamp="1" place="below" staff="1">pp</dynam>
  <slur tstamp="1" startid="#d1e10668" curvedir="above" tstamp2="0m+1.6667" endid="#d1e10748"
    staff="1"/>
  <tupletSpan tstamp="1" startid="#d1e10695" num="3" numbase="2" num.format="count"
    tstamp2="0m+1.6667" endid="#d1e10748" staff="1"/>
  <slur tstamp="1" startid="#d1e10907" curvedir="below" tstamp2="0m+1.6667" endid="#d1e10961"
    staff="2"/>
  <slur tstamp="2" startid="#d1e10836" curvedir="below" tstamp2="0m+3" endid="#d1e10883" staff="1"/>
  <slur tstamp="2" startid="#d1e10989" curvedir="below" tstamp2="0m+3" endid="#d1e11009" staff="2"/>
</measure>

Aside from the fact that the beams take precedence over the tuplets, it's fine. In fact, it captures the "weirdness" in the stems-down tuplet on beat 1 without attempting to resolve it, which I think is desirable.

Running this through the addTiming stylesheet produces a bunch of hogwash, but this particular stylesheet never got beyond the alpha stage. It would be great if someone took it up again.

k-ljung commented 7 years ago

@craigsapp, thanks for the detailed answer to @serejahh, it is my task to implement this but I have some questions.

Is it enough that I loop through only the notes of the first <part> to create my qstamp lookup table? My MusicXML file contains four <part> elements.

How do I map a <note> from MusicXML to the <g> element in the SVG? I don't have any matching IDs.

Do you think it is a better idea to create a mapping table that contains the notes to be switched on and off at a certain time, as in your example:

{
    "on": {
        "000.0": ["note1", "note2"]
    },
    "off": {
        "010.0": ["note1"],
        "020.0": ["note2"]
    }
}

Then I guess I need to loop through all the <note> elements in all parts?

craigsapp commented 7 years ago

Is it enough that I loop through only the notes of the first <part> to create my qstamp lookup table? My MusicXML file contains four <part> elements.

No, you will have to look at all parts. This is because some parts may have a note at a particular time, while others may not. Here is an example where the two parts sometimes play at the same time and sometimes play by themselves, with the composite rhythm formed by the two parts given underneath:

[screenshot: two parts with the composite rhythm shown underneath]

You probably want the timings of the bottom line, which is the union of the times for each individual part (sketched below).

There will be exceptions to this, but only for special cases -- mainly, if the tempo is constant, you can calculate intermediate physical time stamps based on the qstamp positions of notes, but this probably does not apply to your case.
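
Sketching the union mentioned above (assumed shape: each part is a list of notes with precomputed qstamps):

    function compositeQstamps(parts) {
        const times = new Set();
        for (const part of parts) {
            for (const note of part) {
                times.add(note.qstamp); // collect event times from every part
            }
        }
        return [...times].sort((a, b) => a - b);
    }

    // compositeQstamps([[{qstamp: 0}, {qstamp: 2}], [{qstamp: 0}, {qstamp: 1}]])
    // yields [0, 1, 2]: the bottom-line rhythm formed from both parts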

How do I map a <note> from MusicXML to the <g> element in the SVG? I don't have any matching IDs.

That is a key problem. You are talking about Verovio SVG output, right? And using the direct MusicXML importer? Verovio will add xml:ids based on random numbers into the internally converted MEI data. These ids are matched to the output SVG <g> elements representing that particular MEI element. However, random-number ids are not useful for mapping back to the source MusicXML data. The main thing that I can think of is that the MusicXML importer should allow adding xml:ids in a systematic manner, such as by enumerating all element nodes in the xml tree. Alternatively, ids could be added to the MusicXML (or at least faked if not strictly allowed), and then these would be passed through to the MEI data and then the SVG data. Both of these solutions would involve modifying the current MusicXML importer.

I use a similar solution in the Humdrum to MEI converter. In that case, I embed a line/column number into the MEI/SVG xml:id which is the source location in the original file. From this embedded location, I know where in the Humdrum file the SVG element comes from, and then I can manipulate the SVG image directly from Humdrum instead of using MEI as an intermediary. That is how this example is implemented: http://www.humdrum.org/vhv-demos/recordings Verovio generates the SVG using xml:ids which point to the original Humdrum data locations for the note, and then I construct the timemaps from the Humdrum data rather than the MEI data.

This problem could also be solved by having verovio generate SVG data with embedded qstamps (or tickstamps) for the note on and off times. These could then be calculated independently from the MusicXML data and mapped to a physical time without needing to refer to or rely on the intermediary MEI data. (This is a feature I would like in verovio SVG output.)
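
To make that concrete, suppose (hypothetically) each note <g> in the SVG carried its on/off qstamps as data attributes, e.g. <g class="note" data-qon="3.5" data-qoff="4">. Then the client could find the sounding notes without any MEI or MusicXML at hand:

    function notesAtQstamp(svg, qstamp) {
        return [...svg.querySelectorAll("g.note")].filter((g) => {
            const on = parseFloat(g.dataset.qon);
            const off = parseFloat(g.dataset.qoff);
            return on <= qstamp && qstamp < off; // note is sounding at qstamp
        });
    }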

Do you think it is a better idea to create a mapping table that contains the notes to be switched on and off at a certain time, as in your example:

{
    "on": {
        "000.0": ["note1", "note2"]
    },
    "off": {
        "010.0": ["note1"],
        "020.0": ["note2"]
    }
}

Then I guess I need to loop through all the <note> elements in all parts?

Yes, I do that, although slightly differently I think. You can have a separate timemap for each part if you want, but that is not necessary (and keeping track of separate timemaps will make it more likely to introduce a bug). I have a single timemap for all notes, and also interleave the on- and off-time elements, something like this:

[
   {"tstamp": 0.0, "qstamp": 0.0, "on": ["note1", "note2"], "off": []},
   {"tstamp": 10.0, "qstamp": 1.0, "on": [], "off": ["note1"]},
   {"tstamp": 20.0, "qstamp": 2.0, "on": [], "off": ["note2"]}
]

Then I have only a single time pointer to keep track of. I keep this list sorted, and then keep track of the last time which was used so that I can start searching for the next action time from that point (which is more efficient than searching for a particular timestamp from scratch each time). And if you have a large score embedded on the page, you can also store references to the SVG elements in the time entries rather than just their ids to improve efficiency (but that won't make much difference if you page your SVG dynamically).
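
In sketch form (illustrative names, not verovio API): keep the timemap sorted by time, remember the index where the previous lookup stopped, and only scan forward from there on each animation tick:

    let lastIndex = 0; // position of the previous lookup in the sorted timemap

    function advance(timemap, currentTime) {
        const actions = { on: [], off: [] };
        while (lastIndex < timemap.length && timemap[lastIndex].tstamp <= currentTime) {
            const entry = timemap[lastIndex];
            actions.on.push(...(entry.on || []));
            actions.off.push(...(entry.off || []));
            lastIndex++;
        }
        return actions; // ids of notes to turn on/off since the last call
    }

Reset lastIndex to 0 when playback restarts or the user seeks backwards.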

k-ljung commented 7 years ago

@craigsapp, from my point of view it would be best if Verovio could generate the timestamp JSON structure as you suggested earlier in the thread. Do you think this is possible, and is this something you could implement?

From what I understand, most of the functionality is already implemented in Verovio?

craigsapp commented 7 years ago

@craigsapp, from my point of view it would be best if Verovio could generate the timestamp JSON structure as you suggested earlier in the thread. Do you think this is possible, and is this something you could implement?

That would be a good academic exercise for me. I have seen a JSON C++ library in the verovio source code, but I have not paid any attention to it yet. Laurent could give me an outline of how to proceed (particularly in how to output the JSON data to JavaScript, which I think is now done with JSON.stringify handled by the toolkit), and I should be able to figure it out.

We can negotiate an output format in the meantime. Here is my current proposal:

[
   {"qstamp":0.0,     "tstamp": 0,    "on":["note1", "note2"], "off":[]},
   {"qstamp":1.0,     "tstamp":1000,  "on":[],                 "off":["note1"]},
   {"qstamp":2.0",    "tstamp":2000,  "on":[],                 "off":["note2"]}
]

qstamp is a floating-point number representing quarter note units from the start of the music until the start of the current on/off time.

tstamp is an integer representing time in milliseconds according to tempo messages found within the MEI data (or MM120 otherwise). Is tstamp OK as a name? It has a different meaning as an attribute in MEI elements (where it is usually score-time based rather than real-time based).

Then there would be two arrays of xml:ids, one labeled on and the other labeled off, which contain ids for notes (or possibly other items) to turn on/off at those timestamps. If the on/off array is empty, that parameter should probably be omitted from the entry.

Other possible things to add (which are not essential):

tempo == the tempo starting at a particular timestamp. Including this information makes inclusion of tstamp optional, since qstamp + tempo can be used to derive tstamp. Inclusion of tempo in the list would also allow recalculating the tstamp data if tempos change, without re-generating the timemap from MEI/verovio.

Parameters listing the current measure and beat might be interesting (these could be used to highlight all notes in a particular measure or beat, for example). Measure ids could also be included in the on/off lists.

Perhaps an on/off time for pages should be encoded in the list, either an indication of what page the current timestamp refers to, or an indication of when the page starts and stops in the time map (similar to the on/off of notes).

And how could/should it be linked to the current export of MIDI data? If it is linked to the MIDI export, there could be tick timestamps in addition to or replacing the qstamps, with the tick units matching the ticks-per-quarter in the MIDI file header. In that case a tpq parameter should be given on the first entry in the timemap so qstamps could be calculated.

This data structure could also manage the querying of the verovio data structure internally for a list of elements at a given time.

Grace notes are tricky: they are currently treated as zero-duration items. Using score-time to describe them would have problems if they are rendered properly in the MIDI export with non-zero durations.

On the technical side, I would want to calculate the qstamps as rational numbers (such as "1/3" representing the duration of a triplet eighth note, rather than 0.333, to avoid round-off errors). I have a class called HumNum for that in the Humlib code:

https://github.com/craigsapp/humlib/blob/master/include/HumNum.h https://github.com/craigsapp/humlib/blob/master/src/HumNum.cpp

How would I access that in verovio? It has no dependencies other than STL classes. By duplicating the code into the vrv namespace?
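
The motivation for rationals, in a tiny sketch (not HumNum itself): summing three triplet-eighth durations as fractions gives exactly one quarter note, with no round-off:

    function gcd(a, b) { return b ? gcd(b, a % b) : a; }

    function addFractions([an, ad], [bn, bd]) {
        const n = an * bd + bn * ad;
        const d = ad * bd;
        const g = gcd(n, d);
        return [n / g, d / g]; // reduced numerator/denominator
    }

    let q = [0, 1];
    for (let i = 0; i < 3; i++) {
        q = addFractions(q, [1, 3]); // three triplet eighths
    }
    // q is exactly [1, 1], whereas 0.333 + 0.333 + 0.333 = 0.999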

k-ljung commented 7 years ago

My knowledge in the field of scores and music arrangement is quite limited. I am currently working on a project that will display scores synchronized with one or more music tracks (parts); each music track is an mp3 file.

I came across Verovio and think this is a super interesting project for displaying scores.

Our source for the scores is MusicXML files, and I do not quite understand the difference between MusicXML and MEI. Since Verovio can translate MusicXML to MEI, will a timestamp solution based on MEI also work well with MusicXML?

I do not see any problem with your suggestion for the JSON output.

An important aspect from my point of view is that you can generate both the SVG and the JSON timemap data from the command-line interface.

pe-ro commented 7 years ago

@craigsapp,

I don't want to start a flame war about the relative merits of JSON vs XML, but I would like to point out that something similar to your proposed JSON structure is already available in MEI --

<recording>
  <when absolute="0.0" data="#n01 #n02" type="on" abstype="qstamp"/>
  <when absolute="0" data="#n01 #n02" type="on" abstype="tstamp"/>
  <when absolute="1.0" data="#n01" abstype="qstamp" type="off"/>
  <when absolute="1000" data="#n01" abstype="tstamp" type="off"/>
  <when absolute="2.0" data="#n02" abstype="qstamp" type="off"/>
  <when absolute="2000" data="#n02" abstype="tstamp" type="off"/>
</recording>

Each <when> element represents a point in time. The @absolute attribute records the time value and @abstype states the unit of measurement of the time value. "qstamp" is not a legitimate value at present, but it could be. @type can be used to classify the function of the captured data; that is, as an "on" or "off" value.

What isn't possible is capturing multiple values/value types with a single element. But, this can be compensated for by using multiple <when> elements; that is, a set of elements for tstamp values and another for qstamp values. If interleaving differing value types is undesirable, the values can be grouped by type into multiple <recording> containers --

<recording betype="qstamp">
  <when absolute="0.0" data="#n01 #n02" type="on"/>
  <when absolute="1.0" data="#n01" type="off"/>
  <when absolute="2.0" data="#n02" type="off"/>
</recording>
<recording betype="tstamp">
  <when absolute="0" data="#n01 #n02" type="on"/>
  <when absolute="1000" data="#n01" type="off"/>
  <when absolute="2000" data="#n02" type="off"/>
</recording>

Since the unit of measure is the same for all points within each <recording>, it can be provided on the outer container using @betype, making the encoding somewhat less verbose.

I think the main advantage of this approach is that the data can be meaningfully embedded within the MEI itself, which is not true of JSON.

k-ljung commented 7 years ago

@pe-ro It does not matter to me whether the output is JSON or XML; if the MEI format already supports what I need, that is enough for me.

I converted my MusicXML source file to MEI format but cannot see any <when> elements.

Do I need to add some extra parameters when converting my MusicXML?

pe-ro commented 7 years ago

@k-ljung, Conversion to MEI doesn't automatically create these elements. I leave it to the developers of Verovio to decide if this is something they wish to support. I was merely pointing out that it can be done. :-)

k-ljung commented 7 years ago

After reading the above thread back and forth, I get the feeling that it is possible to calculate and create a timestamp mapping from my MEI file?

@pe-ro, are you able to describe in more detail how I could do this from an MEI file?

craigsapp commented 7 years ago

I don't want to start a flame war about the relative merits of JSON vs XML, but I would like to point out that something similar to your proposed JSON structure is already available in MEI --

I'll get out my asbestos suit... But not to worry: what you want is storage of the timemap in MEI data. This is a separate issue at the moment. @lpugin has been tricking you, because verovio does not internally use an XML or MEI data structure, so there is no place at the moment where I can store the timemap data in verovio's quasi-MEI data structure, and there is also no mechanism (in iomei.cpp) to convert that into MEI content.

After reading the above thread back and forth, I get the feeling that it is possible to calculate and create a timestamp mapping from my MEI file?

I have implemented an initial timemap output for the command line (which I will put onto GitHub soon). The initial implementation's main limitation is that it does not handle <mRest>s (if they occur in all parts on a system) or <multiRest>s. I am calculating the timemap from the quasi-MEI data structure in verovio. So yes: it is possible to calculate and create a timestamp mapping from an MEI file (which is basically what I just did).

craigsapp commented 7 years ago

Here is the commit and documentation for the new timemap feature, which is located in the develop-timemap branch:

https://github.com/rism-ch/verovio/commit/52dec1ea7a2ba6872654a1ba05a5b595a6107201

@lpugin can re-arrange it as he likes. I don't see how functors will work (in relation to mRests and multiRests in particular), nor how to set up the <recording>/<when> storage of the timemap in the MEI output.

pe-ro commented 7 years ago

@k-ljung, Sorry, I'm the "model guy", not a programmer. So, I'll leave the programming guidance to Craig and Laurent. 😄

craigsapp commented 7 years ago

Timemaps as discussed in this thread should now be implemented in a way that @lpugin approves of. The code is currently in the develop-midi-timemap branch (and the develop-humdrum branch), and should be merged soon into the develop branch.

The timemap output shares timing values for notes with the MIDI file creation system, so the numbers from MIDI files and timemap data should match exactly.

Here is how to create JSON timemap files from the command-line:

verovio input.ext -t timemap

This will create a file called input.json which contains the timemap. The timemap can be sent to standard output instead of a file with the command:

verovio input.ext -t timemap -o -

Example input and output:

[screenshot: three-measure example score]

The first measure is at MM60, the second at MM120, and the third at MM121.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://music-encoding.org/schema/3.0.0/mei-all.rng" type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
<?xml-model href="http://music-encoding.org/schema/3.0.0/mei-all.rng" type="application/xml" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<mei xmlns="http://www.music-encoding.org/ns/mei" meiversion="3.0.0">
    <meiHead>
        <fileDesc>
            <titleStmt>
                <title />
            </titleStmt>
            <pubStmt />
        </fileDesc>
        <encodingDesc>
            <appInfo>
                <application isodate="2017-06-23T00:44:41" version="2.0.0-dev-3a11749">
                    <name>Verovio</name>
                    <p>Transcoded from Humdrum</p>
                </application>
            </appInfo>
        </encodingDesc>
        <workDesc>
            <work>
                <titleStmt>
                    <title />
                </titleStmt>
            </work>
        </workDesc>
    </meiHead>
    <music>
        <body>
            <mdiv>
                <score>
                    <scoreDef xml:id="scoredef-0000000868870589" midi.bpm="60">
                        <staffGrp xml:id="staffgrp-0000000745463070">
                            <staffDef xml:id="staffdef-0000000578220892" clef.shape="G" clef.line="2" meter.count="4" meter.unit="4" n="1" lines="5" />
                        </staffGrp>
                    </scoreDef>
                    <section xml:id="section-0000000987219856">
                        <measure xml:id="measure-L4" n="1">
                            <staff xml:id="staff-L4F1N1" n="1">
                                <layer xml:id="layer-L4F1N1" n="1">
                                    <note xml:id="note-L5F1" dur="4" oct="4" pname="c" accid.ges="n" />
                                    <note xml:id="note-L6F1" dur="4" oct="4" pname="d" accid.ges="n" />
                                    <note xml:id="note-L7F1" dur="4" oct="4" pname="e" accid.ges="n" />
                                    <note xml:id="note-L8F1" dur="4" oct="4" pname="f" accid.ges="n" />
                                </layer>
                            </staff>
                        </measure>
                        <measure xml:id="measure-L9" n="2">
                            <staff xml:id="staff-L9F1N1" n="1">
                                <layer xml:id="layer-L9F1N1" n="1">
                                    <note xml:id="note-L11F1" dur="4" oct="4" pname="c" accid.ges="n" />
                                    <note xml:id="note-L12F1" dur="4" oct="4" pname="d" accid.ges="n" />
                                    <note xml:id="note-L13F1" dur="4" oct="4" pname="e" accid.ges="n" />
                                    <note xml:id="note-L14F1" dur="4" oct="4" pname="f" accid.ges="n" />
                                </layer>
                            </staff>
                            <tempo midi.bpm="120"/>
                        </measure>
                        <measure xml:id="measure-L15" right="end" n="3">
                            <staff xml:id="staff-L15F1N1" n="1">
                                <layer xml:id="layer-L15F1N1" n="1">
                                    <chord xml:id="chord-L17F1" dur="4">
                                        <note xml:id="note-L17F1S1" oct="4" pname="c" accid.ges="n" />
                                        <note xml:id="note-L17F1S2" oct="4" pname="d" accid.ges="n" />
                                        <note xml:id="note-L17F1S3" oct="4" pname="g" accid.ges="n" />
                                    </chord>
                                    <chord xml:id="chord-L18F1" dur="4">
                                        <note xml:id="note-L18F1S1" oct="4" pname="d" accid.ges="n" />
                                        <note xml:id="note-L18F1S2" oct="4" pname="f" accid.ges="n" />
                                        <note xml:id="note-L18F1S3" oct="4" pname="a" accid.ges="n" />
                                    </chord>
                                    <chord xml:id="chord-L19F1" dur="4">
                                        <note xml:id="note-L19F1S1" oct="4" pname="e" accid.ges="n" />
                                        <note xml:id="note-L19F1S2" oct="4" pname="g" accid.ges="n" />
                                        <note xml:id="note-L19F1S3" oct="4" pname="b" accid.ges="n" />
                                    </chord>
                                    <chord xml:id="chord-L20F1" dur="4">
                                        <note xml:id="note-L20F1S1" oct="4" pname="f" accid.ges="n" />
                                        <note xml:id="note-L20F1S2" oct="4" pname="a" accid.ges="n" />
                                        <note xml:id="note-L20F1S3" oct="4" pname="c" accid.ges="n" />
                                    </chord>
                                </layer>
                            </staff>
                            <tempo midi.bpm="121"/>
                        </measure>
                    </section>
                </score>
            </mdiv>
        </body>
    </music>
</mei>

The resulting timemap:

[
    {
        "tstamp":   0,
        "qstamp":   0.000000,
        "tempo":    60,
        "on":   ["note-L5F1"]
    },
    {
        "tstamp":   1000,
        "qstamp":   1.000000,
        "on":   ["note-L6F1"],
        "off":  ["note-L5F1"]
    },
    {
        "tstamp":   2000,
        "qstamp":   2.000000,
        "on":   ["note-L7F1"],
        "off":  ["note-L6F1"]
    },
    {
        "tstamp":   3000,
        "qstamp":   3.000000,
        "on":   ["note-L8F1"],
        "off":  ["note-L7F1"]
    },
    {
        "tstamp":   4000,
        "qstamp":   4.000000,
        "tempo":    120,
        "on":   ["note-L11F1"],
        "off":  ["note-L8F1"]
    },
    {
        "tstamp":   4500,
        "qstamp":   5.000000,
        "on":   ["note-L12F1"],
        "off":  ["note-L11F1"]
    },
    {
        "tstamp":   5000,
        "qstamp":   6.000000,
        "on":   ["note-L13F1"],
        "off":  ["note-L12F1"]
    },
    {
        "tstamp":   5500,
        "qstamp":   7.000000,
        "on":   ["note-L14F1"],
        "off":  ["note-L13F1"]
    },
    {
        "tstamp":   6000,
        "qstamp":   8.000000,
        "tempo":    121,
        "on":   ["note-L17F1S1", "note-L17F1S2", "note-L17F1S3"],
        "off":  ["note-L14F1"]
    },
    {
        "tstamp":   6496,
        "qstamp":   9.000000,
        "on":   ["note-L18F1S1", "note-L18F1S2", "note-L18F1S3"],
        "off":  ["note-L17F1S1", "note-L17F1S2", "note-L17F1S3"]
    },
    {
        "tstamp":   6992,
        "qstamp":   10.000000,
        "on":   ["note-L19F1S1", "note-L19F1S2", "note-L19F1S3"],
        "off":  ["note-L18F1S1", "note-L18F1S2", "note-L18F1S3"]
    },
    {
        "tstamp":   7488,
        "qstamp":   11.000000,
        "on":   ["note-L20F1S1", "note-L20F1S2", "note-L20F1S3"],
        "off":  ["note-L19F1S1", "note-L19F1S2", "note-L19F1S3"]
    },
    {
        "tstamp":   7983,
        "qstamp":   12.000000,
        "off":  ["note-L20F1S1", "note-L20F1S2", "note-L20F1S3"]
    }
]

The timemap is an array of JSON objects, with each entry having these keys: tstamp (milliseconds), qstamp (quarter notes from the start), tempo (present only when the tempo changes), and on/off arrays of xml:ids (omitted when empty).

The timemap is also available from the JavaScript toolkit for verovio via the vrvToolkit.renderToTimemap() function. Here is a demo of its usage with the sample data:

[screenshot: browser console demo of vrvToolkit.renderToTimemap()]
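
In code form, the demo does something along these lines (a hedged sketch: the exact return type, a JSON string versus an already-parsed array, may vary between verovio versions):

    const vrvToolkit = new verovio.toolkit();
    vrvToolkit.loadData(meiData); // meiData: the MEI string shown above

    const raw = vrvToolkit.renderToTimemap();
    const timemap = typeof raw === "string" ? JSON.parse(raw) : raw;

    for (const entry of timemap) {
        console.log(entry.tstamp, entry.on || [], entry.off || []);
    }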

lpugin commented 7 years ago

Great! Two comments:

We need to think about providing pagination information. Pagination can change independently from the timemap, so maybe we can have this as a distinct JSON object?

Could you add a page to the other formats section gathering the documentation provided in this commit? Or maybe directly in the midi page (http://www.verovio.org/midi.xhtml)? This would be helpful because otherwise it will be lost.

craigsapp commented 7 years ago

We need to think about providing pagination information. Pagination can change independently from the timemap, so maybe we can have this as a distinct JSON object?

I was thinking about that, and as you say, the pagination can change independently of the timemap, so a separate function to extract the pagination would be good. Maybe something like this:

var pagination = JSON.parse(vrvToolkit.getPaginationTimemap());

which would return something like:

[
    {"tstamp": 0,     "page": 1},
    {"tstamp": 14022, "page": 2},
    {"tstamp": 30253, "page": 3},
    {"tstamp": 35235, "page": 2},
    {"tstamp": 40123, "page": 3},
    {"tstamp": 48149, "page": 4},
    {"tstamp": 69325, "page": 1}
]

Note that the page number can go up or down depending on repeats and da capos and such.
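
A lookup over that (still hypothetical) structure is a simple scan of the sorted list:

    function pageAtTime(pagination, tstamp) {
        let page = pagination[0].page;
        for (const entry of pagination) {
            if (entry.tstamp > tstamp) {
                break; // past the query time; the previous entry's page holds
            }
            page = entry.page;
        }
        return page;
    }

    // pageAtTime(pagination, 15000) yields 2 for the example above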

Could you add a page to the other formats section gathering the documentation provided in this commit? Or maybe directly in the midi page (http://www.verovio.org/midi.xhtml)? This would be helpful because otherwise it will be lost.

I can do that, but I am going to Spain for a week tomorrow, and then other places for the following couple of weeks, so eventually :-)