Working with Nakamura match files

Although it is possible to load Nakamura match files using pt.load_nakamuramatch(matchfile), it is unclear how to get the returned arrays to work correctly with get_time_maps_from_alignment.

I am trying the following approach:

align, _, alignment = pt.load_nakamuramatch(matchfile)

# from importnakamura.py line 114 
perf_dtype = [
        ("onset_sec", "f4"),
        ("duration_sec", "f4"),
        ("pitch", "i4"),
        ("velocity", "i4"),
        ("channel", "i4"),
        ("id", "U256"),
]

note_array = np.array(align, dtype=perf_dtype)
performed_part = pt.performance.PerformedPart.from_note_array(note_array)

# Get score time to performance time map
_, stime_to_ptime_map = get_time_maps_from_alignment(performed_part, score_part, alignment)

However, I get the following error:

line 740, in get_time_maps_from_alignment
    score_onsets = score_note_array[match_idx[:, 0]]["onset_beat"]
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

The docs for performance.from_note_array() may need to be updated, since the documented structured array fields (quoted below) do not match the Partitura codebase: ‘onset_div’ and ‘duration_div’ are not used in the method definition, and 'channel' is in fact a valid field. This caused some confusion, and at one point I created a new structured array with the 'channel' field removed.

note_array(*args, **kwargs) → ndarray[[source]](https://partitura.readthedocs.io/en/latest/_modules/partitura/performance.html#PerformedPart.note_array)
Structured array containing performance information. The fields are ‘id’, ‘pitch’, ‘onset_div’, ‘duration_div’, ‘onset_sec’, ‘duration_sec’ and ‘velocity’.

@classmethod
    def from_note_array(
        cls,
        note_array: np.ndarray,
        id: str = None,
        part_name: str = None,
    ):
        """Create an instance of PerformedPart from a note_array.
        Note that this property does not include non-note information (i.e.
        controls such as sustain pedal).
        """
        if "id" not in note_array.dtype.names:
            n_ids = ["n{0}".format(i) for i in range(len(note_array))]
        else:
            # Check if all ids are the same
            if np.all(note_array["id"] == note_array["id"][0]):
                n_ids = ["n{0}".format(i) for i in range(len(note_array))]
            else:
                n_ids = note_array["id"]

        if "track" not in note_array.dtype.names:
            tracks = np.zeros(len(note_array), dtype=int)
        else:
            tracks = note_array["track"]

        if "channel" not in note_array.dtype.names:
            channels = np.ones(len(note_array), dtype=int)
        else:
            channels = note_array["channel"]

        notes = []
        for nid, note, track, channel in zip(n_ids, note_array, tracks, channels):
            notes.append(
                dict(
                    id=nid,
                    midi_pitch=note["pitch"],
                    note_on=note["onset_sec"],
                    note_off=note["onset_sec"] + note["duration_sec"],
                    sound_off=note["onset_sec"] + note["duration_sec"],
                    track=track,
                    channel=channel,
                    velocity=note["velocity"],
                )
            )

        return cls(id=id, part_name=part_name, notes=notes, controls=None)

Hi @eoinroe! Could you perhaps send the Nakamura match file that causes this issue? I believe the issue might be related to the Note ids, and how the alignment tool by Nakamura et al. uses them in their match file.

In any case, if you are using Nakamura et al.'s tool for alignment, you might be interested in trying a different library, Parangonar, which is currently the SOTA for symbolic alignment and plays better with Partitura.

Hi @neosatrapahereje, yes, I realised the issue is related to the note ids, as you say: the result of match_idx = get_matched_notes(score_note_array, perf_note_array, alignment) was an empty array.
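
For anyone who runs into the same IndexError, it can be reproduced with NumPy alone. This is only a minimal sketch: it assumes, as observed above, that an alignment with no matching note ids yields an empty (and therefore 1-dimensional) index array. The variable names are illustrative and are not taken from the Partitura internals.

```python
import numpy as np

# Assumption: when no note ids match, the matched-index array is empty,
# so it is 1-dimensional instead of having shape (n_matches, 2).
match_idx = np.array([], dtype=int)

try:
    match_idx[:, 0]  # the same kind of indexing as in the traceback
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)

# Checking the shape first makes the real problem (no matched notes)
# visible before any time maps are computed.
has_matches = match_idx.ndim == 2 and match_idx.shape[0] > 0
print(has_matches)
```

A check like has_matches before calling get_time_maps_from_alignment would point to the id mismatch rather than to an indexing problem.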

Nakamura uses this format for note ids: P1-1-1, P1-1-6 etc. This is what he says about them in the Manual for his Symbolic Music Alignment Tool: “The note ID indicates a note in the reference score MusicXML file ex_ref.xml: Px-y-z means the note is the z th note in the y th bar of part x.”
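
To make the id format concrete, here is a small sketch that splits an id of the form Px-y-z into its three numbers. The helper parse_nakamura_id and the NakamuraNoteId tuple are hypothetical names made up for illustration; they are not part of Partitura or of Nakamura's tool, and the pattern is based only on the sentence from the manual quoted above.

```python
import re
from typing import NamedTuple


class NakamuraNoteId(NamedTuple):
    """Hypothetical container for the three components of a Px-y-z id."""

    part: int      # x: part number
    bar: int       # y: bar number within the part
    position: int  # z: position of the note within the bar


# Pattern derived from the manual's description "Px-y-z".
_ID_PATTERN = re.compile(r"^P(\d+)-(\d+)-(\d+)$")


def parse_nakamura_id(note_id: str) -> NakamuraNoteId:
    """Parse an id such as 'P1-1-6'; raise ValueError for anything else."""
    match = _ID_PATTERN.match(note_id)
    if match is None:
        raise ValueError(f"not a Px-y-z note id: {note_id!r}")
    part, bar, position = (int(group) for group in match.groups())
    return NakamuraNoteId(part, bar, position)


parsed = parse_nakamura_id("P1-1-6")
print(parsed.part, parsed.bar, parsed.position)
```

Ids parsed this way could in principle be used to look up notes by bar and position in a score loaded from the same MusicXML file, although I have not verified how closely the counting matches Partitura's own note order.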

So instead of loading the score using pt.load_musicxml() I tried to create it from the information returned by load_nakamuramatch() as you can see in the code snippet below.

# This returns the arrays in a different order from pt.load_match()
align, ref, alignment = pt.load_nakamuramatch(matchfile)

score_dtype = [
    ("onset_div", "i4"),
    ("pitch", "i4"),
    ("id", "U256"),
]

score_part_note_array = np.zeros(len(ref), dtype=score_dtype)

dt = np.dtype(score_dtype)
for field in dt.names:
    # Need to remove various fields from the structured array returned by the Nakamura match file
    score_part_note_array[field] = ref[field]

_, stime_to_ptime_map = get_time_maps_from_alignment(performed_part, score_part_note_array, alignment)

However, I don't think the Nakamura match files give you quite enough information about the score to create a Part properly. For example, pt.musicanalysis.note_array_to_score() accepts a note array with the fields [“onset_div”, “duration_div”, “pitch”], but the score_dtype in load_nakamuramatch() only includes these fields:

score_dtype = [
        ("onset_div", "i4"),
        ("pitch", "i4"),
        ("step", "U256"),
        ("alter", "i4"),
        ("octave", "i4"),
        ("id", "U256"),
    ]

And I get the following error:

line 743, in get_time_maps_from_alignment
score_onsets = score_note_array[match_idx[:, 0]]["onset_beat"]
ValueError: no field of name onset_beat

In any case here is the Nakamura match file that is causing the issue - ex_align1_match.txt

This file was generated by running the C++ Alignment Tool found here

Thanks for the Parangonar recommendation. I discovered it last week after running into these issues and it works great. I thought it was still worth documenting the issues I ran into using Nakamura match files with Partitura for other users.

Hi @eoinroe! Thank you for the issue! First of all: Partitura uses structured arrays to pass around information where a simple "list of notes" suffices. The structured arrays for performances/performedparts and scores/parts are different:

scores: Structured array with fields ‘id’, ‘pitch’, ‘onset_div’, ‘duration_div’, ‘onset_beat’, ‘duration_beat’, ‘onset_quarter’, ‘duration_quarter’, and possibly more depending on flags. Depending on its needs, the receiving function might require some or all of these fields. For example, get_time_maps_from_alignment requires the field onset_beat, which is what causes your second error! As you noticed, creating such note arrays ad hoc is not pain-free, but in your case you could fix it by creating a dummy (float) field from onset_div like so:
```
score_dtype = [
    ("onset_beat", "f4"),
    ("duration_beat", "f4"),
    ("onset_div", "i4"),
    ("pitch", "i4"),
    ("id", "U256"),
]

score_part_note_array = np.zeros(len(ref), dtype=score_dtype)

dt = np.dtype(score_dtype)
for field in dt.names:
    if field == "onset_beat":
        score_part_note_array[field] = ref["onset_div"].astype(float)
    elif field == "duration_beat":
        score_part_note_array[field] = np.ones_like(ref["onset_div"]).astype(float)
    else:
        score_part_note_array[field] = ref[field]
```


Notice that I also created a dummy field `duration_beat`, which the time map function requires.
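
Since missing fields only surface as a ValueError deep inside the call, a quick check up front may help. The following is a minimal sketch using plain NumPy; missing_fields is a hypothetical helper (not a Partitura function), and the list of required fields is limited to the two named in this thread, so other functions may need more.

```python
import numpy as np


def missing_fields(note_array, required):
    """Return the names in `required` that the structured array lacks."""
    names = note_array.dtype.names or ()
    return [name for name in required if name not in names]


# A score-like array with only some of the fields from the Nakamura reference.
score_like = np.zeros(
    3, dtype=[("onset_div", "i4"), ("pitch", "i4"), ("id", "U256")]
)

# Assumption: these are the fields the time map function needs,
# as stated in this thread.
required = ("onset_beat", "duration_beat")

print(missing_fields(score_like, required))
```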

performances: Structured array with fields ‘id’, ‘pitch’, ‘onset_tick’, ‘duration_tick’, ‘onset_sec’, ‘duration_sec’ and ‘velocity’. Good catch with the error in the documentation there! We use ticks to refer to MIDI ticks or parts, and divs to refer to score units derived from MusicXML divs. PerformedPart.from_note_array() is also limited in the sense that it expects timing information in seconds, which is converted to MIDI ticks using the standard 480 parts per quarter and 500000 microseconds per quarter (= 120 bpm). Some timing precision might get lost in the float second -> int ticks conversion and, of course, no tempo changes or the like are available. This is a bare-bones function, mainly meant for exporting MIDI files from generated note lists.
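
To illustrate the precision loss mentioned above, here is a rough sketch of a seconds to ticks conversion with 480 parts per quarter and 500000 microseconds per quarter. The function names are made up for this example, and rounding to the nearest tick is an assumption; the actual conversion in Partitura may differ in detail.

```python
PPQ = 480                           # parts (ticks) per quarter note
MICROSECONDS_PER_QUARTER = 500_000  # 120 bpm


def seconds_to_ticks(seconds: float) -> int:
    """Convert seconds to integer MIDI ticks (assumes rounding to nearest)."""
    quarters = seconds * 1_000_000 / MICROSECONDS_PER_QUARTER
    return int(round(quarters * PPQ))


def ticks_to_seconds(ticks: int) -> float:
    """Convert integer MIDI ticks back to seconds."""
    return ticks / PPQ * MICROSECONDS_PER_QUARTER / 1_000_000


onset = 0.1234
ticks = seconds_to_ticks(onset)
round_trip = ticks_to_seconds(ticks)

# With these settings one second is 960 ticks, so an onset can shift
# by up to about half a tick (roughly 0.5 ms) in the round trip.
print(ticks, round_trip, abs(round_trip - onset))
```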

You have a peculiar use case here, but it looks like it can work. Let us know if you encounter more bugs and thanks for the documentation pointer!

CPJKU / partitura

Working with Nakamura match files #349