CPJKU / partitura

A python package for handling modern staff notation of music
https://partitura.readthedocs.io
Apache License 2.0

Working with Nakamura match files #349

Open eoinroe opened 7 months ago

eoinroe commented 7 months ago

Although it is possible to load Nakamura match files using pt.load_nakamuramatch(matchfile), it is unclear how to get the returned arrays to work correctly with get_time_maps_from_alignment.

I am trying the following approach:

align, _, alignment = pt.load_nakamuramatch(matchfile)

# from importnakamura.py line 114 
perf_dtype = [
        ("onset_sec", "f4"),
        ("duration_sec", "f4"),
        ("pitch", "i4"),
        ("velocity", "i4"),
        ("channel", "i4"),
        ("id", "U256"),
]

note_array = np.array(align, dtype=perf_dtype)
performed_part = pt.performance.PerformedPart.from_note_array(note_array)

# Get score time to performance time map
_, stime_to_ptime_map = get_time_maps_from_alignment(performed_part, score_part, alignment)

However, I get the following error:

line 740, in get_time_maps_from_alignment
    score_onsets = score_note_array[match_idx[:, 0]]["onset_beat"]
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

The docs for performance.from_note_array() may need to be updated, since the structured array fields they list do not match the Partitura codebase: ‘onset_div’ and ‘duration_div’ are not used in the method definition, and 'channel' is in fact a valid field. This led to some confusion, and I ended up creating a new structured array with the 'channel' field removed.

note_array(*args, **kwargs) → ndarray
Structured array containing performance information. The fields are ‘id’, ‘pitch’, ‘onset_div’, ‘duration_div’, ‘onset_sec’, ‘duration_sec’ and ‘velocity’.

    @classmethod
    def from_note_array(
        cls,
        note_array: np.ndarray,
        id: str = None,
        part_name: str = None,
    ):
        """Create an instance of PerformedPart from a note_array.
        Note that this property does not include non-note information (i.e.
        controls such as sustain pedal).
        """
        if "id" not in note_array.dtype.names:
            n_ids = ["n{0}".format(i) for i in range(len(note_array))]
        else:
            # Check if all ids are the same
            if np.all(note_array["id"] == note_array["id"][0]):
                n_ids = ["n{0}".format(i) for i in range(len(note_array))]
            else:
                n_ids = note_array["id"]

        if "track" not in note_array.dtype.names:
            tracks = np.zeros(len(note_array), dtype=int)
        else:
            tracks = note_array["track"]

        if "channel" not in note_array.dtype.names:
            channels = np.ones(len(note_array), dtype=int)
        else:
            channels = note_array["channel"]

        notes = []
        for nid, note, track, channel in zip(n_ids, note_array, tracks, channels):
            notes.append(
                dict(
                    id=nid,
                    midi_pitch=note["pitch"],
                    note_on=note["onset_sec"],
                    note_off=note["onset_sec"] + note["duration_sec"],
                    sound_off=note["onset_sec"] + note["duration_sec"],
                    track=track,
                    channel=channel,
                    velocity=note["velocity"],
                )
            )

        return cls(id=id, part_name=part_name, notes=notes, controls=None)
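
Going only by the method body quoted above, the fields it reads can be checked before a note array is passed in. The following is a minimal sketch using NumPy alone; the helper name `check_performance_fields` and the two example notes are hypothetical and not part of Partitura.

```python
import numpy as np

# Fields that from_note_array (as quoted above) reads unconditionally.
REQUIRED_FIELDS = ("pitch", "onset_sec", "duration_sec", "velocity")
# Fields it replaces with defaults when they are absent.
OPTIONAL_FIELDS = ("id", "track", "channel")


def check_performance_fields(note_array):
    """Return (missing required fields, fields the method would ignore).

    Hypothetical helper for illustration only.
    """
    names = note_array.dtype.names or ()
    missing = [f for f in REQUIRED_FIELDS if f not in names]
    ignored = [f for f in names if f not in REQUIRED_FIELDS + OPTIONAL_FIELDS]
    return missing, ignored


perf_dtype = [
    ("onset_sec", "f4"),
    ("duration_sec", "f4"),
    ("pitch", "i4"),
    ("velocity", "i4"),
    ("channel", "i4"),
    ("id", "U256"),
]
# Two made-up notes, purely for illustration.
note_array = np.array(
    [(0.0, 0.5, 60, 64, 1, "n0"), (0.5, 0.5, 62, 70, 1, "n1")],
    dtype=perf_dtype,
)
missing, ignored = check_performance_fields(note_array)
```

With the dtype from importnakamura.py, both lists come back empty, which is consistent with 'channel' being a valid field.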

neosatrapahereje commented 6 months ago

Hi @eoinroe! Could you perhaps send the Nakamura match file that causes this issue? I believe the issue might be related to the Note ids, and how the alignment tool by Nakamura et al. uses them in their match file.

In any case, if you are using Nakamura et al.'s tool for alignment, you might be interested in trying a different library, Parangonar, which is currently the state of the art for symbolic alignment and plays better with Partitura.

eoinroe commented 6 months ago

Hi @neosatrapahereje, yes, I realised the issue is related to the note ids, as you say, since the result of match_idx = get_matched_notes(score_note_array, perf_note_array, alignment) was an empty array.
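
The link between the empty result and the original IndexError can be reproduced with NumPy alone. This is a minimal sketch, assuming the empty result is a plain 1-dimensional array (as the error message suggests); the score-side ids are made up, and the guard at the end is only a suggestion, not something Partitura does.

```python
import numpy as np

# Assumption: with no matching ids, the matched-index array is empty and 1-D.
match_idx = np.array([])

try:
    match_idx[:, 0]  # same indexing pattern as in get_time_maps_from_alignment
    error_message = None
except IndexError as exc:
    error_message = str(exc)

# Hypothetical ids: the score side uses one naming scheme, the match file another.
score_ids = {"n1", "n2", "n3"}
nakamura_ids = {"P1-1-1", "P1-1-6"}
shared_ids = score_ids & nakamura_ids  # empty set, so nothing can be matched

# A possible guard before computing the time maps.
has_matches = match_idx.ndim == 2 and len(match_idx) > 0
```

Checking `shared_ids` or `has_matches` first makes the id mismatch visible before the indexing error is raised.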

Nakamura uses this format for note ids: P1-1-1, P1-1-6 etc. This is what he says about them in the Manual for his Symbolic Music Alignment Tool: “The note ID indicates a note in the reference score MusicXML file ex_ref.xml: Px-y-z means the note is the z th note in the y th bar of part x.”
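
To make the Px-y-z convention concrete, here is a small sketch that splits such an id into part, bar and note numbers. It uses only the Python standard library; `NakamuraNoteId` and `parse_note_id` are hypothetical names, and neither Partitura nor the alignment tool is assumed to provide them.

```python
import re
from typing import NamedTuple


class NakamuraNoteId(NamedTuple):
    part: int  # x in Px-y-z
    bar: int   # y: bar number within the part
    note: int  # z: position of the note within the bar


_ID_PATTERN = re.compile(r"^P(\d+)-(\d+)-(\d+)$")


def parse_note_id(note_id: str) -> NakamuraNoteId:
    """Split an id such as 'P1-1-6' into its three integer components."""
    match = _ID_PATTERN.match(note_id)
    if match is None:
        raise ValueError(f"not a Px-y-z note id: {note_id!r}")
    part, bar, note = (int(group) for group in match.groups())
    return NakamuraNoteId(part, bar, note)


first = parse_note_id("P1-1-1")
sixth = parse_note_id("P1-1-6")
```

An id that follows another scheme raises a ValueError, which is one way to detect early that two id conventions are being mixed.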

So instead of loading the score using pt.load_musicxml() I tried to create it from the information returned by load_nakamuramatch() as you can see in the code snippet below.

# This returns the arrays in a different order from pt.load_match()
align, ref, alignment = pt.load_nakamuramatch(matchfile)

score_dtype = [
    ("onset_div", "i4"),
    ("pitch", "i4"),
    ("id", "U256"),
]

score_part_note_array = np.zeros(len(ref), dtype=score_dtype)

dt = np.dtype(score_dtype)
for field in dt.names:
    # Need to remove various fields from the structured array returned by the Nakamura match file
    score_part_note_array[field] = ref[field]

_, stime_to_ptime_map = get_time_maps_from_alignment(performed_part, score_part_note_array, alignment)

However, I don't think the Nakamura match files give you quite enough information about the score to create a Part properly. For example, pt.musicanalysis.note_array_to_score() allows you to pass it a note array with the following fields [“onset_div”, “duration_div”, “pitch”], but the score_dtype in load_nakamuramatch() only includes these fields:

score_dtype = [
        ("onset_div", "i4"),
        ("pitch", "i4"),
        ("step", "U256"),
        ("alter", "i4"),
        ("octave", "i4"),
        ("id", "U256"),
    ]

And I get the following error:

line 743, in get_time_maps_from_alignment
score_onsets = score_note_array[match_idx[:, 0]]["onset_beat"]
ValueError: no field of name onset_beat

In any case here is the Nakamura match file that is causing the issue - ex_align1_match.txt

This file was generated by running the C++ Alignment Tool found here

Thanks for the Parangonar recommendation. I discovered it last week after running into these issues and it works great. I thought it was still worth documenting the issues I ran into using Nakamura match files with Partitura for other users.

sildater commented 1 month ago

Hi @eoinroe! Thank you for the issue! First of all: partitura uses structured arrays to pass around information where a simple "list of notes" suffices. The structured arrays for performances/performedparts and scores/parts are different:

score_part_note_array = np.zeros(len(ref), dtype=score_dtype)

dt = np.dtype(score_dtype)
for field in dt.names:
    if field == "onset_beat":
        score_part_note_array[field] = ref["onset_div"].astype(float)
    elif field == "duration_beat":
        score_part_note_array[field] = np.ones_like(ref["onset_div"]).astype(float)
    else:
        score_part_note_array[field] = ref[field]


Notice that I also created a dummy field `duration_beat`, which the time map function requires.
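
The snippet above relies on a `score_dtype` that contains `onset_beat` and `duration_beat`, but that definition does not appear in this comment. The following self-contained sketch fills it in under assumptions: the target dtype is a guess based on the fields the loop handles, and `ref` is a made-up stand-in for the array returned by load_nakamuramatch(), using the field layout quoted earlier in the thread.

```python
import numpy as np

# Made-up stand-in for `ref`, with the fields quoted from load_nakamuramatch().
ref_dtype = [
    ("onset_div", "i4"),
    ("pitch", "i4"),
    ("step", "U256"),
    ("alter", "i4"),
    ("octave", "i4"),
    ("id", "U256"),
]
ref = np.array(
    [(0, 60, "C", 0, 4, "P1-1-1"), (480, 62, "D", 0, 4, "P1-1-2")],
    dtype=ref_dtype,
)

# Assumed target dtype (not shown in the comment above).
score_dtype = [
    ("onset_beat", "f4"),
    ("duration_beat", "f4"),
    ("pitch", "i4"),
    ("id", "U256"),
]
score_part_note_array = np.zeros(len(ref), dtype=score_dtype)

for field in np.dtype(score_dtype).names:
    if field == "onset_beat":
        # Onsets are copied unchanged, so they remain in divisions, not true beats.
        score_part_note_array[field] = ref["onset_div"].astype(float)
    elif field == "duration_beat":
        # Dummy duration of 1 for every note.
        score_part_note_array[field] = np.ones_like(ref["onset_div"]).astype(float)
    else:
        score_part_note_array[field] = ref[field]
```

Because `onset_beat` simply mirrors `onset_div`, the resulting score-time axis is in divisions, so any time map built from it should be read in those units.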

sildater commented 1 month ago

You have a peculiar use case here, but it looks like it can work. Let us know if you encounter more bugs and thanks for the documentation pointer!