suzuqn / ScoreTransformer

The official repository for "Score Transformer: Generating Musical Scores from Note-level Representation" (MMAsia '21)
https://score-transformer.github.io

All BPM values in MusicXML are 120 after detokenization #1

Open Chunyuan-Li opened 1 year ago

Chunyuan-Li commented 1 year ago

I noticed that in MusicXML, a `per-minute` element (inside `metronome`, within the `direction-type` element of a `direction`) is used to specify beats per minute (BPM). However, this element is not handled by either the tokenizer or the detokenizer. As a result, after detokenization all BPM values in the resulting MXL file are set to the default of 120, which is clearly problematic.
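For reference, the tempo typically appears twice in a MusicXML `direction`: in the human-readable `per-minute` element and in the `tempo` attribute of the playback-oriented `sound` element. A minimal stdlib sketch of reading both (the fragment below is illustrative, not output from this repository):

```python
import xml.etree.ElementTree as ET

# Illustrative MusicXML <direction> fragment carrying a tempo marking.
fragment = """
<direction placement="above">
  <direction-type>
    <metronome>
      <beat-unit>quarter</beat-unit>
      <per-minute>96</per-minute>
    </metronome>
  </direction-type>
  <sound tempo="96"/>
</direction>
"""

root = ET.fromstring(fragment)
# BPM as displayed to the reader (the <per-minute> element) ...
bpm_text = root.findtext("direction-type/metronome/per-minute")
# ... and as used for playback (the tempo attribute of <sound>).
bpm_sound = root.find("sound").get("tempo")
print(bpm_text, bpm_sound)  # -> 96 96
```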

suzuqn commented 11 months ago

Sorry for my super late response. Currently, the tools do not consider elements related to BPM or tempo, because these elements are not directly involved in the MIDI-to-score conversion.

If these elements are necessary for your use case, I suggest implementing custom extensions to the tokenizer and detokenizer.

Chunyuan-Li commented 11 months ago

Yes, I did try that as well. However, I found that the model struggles to predict the BPM accurately, even when I explicitly specified BPM in the MIDI tokens (derived from the MIDI tempo changes). Eventually, I removed the BPM indicators.

Additionally, I encountered another issue on the model side: when training on pairs of non-standard MIDIs (typically converted from audio) and standard MusicXML, the model's prediction performance deteriorated significantly, often leading to missing notes. Do you have any suggestions?

suzuqn commented 10 months ago

If your goal is to carry the BPM over from MIDI to score, I think it's not necessary to include it in the token conversion process. Instead, you could simply append the BPM read from the MIDI as a tempo object to the transcribed score.
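One way to sketch this suggestion without touching the token vocabulary, assuming the transcribed score is plain MusicXML text and the BPM has already been read from the source MIDI elsewhere (the `insert_tempo` function and the toy score are hypothetical, stdlib-only illustrations, not part of this repository):

```python
import xml.etree.ElementTree as ET

def insert_tempo(musicxml: str, bpm: float) -> str:
    """Prepend a metronome <direction> (plus <sound tempo>) to the first measure.

    `musicxml` stands in for a score-partwise document produced by the
    detokenizer; `bpm` is assumed to have been read from the MIDI elsewhere.
    """
    root = ET.fromstring(musicxml)
    measure = root.find("part/measure")
    direction = ET.fromstring(
        '<direction placement="above">'
        "<direction-type><metronome>"
        "<beat-unit>quarter</beat-unit>"
        f"<per-minute>{bpm:g}</per-minute>"
        "</metronome></direction-type>"
        f'<sound tempo="{bpm:g}"/>'
        "</direction>"
    )
    measure.insert(0, direction)  # place the tempo before the first <note>
    return ET.tostring(root, encoding="unicode")

# Toy one-measure score standing in for the detokenizer output.
score = (
    '<score-partwise><part id="P1"><measure number="1">'
    "<note><duration>4</duration></note>"
    "</measure></part></score-partwise>"
)
print(insert_tempo(score, 96))
```

The same idea applies if you post-process with a score library instead of raw XML, e.g. by inserting a tempo object at offset 0 of the parsed score.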

Your issue with "non-standard MIDIs" seems similar to the case of unquantized (noisy) input described in Section 6.5 of my paper. The key might lie in data augmentation of note timing and duration.
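A minimal sketch of that kind of augmentation, under the assumption that input notes are `(onset, duration, pitch)` tuples in quarter-note units; the Gaussian jitter widths below are illustrative defaults, not values from the paper:

```python
import random

def jitter_notes(notes, timing_std=0.02, duration_std=0.05, rng=None):
    """Perturb note onsets and durations to mimic unquantized (noisy) MIDI.

    `notes` is a list of (onset, duration, pitch) tuples in quarter-note
    units. Only the MIDI-side inputs are perturbed; the quantized
    score-side targets stay untouched, so the model learns to map noisy
    performances back to clean notation.
    """
    rng = rng or random.Random()
    noisy = []
    for onset, duration, pitch in notes:
        onset += rng.gauss(0.0, timing_std)
        duration += rng.gauss(0.0, duration_std)
        # Clamp so notes never start before zero or collapse to nothing.
        noisy.append((max(0.0, onset), max(0.05, duration), pitch))
    return noisy

notes = [(0.0, 1.0, 60), (1.0, 0.5, 62), (1.5, 0.5, 64)]
print(jitter_notes(notes, rng=random.Random(0)))
```

Applying a fresh perturbation each epoch (rather than once, offline) gives the model a different noisy view of the same score every pass.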

Chunyuan-Li commented 10 months ago

Indeed, it resembles the noise injection discussed in Section 6.5 of the paper. I ran a comparison using standard MIDI files, varying parameters such as the noise ratio and range (duration). I observed that as the noise increased, the model gradually started dropping notes from the MusicXML output. Additionally, non-standard MIDI files contained outright note errors, which seem to make them harder for the model to learn from. Given this situation, do you have any suggestions for effective solutions?