magenta / mt3

MT3: Multi-Task Multitrack Music Transcription
Apache License 2.0

How to solve the problem of 'same sound, different name', where notes with different pitches under different keys sound the same? #120

Closed: Chunyuan-Li closed this issue 1 year ago

Chunyuan-Li commented 1 year ago

I noticed that during model training and generation, the audio is split into multiple independent segments, and each segment is decoded with a greedy strategy. In practice, however, there are cases of 'same sound, different name' (enharmonic equivalents): notes that are written with different pitches under different keys but sound the same. Because each segment is processed independently and cannot see the others, this context is lost. The result is generated MIDI that sounds very authentic but contains many incorrect notes. Is there a way to solve this?
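
To make the 'same sound, different name' case concrete, here is a minimal sketch of how differently spelled notes can collapse to one MIDI note number. This is not MT3 code: `note_name_to_midi` is a hypothetical helper written only for illustration, and it assumes the common convention that C4 is MIDI note 60.

```python
# Semitone offsets of the natural note letters within one octave.
NATURAL_SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_name_to_midi(name: str) -> int:
    """Convert a name such as 'C#4' or 'Db4' to a MIDI note number.

    Hypothetical helper for illustration; assumes C4 == 60.
    """
    letter = name[0].upper()
    rest = name[1:]
    accidental = 0
    # Consume any run of sharps ('#') and flats ('b') after the letter.
    while rest and rest[0] in "#b":
        accidental += 1 if rest[0] == "#" else -1
        rest = rest[1:]
    octave = int(rest)
    return 12 * (octave + 1) + NATURAL_SEMITONES[letter] + accidental


# Pairs of names that are spelled differently but sound the same.
for sharp_name, flat_name in [("C#4", "Db4"), ("E#4", "F4"), ("B4", "Cb5")]:
    print(sharp_name, note_name_to_midi(sharp_name),
          flat_name, note_name_to_midi(flat_name))
```

Each pair prints the same number twice, so the note number alone does not record which spelling was intended. That choice depends on the key, which is the kind of context a single independent segment may not have.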