Yujia-Yan / Transkun

A simple yet effective Audio-to-Midi Automatic Piano Transcription system
MIT License

Note Timing Issue in Transcribed MIDI Files #13

Open haveyouwantto opened 1 year ago

haveyouwantto commented 1 year ago

Hi there. I'm reaching out to report an issue with the latest version of Transkun. Although I'm not an expert in machine learning, I've noticed that when using Transkun to transcribe audio into MIDI files, the notes appear too close together. This results in a short and abrupt sound, regardless of the input audio or MIDI synthesizer used.

As a user, I rely on Transkun to generate accurate and usable MIDI files, so I wanted to bring this problem to your attention. I greatly appreciate your efforts in developing Transkun, and I kindly request your assistance in resolving this note timing issue.

Please let me know if you need any further information from me to address this matter. Thank you for your attention, and I look forward to your prompt response.

Yujia-Yan commented 1 year ago

Hi, could you provide some examples, such as screenshots, audio, or MIDI files, so that I can get an idea of what the issue is?

haveyouwantto commented 1 year ago

Sure, here is an example MIDI file transcribed by Transkun that demonstrates this issue. It's worth noting that this effect is more pronounced in MIDI systems with higher latency. https://file.io/tqs8waOzGcAU

Yujia-Yan commented 1 year ago

Hi, I do not hear any abnormality with my piano synthesizer (Pianoteq). It may be implementation dependent. I guess you are referring to certain note boundaries being very close together? If so, this is a known treatment used in all previous papers: they extend pedaled notes to reflect the sounding duration according to the pedal, which often results in very close note boundaries. I followed the same convention to keep the results comparable. Is that what you mean?
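To illustrate the pedal treatment described above, here is a minimal sketch of one common way to do it. It is not Transkun's actual code: the `Note` class, the `extend_with_pedal` function, and the rule "extend to the pedal release or the next onset of the same pitch, whichever comes first" are assumptions made for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Note:
    """Hypothetical note record; times are in seconds."""
    pitch: int
    start: float
    end: float


def extend_with_pedal(notes, pedal_intervals):
    """Extend notes released while the sustain pedal is down.

    pedal_intervals is a list of (down_time, up_time) pairs in seconds.
    A note whose key release falls inside a pedal interval is held until
    the pedal is lifted, or until the same pitch is struck again.
    """
    ordered = sorted(notes, key=lambda n: (n.pitch, n.start))
    extended = []
    for i, note in enumerate(ordered):
        end = note.end
        for down, up in pedal_intervals:
            if down <= note.end < up:
                end = up  # the pedal keeps the string sounding
                break
        # A re-strike of the same pitch ends the previous note.
        if i + 1 < len(ordered) and ordered[i + 1].pitch == note.pitch:
            end = min(end, ordered[i + 1].start)
        extended.append(replace(note, end=end))
    return extended


notes = [Note(60, 0.0, 0.5), Note(60, 1.0, 1.2), Note(64, 0.2, 0.4)]
extended = extend_with_pedal(notes, pedal_intervals=[(0.3, 2.0)])
```

In this example the first C4 now ends exactly where the second C4 begins, which is the kind of back-to-back note-off and note-on boundary discussed in this thread.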

haveyouwantto commented 1 year ago

Yes, however, I have tried using other transcription models that do not exhibit this issue. It appears that the problem lies in the close proximity of the note off and note on events in the transcribed MIDI files.
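One way to check how close the note-off and note-on events really are is to measure the gap between consecutive notes of the same pitch. The sketch below is only an illustration: the `small_gaps` helper, the `(pitch, start, end)` tuple layout, and the 10 ms threshold are all assumptions, not part of Transkun or any synthesizer.

```python
from collections import defaultdict


def small_gaps(notes, threshold=0.01):
    """Find same-pitch note pairs separated by less than `threshold` seconds.

    notes is an iterable of (pitch, start, end) tuples with times in seconds.
    Returns a list of (pitch, off_time, next_on_time, gap) tuples.
    """
    by_pitch = defaultdict(list)
    for pitch, start, end in notes:
        by_pitch[pitch].append((start, end))

    found = []
    for pitch, spans in sorted(by_pitch.items()):
        spans.sort()
        # Compare each note with the next note of the same pitch.
        for (_, end_prev), (start_next, _) in zip(spans, spans[1:]):
            gap = start_next - end_prev
            if gap < threshold:
                found.append((pitch, end_prev, start_next, gap))
    return found


example = [(60, 0.0, 1.0), (60, 1.0, 2.0), (64, 0.0, 0.5), (64, 1.0, 1.5)]
result = small_gaps(example)
```

Running the same check on the output of two transcription models would show whether one of them really produces tighter boundaries than the other.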

Yujia-Yan commented 1 year ago

The code actually uses an unusually large MIDI tick resolution. I wonder if that causes a problem for some software. What software are you using?

haveyouwantto commented 1 year ago

I am using FluidSynth as the MIDI synthesizer.

haveyouwantto commented 1 year ago

Now I am unable to consistently reproduce this bug. There may have been a misunderstanding.

xavriley commented 9 months ago

I also suspect this is because the resolution Transkun uses is very high (32k; for comparison, Logic Pro X uses a resolution of 480). If this causes an issue, you can copy the notes to a file with a regular resolution using PrettyMIDI:

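import pretty_midi as pm

# Load the high-resolution file written by Transkun
orig_mid = pm.PrettyMIDI("path_to_transkun.mid")
# Create an empty file at a conventional resolution (ticks per quarter note)
new_mid = pm.PrettyMIDI(resolution=480)

# PrettyMIDI has no top-level `notes`; copy the notes instrument by instrument
for inst in orig_mid.instruments:
    new_inst = pm.Instrument(program=inst.program, is_drum=inst.is_drum, name=inst.name)
    new_inst.notes.extend(inst.notes)
    # Control changes carry the sustain pedal events, so keep them too
    new_inst.control_changes.extend(inst.control_changes)
    new_mid.instruments.append(new_inst)

new_mid.write("path_to_lower_res_file.mid")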

I haven't tested this code but it should work
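To put the two resolution figures mentioned above in perspective, the following sketch computes how long one MIDI tick lasts at each resolution. The 120 BPM tempo and the value 32000 (standing in for "32k") are assumptions chosen for illustration, and both helper functions are hypothetical.

```python
def seconds_per_tick(resolution, bpm=120.0):
    """Duration of one tick, where resolution is ticks per quarter note."""
    return 60.0 / (bpm * resolution)


def quantize(time_s, resolution, bpm=120.0):
    """Snap a time in seconds to the nearest tick at the given resolution."""
    tick = seconds_per_tick(resolution, bpm)
    return round(time_s / tick) * tick


coarse = seconds_per_tick(480)     # roughly 1 ms per tick
fine = seconds_per_tick(32_000)    # roughly 0.016 ms per tick
print(f"480 ticks/quarter: {coarse * 1000:.3f} ms per tick")
print(f"32000 ticks/quarter: {fine * 1000:.4f} ms per tick")
```

Under these assumptions, rewriting the file at 480 ticks per quarter note moves each event by at most about half a millisecond, so the conversion should not audibly change the timing.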