SOTA drum transcription using this model

xavriley commented 1 month ago

Hi - thanks so much for releasing the 6-stem drum separation model. I've been looking at drum transcription (audio-to-MIDI) as part of my PhD research, and after playing around with this model for a couple of days I have something that looks like SOTA on the MDB drum dataset! (82.5% F1 - a 1.5% improvement on the previous SOTA)
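For context on the F1 figure: onset-level F1 in transcription is usually computed by matching each estimated onset to a reference onset within a small time tolerance. The sketch below is only an illustration of that idea, not the evaluation used for the number above. The function name `onset_f1`, the 50 ms default tolerance, and the greedy matching are all assumptions made for the example.

```python
def onset_f1(reference, estimated, tolerance=0.05):
    """Return (precision, recall, f1) for two lists of onset times in seconds.

    Hypothetical helper: an estimated onset counts as a hit if it lies within
    `tolerance` seconds of a not-yet-matched reference onset.
    """
    ref = sorted(reference)
    est = sorted(estimated)
    i = j = matched = 0
    # Walk both sorted lists once, pairing onsets that are close enough.
    while i < len(ref) and j < len(est):
        if abs(ref[i] - est[j]) <= tolerance:
            matched += 1
            i += 1
            j += 1
        elif est[j] < ref[i]:
            j += 1  # estimated onset with no reference nearby (false positive)
        else:
            i += 1  # reference onset that was never detected (miss)
    precision = matched / len(est) if est else 0.0
    recall = matched / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

For multi-class drum transcription the same calculation would typically be run per instrument and then aggregated; how that was done here is not stated in this thread.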

Would you be interested in working on a paper together to describe the details? I've written some postprocessing code to extract the onsets and clean things up a little but I feel like you and @aufr33 deserve the credit for its success. I'm happy to take care of writing up the paper and doing evaluations.
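To give a rough idea of what onset extraction from a separated stem can look like, here is a minimal sketch. It is not the actual postprocessing code mentioned above: the function name `detect_onsets`, the RMS-envelope approach, and every parameter value are assumptions chosen for illustration.

```python
import math

def detect_onsets(samples, sample_rate, frame_size=512, threshold=0.1, min_gap=0.05):
    """Return onset times (seconds) for one separated stem.

    Hypothetical sketch: an onset is reported at the start of each frame whose
    RMS energy rises above `threshold` after a frame that was below it.
    """
    # Frame-wise RMS energy envelope over non-overlapping frames.
    envelope = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        envelope.append(math.sqrt(sum(x * x for x in frame) / frame_size))

    onsets = []
    previous = 0.0
    for index, energy in enumerate(envelope):
        time = index * frame_size / sample_rate
        rising = energy >= threshold and previous < threshold
        # Cleanup step: ignore re-triggers closer than `min_gap` seconds.
        if rising and (not onsets or time - onsets[-1] >= min_gap):
            onsets.append(time)
        previous = energy
    return onsets
```

Running something like this once per stem (kick, snare, and so on) and writing each onset as a MIDI note would give a basic audio-to-MIDI pipeline. A real system would likely need per-instrument thresholds and finer time resolution than one frame.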

jarredou commented 1 month ago

Hi, and... wow, I wouldn't have thought that this small experiment could lead to this! Amazing :)

I think it can be even better, as this was really only a test run: training could have been pushed further for somewhat better quality, and the training configuration was very lightweight. Also, the dataset I created for this model had some issues. I'm currently working on a new, larger drum separation dataset without these issues.

With that better dataset and a config focused more on quality than on being lightweight, we'll hopefully have some better model(s) in the coming months.

I'd be really glad to help you as much as I can, even though I'm not a scientist. You can contact me at jm.jarredou@gmail.com

xavriley commented 1 week ago

Sorry for the long delay! I've just sent you an email with more details

xavriley commented 1 week ago

Here's a demo in case anyone is curious:

MIDI Transcription

Original Audio: https://www.dropbox.com/scl/fi/5b5yb74c8opdcekb2okql/MusicDelta_LatinJazz_Drum_stereo.wav?rlkey=getl31mjl8h82f69aoqh50dpp&dl=1
