mir-dataset-loaders / mirdata

Python library for working with Music Information Retrieval datasets
https://mirdata.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

Update MAESTRO Dataset loader from v2.0.0 to v3.0.0 #624

Open jwmao98 opened 4 months ago

jwmao98 commented 4 months ago

Hello! I'm trying to update the existing dataset loader for MAESTRO: the dataset now has a v3.0.0 release, but the mirdata loader only supports v2.0.0.

I've managed (in a personal fork) to update the file paths and checksum values in maestro.py using info from Zenodo (https://zenodo.org/records/4734828), but I have been unable to find the checksum values for each individual audio/MIDI file, which I need in order to create a maestro_index_3.0.0.json metadata file under /indexes.

@rabitt Could you kindly look into this? Thank you so much!!

Laubeee commented 3 months ago

@jwmao98 Can't you just run md5() on every file to get those checksums? You can import it from mirdata.validate.
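
A minimal sketch of that suggestion, for illustration only. It assumes each track is a `.midi` file with a `.wav` of the same name next to it, and that the index maps a track ID to `[relative_path, checksum]` pairs under `"audio"` and `"midi"` keys; both should be checked against the existing v2.0.0 index before relying on this. The local `md5` helper is a hashlib stand-in for the function in mirdata.validate, and `build_index` and the track ID scheme are hypothetical names chosen for this example, not part of mirdata.

```python
import hashlib
import json
from pathlib import Path


def md5(file_path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory.

    Stand-in for the md5 helper mentioned above (mirdata.validate).
    """
    digest = hashlib.md5()
    with open(file_path, "rb") as fhandle:
        for chunk in iter(lambda: fhandle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_index(data_home, version="3.0.0"):
    """Walk data_home and checksum every MIDI file and its matching WAV.

    Hypothetical helper: the layout and key names are assumptions.
    """
    data_home = Path(data_home)
    tracks = {}
    for midi_path in sorted(data_home.rglob("*.midi")):
        rel_midi = midi_path.relative_to(data_home)
        # Assumed track ID: the relative path without its extension.
        track_id = rel_midi.with_suffix("").as_posix()
        entry = {"midi": [rel_midi.as_posix(), md5(midi_path)]}

        audio_path = midi_path.with_suffix(".wav")
        if audio_path.exists():
            rel_audio = audio_path.relative_to(data_home)
            entry["audio"] = [rel_audio.as_posix(), md5(audio_path)]
        else:
            # Record the gap explicitly rather than silently dropping it.
            entry["audio"] = [None, None]
        tracks[track_id] = entry
    return {"version": version, "tracks": tracks}


def write_index(index, out_path):
    """Serialize the index as indented JSON."""
    with open(out_path, "w") as fhandle:
        json.dump(index, fhandle, indent=2)
```

Calling `write_index(build_index("/path/to/maestro-v3.0.0"), "maestro_index_3.0.0.json")` on a local copy of the dataset would then produce a candidate index file; the path here is a placeholder.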