mir-dataset-loaders / mirdata

Python library for working with Music Information Retrieval datasets
https://mirdata.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

Add 4MuLA #427

Open AngeloMendes opened 3 years ago

AngeloMendes commented 3 years ago

I'd like to add 4MuLA: A Multitask, Multimodal, and Multilingual Dataset of Music Lyrics and Audio Features to mirdata.

The tiny and small versions are single files and simple to add. The full version is ~300 GB and is split into several files, but I can build the loader for it too.

magdalenafuentes commented 3 years ago

Hey Angelo, that would be great!

I took a quick look; the dataset has downloadable features, right? I think it should be straightforward to implement with the tools we have right now. I'm happy to help with the download function. I think we could customize it so users can choose which version of the dataset to download, e.g. it downloads the 2 GB or 11 GB version by default unless the user chooses something else.
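To make the version-selection idea concrete, here is a minimal sketch of how a loader could pick which archives to fetch. Everything in it is an assumption for illustration: the `REMOTES` keys, the placeholder file names, the choice of `tiny` as the default, and the `select_remotes` helper are hypothetical and not taken from mirdata or from the 4MuLA release.

```python
from typing import Dict, List, Optional

# Hypothetical version keys mapped to placeholder archive names.
# The real 4MuLA file names and hosting locations are not known here.
REMOTES: Dict[str, str] = {
    "tiny": "4mula_tiny.placeholder",
    "small": "4mula_small.placeholder",
    "full": "4mula_full.placeholder",
}

# Assumed default: fetch only the smallest version unless asked otherwise.
DEFAULT_VERSIONS: List[str] = ["tiny"]


def select_remotes(partial_download: Optional[List[str]] = None) -> Dict[str, str]:
    """Return the subset of REMOTES the user asked for.

    With no argument, fall back to DEFAULT_VERSIONS so that the large
    full version is never downloaded by accident.
    """
    keys = DEFAULT_VERSIONS if partial_download is None else partial_download
    unknown = [key for key in keys if key not in REMOTES]
    if unknown:
        raise ValueError(
            f"Unknown version(s) {unknown}; valid options are {sorted(REMOTES)}"
        )
    return {key: REMOTES[key] for key in keys}
```

A caller would use `select_remotes()` for the default and `select_remotes(["small", "full"])` to opt in to the larger versions; an unrecognized key raises `ValueError` instead of silently downloading nothing.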

Feel free to open a PR following the contributing guidelines!

AngeloMendes commented 3 years ago

Hey @magdalenafuentes! Yes, the features are already contained in the dataset and ready to use. I also think that the 2 GB and 11 GB versions will be easy to incorporate. I'm happy to help, and I'm going to work on a PR to add the full version.

PRamoneda commented 3 years ago

If you have any questions, do not hesitate to write here! Even for a quick call! We are a very active community!

magdalenafuentes commented 3 years ago

@AngeloMendes let me know if you need any help getting started with this, I'm happy to help!

AngeloMendes commented 3 years ago

Hey @magdalenafuentes! I've forked the repo, and I'll start developing the loader. I will keep you updated, and I will not hesitate to ask you for help. :-)

AngeloMendes commented 3 years ago

Hi @magdalenafuentes and @PRamoneda! I opened PR #480, which adds the tiny version of my dataset. I'll work on adding the small version next, but I think your reviews will help with that. :-)