Using musdb for my own dataset

sigsep / sigsep-mus-db

Python parser and tools for MUSDB18 Music Separation Dataset

https://sigsep.github.io/sigsep-mus-db/

MIT License


Using musdb for my own dataset #58

Closed: imflash217 closed this issue 5 years ago

imflash217 commented 5 years ago

Hi, I have trained a neural network using musdb. Now I want to do transfer learning on some more data that I have collected separately. Is it possible to use the musdb API on my own dataset if I organize all my tracks in the same layout as the musdb database?

Please help. Thanks

faroit commented 5 years ago

Hi, thanks for your interest.

I have two things to tell you about this issue:

A) Yes, you can use musdb for your own tracks, as long as you do not use the new (built-in) train/validation split.

B) If you use PyTorch, stay tuned: we will release https://open.unmix.app/ very soon, within the next two weeks. It includes flexible data loaders that can combine two or more datasets.

imflash217 commented 5 years ago

Hi @faroit, Thank you for the info. I use PyTorch :smile: and I'm looking forward to the unmix.app.

I see that musdb needs the STEMS file format (.stems.mp4), but once we create a musdb.DB() it separates all the stems into different files. So, is it possible (and how) to skip creating the .stems.mp4 files and instead keep the original stems in the train and test directories? How would this work?

Thanks

faroit commented 5 years ago

> I see that musdb needs STEMS file format (.stems.mp4) but once we create a musdb.DB() it separates all the stems into different files

No, it just parses the stems and reads from them. There is a conversion tool to convert to separated wav files if you want this. So you can just create a folder called your_dataset/train/track1 and musdb will find it, if you use the --root your_dataset.

I would highly recommend using wav or flac for training a DNN, since the STEMS format is quite slow to decode.

In any case, if you use pytorch you really want to wait for our release :-)
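The folder layout described above can be sketched with the standard library alone. This is only an illustration, not part of musdb: the helper names (write_silence, make_track) are hypothetical, the wav files are silent placeholders, and the stem file names are assumed to match those of the wav version of MUSDB18 (mixture, drums, bass, other, vocals).

```python
import tempfile
import wave
from pathlib import Path

# Assumption: these are the per-track file names of the wav version of
# MUSDB18, and therefore what musdb looks for when reading wav files.
STEM_FILES = ("mixture.wav", "drums.wav", "bass.wav", "other.wav", "vocals.wav")


def write_silence(path, seconds=1, rate=44100, channels=2):
    """Write a silent 16-bit PCM wav file as a placeholder for real audio."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (2 * channels * rate * seconds))


def make_track(root, subset, track_name):
    """Create root/subset/track_name/ holding one wav file per stem."""
    track_dir = Path(root) / subset / track_name
    track_dir.mkdir(parents=True, exist_ok=True)
    for name in STEM_FILES:
        write_silence(track_dir / name)
    return track_dir


# Build a tiny example dataset in a temporary directory.
root = Path(tempfile.mkdtemp()) / "your_dataset"
make_track(root, "train", "track1")
make_track(root, "test", "track2")

# List the resulting files relative to the dataset root.
layout = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.wav"))
print("\n".join(layout))
```

With real recordings, each placeholder would be replaced by the corresponding stem, and every stem of a track should have the same length and sample rate as its mixture.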

imflash217 commented 5 years ago

Thanks @faroit . Will wait for the release.

> no, it just parses the stems and reads from them. There is a conversion tool to convert to separated wav files if you want this. So you can just create a folder called your_dataset/train/track1 and musdb will find it, if you use the --root your_dataset.

I have all my files in wav format, so as you said, I should be able to create a musdb.DB(root_dir="path/to/my/dir/with/all/wav-files/") object? How do I do it? I tried to create a musdb.DB() object as above, but it doesn't read the wav files when I call musdb.DB().load_mus_tracks().

Thanks

faroit commented 5 years ago

Just additionally pass the is_wav=True flag. See the documentation here: https://sigsep.github.io/sigsep-mus-db/#musdb.DB
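A minimal sketch of that suggestion follows. It assumes musdb 0.3 or newer, where the dataset path keyword is root; the 0.2.x API used earlier in this thread spells it root_dir and lists tracks with load_mus_tracks(). The helpers missing_subsets and load_wav_tracks are hypothetical names introduced only for this example, and the folder check is an extra convenience rather than something musdb requires.

```python
from pathlib import Path


def missing_subsets(root, subsets):
    """Return the subset folders (e.g. "train") that do not exist under root."""
    return [s for s in subsets if not (Path(root) / s).is_dir()]


def load_wav_tracks(root, subsets=("train",)):
    """Open a wav-based dataset with musdb and return its list of tracks."""
    missing = missing_subsets(root, subsets)
    if missing:
        raise FileNotFoundError(f"no {missing} folder(s) under {root}")

    import musdb  # third-party package: pip install musdb

    # Assumption: musdb 0.3 or newer, where the keyword is `root`.
    # The 0.2.x API used earlier in this thread spells it `root_dir`
    # and lists tracks with load_mus_tracks() instead of `.tracks`.
    mus = musdb.DB(root=str(root), is_wav=True, subsets=list(subsets))
    return mus.tracks
```

Calling load_wav_tracks("your_dataset") should then return one track object per folder under your_dataset/train, with attributes such as track.name and track.audio.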

imflash217 commented 5 years ago

Thanks @faroit :smiley: