sigsep / sigsep-mus-db

Python parser and tools for MUSDB18 Music Separation Dataset
https://sigsep.github.io/sigsep-mus-db/
MIT License

Validation tracks include test tracks when `subsets` is not specified #60

Closed: faroit closed this issue 3 years ago

faroit commented 4 years ago

In the readme, we state that:

musdb.DB(subsets='train', split='valid')

returns the 14 validation tracks.

However, when users do not specify subsets:

musdb.DB(split='valid')

the default (`['train', 'test']`) is used and 64 tracks are returned: the 50 test tracks plus the 14 validation tracks. While the output is technically correct, I think for this case we should:

This issue might be critical: some users may already have been using musdb this way, in which case their validation set included the test tracks.
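To make the 50+14 count concrete, here is a toy model of the selection behaviour described above. It is only a sketch and not musdb's actual implementation: `select_tracks`, the placeholder track names, and the choice of which 14 tracks are held out are all assumptions for illustration. The only things it relies on are the counts (100 tracks in the train folder, 14 of them used for validation, 50 in the test folder) and the assumption that `split` filters the train subset only.

```python
# Toy model only; NOT musdb's real code. All names here are hypothetical.
TRAIN_FOLDER = [f"train_{i:03d}" for i in range(100)]  # placeholder names
TEST_FOLDER = [f"test_{i:03d}" for i in range(50)]     # placeholder names
VALID_NAMES = set(TRAIN_FOLDER[:14])  # arbitrary stand-in for the 14 validation tracks


def select_tracks(subsets=None, split=None):
    """Return the track names selected by `subsets` and `split`."""
    if subsets is None:
        subsets = ["train", "test"]  # the default mentioned in this issue
    elif isinstance(subsets, str):
        subsets = [subsets]

    tracks = []
    for subset in subsets:
        if subset == "train":
            for name in TRAIN_FOLDER:
                # `split` only filters tracks from the train folder
                if split == "valid" and name not in VALID_NAMES:
                    continue
                if split == "train" and name in VALID_NAMES:
                    continue
                tracks.append(name)
        elif subset == "test":
            # assumed: `split` has no effect on test, so every test track is added
            tracks.extend(TEST_FOLDER)
    return tracks


print(len(select_tracks(subsets="train", split="valid")))
print(len(select_tracks(split="valid")))
```

In this model the first call yields only the validation tracks, while the second also picks up every test track because the default `subsets` includes `'test'`.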

@aliutkus @TE-StefanUhlich do you have some opinion how to address this?

faroit commented 3 years ago

ping @TE-StefanUhlich

StefanUhlich-sony commented 3 years ago

Actually, it would be more intuitive if there were only a `subsets` argument that could take three different values:

But this is also somewhat confusing, as the original MUSDB18 consists of only two subsets (train and test). Hence, I would vote for raising a ValueError and stating that `split` can only be set if `subsets` is `'train'`.
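A minimal sketch of that kind of check, assuming it runs before any tracks are loaded. The helper name `check_subsets_and_split` and the error message are hypothetical, and the code merged into musdb may differ in both structure and exception type.

```python
def check_subsets_and_split(subsets=None, split=None):
    """Hypothetical helper: normalise `subsets` and reject an ambiguous `split`.

    Returns `subsets` as a list. Raises ValueError if `split` is given while
    anything other than the train subset alone is selected.
    """
    if subsets is None:
        subsets = ["train", "test"]  # the default mentioned in this issue
    elif isinstance(subsets, str):
        subsets = [subsets]
    else:
        subsets = list(subsets)

    if split is not None and subsets != ["train"]:
        raise ValueError(
            "`split` can only be set when `subsets='train'`, "
            f"got subsets={subsets!r} and split={split!r}"
        )
    return subsets


# Accepted: validation tracks are requested from the train subset only.
print(check_subsets_and_split(subsets="train", split="valid"))

# Rejected: `subsets` is left at its default, which includes 'test'.
try:
    check_subsets_and_split(split="valid")
except ValueError as err:
    print("rejected:", err)
```

Failing loudly here means a call like `musdb.DB(split='valid')` can no longer silently mix test tracks into a validation set.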

faroit commented 3 years ago

@TE-StefanUhlich okay, I also like this one best. will do that. thanks

faroit commented 3 years ago

Fixed in #75