facebookresearch / demucs

Code for the paper Hybrid Spectrogram and Waveform Source Separation
MIT License

Training and evaluating a vocal model only #221

Open Anjok07 opened 2 years ago

Anjok07 commented 2 years ago

❓ Questions

I'd love to train a model from scratch using our own dataset. It's already formatted appropriately, and we've already trained an incredibly strong ONNX vocal model using the KUIELAB-MDX-Net code.

Is it possible to train an individual stem model in Demucs V3?

adefossez commented 2 years ago

So it is possible, but with a bit of hacking. Comment the following line so that the musdb dataset is not loaded: https://github.com/facebookresearch/demucs/blob/main/demucs/train.py#L80

Create a folder with train and valid subfolders and in each put one folder per track, with three files: vocals.wav, non_vocals.wav, mixture.wav.
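The layout above is easy to get subtly wrong (a missing stem in one track folder will fail at load time). Here is a minimal sketch of a checker for that structure; the root path and function name are hypothetical, and it only verifies the three-file convention described above:

```python
import os

# Files expected in every track folder for a vocals / non_vocals split
# (per the layout described above).
EXPECTED = {"vocals.wav", "non_vocals.wav", "mixture.wav"}

def check_dataset(root):
    """Return a list of problems found under root/{train,valid}/<track>/."""
    problems = []
    for split in ("train", "valid"):
        split_dir = os.path.join(root, split)
        if not os.path.isdir(split_dir):
            problems.append(f"missing split folder: {split_dir}")
            continue
        for track in sorted(os.listdir(split_dir)):
            track_dir = os.path.join(split_dir, track)
            if not os.path.isdir(track_dir):
                continue  # stray files at the split level are ignored
            missing = EXPECTED - set(os.listdir(track_dir))
            if missing:
                problems.append(f"{track_dir}: missing {sorted(missing)}")
    return problems
```

Running `check_dataset("/path/to/my_dataset")` before training should return an empty list if every track folder is complete.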

Then create a file in conf/dset/my_dset.yaml and change the paths (like https://github.com/facebookresearch/demucs/blob/main/conf/dset/extra44.yaml), and add a line to have something like:

dset:
  wav: ...
  ...
  sources: ['vocals', 'non_vocals']
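Putting those pieces together, a filled-in version of the snippet might look like the fragment below. The path and the extra keys are illustrative guesses; copy the actual keys from conf/dset/extra44.yaml:

```yaml
# Hypothetical conf/dset/my_dset.yaml -- values are placeholders,
# mirror the real keys from conf/dset/extra44.yaml.
dset:
  wav: /path/to/my_dataset        # folder containing train/ and valid/
  samplerate: 44100               # assumed; match your wav files
  channels: 2                     # assumed; match your wav files
  sources: ['vocals', 'non_vocals']
```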

You might want to deactivate this if statement to skip evaluation on musdb: https://github.com/facebookresearch/demucs/blob/main/demucs/solver.py#L266

Then finally, you can run an experiment with dora run -d dset=my_dset.

Anjok07 commented 2 years ago

Thank you! I got it running now! I'll let you know if I run into any issues 😀👍

K3nn3th2 commented 2 years ago

> Thank you! I got it running now! I'll let you know if I run into any issues 😀👍

Hey @Anjok07, did you manage to make your model more robust? Also, is it a 2-stem model? As in, does it yield an accompaniment as well?

I have tried training the Spleeter model with some 2-stem data, but I didn't seem to get better results, partly even worse. I realised it boils down to what you train it with: if the data isn't clean, you can't expect clean results.

Edit: I see your model is also featured in the Google Colab list. Nice!

johndpope commented 2 years ago

Sorry - I'm stuck on commenting out line 80

# train_set, valid_set = get_musdb_wav_datasets(args.dset)

This defines the train_set / valid_set variables, which are used in subsequent lines. Are you saying these variables should be set to null, or something else?

I'm looking to use 5 stems - all of the dset parameters in conf are set to 2 channels.

I mentioned this in the other ticket referenced above - but I’ve been using spleeter for years and the training config is very straightforward - some sample config files

https://github.com/deezer/spleeter/blob/master/configs/musdb_config.json

Just point to a training csv https://github.com/deezer/spleeter/blob/master/configs/musdb_train.csv

And corresponding validation csv https://github.com/deezer/spleeter/blob/master/configs/musdb_validation.csv

Gonna have to revert to Spleeter instead. Please provide simple custom training examples in the repo that don't require hacks.

mr-segfault commented 2 years ago

> So it is possible, but with a bit of hacking. Comment the following line so that the musdb dataset is not loaded: https://github.com/facebookresearch/demucs/blob/main/demucs/train.py#L80

I can generate my own songs (full instrumentals, N-numbers of channels) rather quickly, if I had more than two types of stems, would I modify this: sources: ['vocals', 'non_vocals'] to be something similar to sources: ['vocals', 'guitar', 'piano', 'flute', 'drum', 'bass'] etc?

Anjok07 commented 2 years ago

> So it is possible, but with a bit of hacking. Comment the following line so that the musdb dataset is not loaded: https://github.com/facebookresearch/demucs/blob/main/demucs/train.py#L80
>
> Create a folder with train and valid subfolders and in each put one folder per track, with three files: vocals.wav, non_vocals.wav, mixture.wav.
>
> Then create a file in conf/dset/my_dset.yaml and change the paths (like https://github.com/facebookresearch/demucs/blob/main/conf/dset/extra44.yaml), and add a line to have something like:
>
> dset:
>   wav: ...
>   ...
>   sources: ['vocals', 'non_vocals']
>
> You might want to deactivate this if statement to skip evaluation on musdb: https://github.com/facebookresearch/demucs/blob/main/demucs/solver.py#L266
>
> Then finally, you can run an experiment with dora run -d dset=my_dset.

Commenting out line 80 led to variable errors.

Here's what I did instead:

Change line 80 in https://github.com/facebookresearch/demucs/blob/main/demucs/train.py#L80 from

train_set, valid_set = get_musdb_wav_datasets(args.dset)

To

train_set, valid_set = get_wav_datasets(args.dset)

This way it sees my referenced dataset as the main one.

Anjok07 commented 2 years ago

@K3nn3th2 After much trial and error, I'm finally getting robust results using our big dataset! I'll post updates when training has been completed.

jie-chen commented 8 months ago

> So it is possible, but with a bit of hacking. Comment the following line so that the musdb dataset is not loaded: https://github.com/facebookresearch/demucs/blob/main/demucs/train.py#L80
>
> Create a folder with train and valid subfolders and in each put one folder per track, with three files: vocals.wav, non_vocals.wav, mixture.wav.
>
> Then create a file in conf/dset/my_dset.yaml and change the paths (like https://github.com/facebookresearch/demucs/blob/main/conf/dset/extra44.yaml), and add a line to have something like:
>
> dset:
>   wav: ...
>   ...
>   sources: ['vocals', 'non_vocals']
>
> You might want to deactivate this if statement to skip evaluation on musdb: https://github.com/facebookresearch/demucs/blob/main/demucs/solver.py#L266
>
> Then finally, you can run an experiment with dora run -d dset=my_dset.

It seems the training Wavset uses the sum of the sources instead of the mixture file, as shown here: https://github.com/facebookresearch/demucs/blob/main/demucs/solver.py#L308 . Does it make a difference if the mixture file is used instead?
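The distinction only matters when mixture.wav is not exactly the sum of the stems. A toy sketch with made-up sample values (binary-exact fractions, so float comparisons are exact) illustrates both cases:

```python
# Toy stems standing in for vocals / non_vocals (hypothetical sample values).
vocals = [0.25, -0.5, 0.75, -0.25]
non_vocals = [0.25, 0.25, -0.5, 0.5]

# What the solver trains on: the element-wise sum of the source signals.
mix_from_sources = [v + n for v, n in zip(vocals, non_vocals)]

# If mixture.wav was rendered as the plain sum, it carries no extra information:
mixture_file = [0.5, -0.25, 0.25, 0.25]  # the same sum, written out by a DAW
print(mix_from_sources == mixture_file)  # True

# But processing applied only to the export (e.g. a limiter) breaks the equality:
limited = [max(-0.3, min(0.3, x)) for x in mixture_file]
print(mix_from_sources == limited)  # False
```

So if your mixtures are plain sums of the stems, using the file should change nothing; if they were mastered separately, the model would be trained on a mixture it never actually sees at separation time.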

person-29 commented 2 months ago

> So it is possible, but with a bit of hacking. Comment the following line so that the musdb dataset is not loaded: https://github.com/facebookresearch/demucs/blob/main/demucs/train.py#L80
>
> Create a folder with train and valid subfolders and in each put one folder per track, with three files: vocals.wav, non_vocals.wav, mixture.wav.
>
> Then create a file in conf/dset/my_dset.yaml and change the paths (like https://github.com/facebookresearch/demucs/blob/main/conf/dset/extra44.yaml), and add a line to have something like:
>
> dset:
>   wav: ...
>   ...
>   sources: ['vocals', 'non_vocals']
>
> You might want to deactivate this if statement to skip evaluation on musdb: https://github.com/facebookresearch/demucs/blob/main/demucs/solver.py#L266
>
> Then finally, you can run an experiment with dora run -d dset=my_dset.

Is it necessary for the sources to be called 'vocals' and 'non_vocals' or would any other name work?

CarlGao4 commented 2 months ago

You can use any name except "mixture".