Jungjee / RawNet

Official repository for RawNet, RawNet2, and RawNet3
MIT License

Document the DB directory structure #7

Closed (turian closed 4 years ago)

turian commented 4 years ago

For people who don't want to use VoxCeleb + VoxCeleb2, it is hard to figure out what the directory structure should be for DB. Could you please document it?

Or, even nicer: if there were a simple-to-download audio dataset (e.g. from torchaudio) that the script would lay out in the right way, people could immediately try your repo and see if it works on their GPU.

turian commented 4 years ago

Duplicate of https://github.com/Jungjee/RawNet/issues/5

turian commented 4 years ago

I am re-opening this because now train.py wants DB/wav, DB/eval_wav, and DB/wav

Could you please explain what the directory structure should be? I have different WAV files I want to train on.

Jungjee commented 4 years ago

By default, the script reads every file with a "wav" extension under 'DB/VoxCeleb2/wav' for training, and every such file under 'DB/VoxCeleb1/eval_wav/' for speaker embedding extraction in the test phase.

Put your dataset under the aforementioned directories, or pass "DB", "DB_vox2", and "dev_wav" as arguments when running the scripts :) In the code, the PyTorch Dataset uses args.DB_vox2+args.dev_wav as 'self.base_dir' and reads utterances from there.
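A minimal sketch of the path handling described above, assuming the two arguments are plain strings that get concatenated into the base directory. The helper name `list_wavs` and the argument values shown are illustrative assumptions, not the repository's actual code or defaults.

```python
import os
from argparse import Namespace


def list_wavs(base_dir):
    """Collect every .wav file under base_dir (hypothetical helper).

    Paths are returned relative to base_dir, with '/' separators, so the
    first path component is the directory directly below base_dir.
    """
    utts = []
    for root, _dirs, files in os.walk(base_dir):
        for name in files:
            # Case-insensitive match on the extension (an assumption).
            if name.lower().endswith(".wav"):
                full = os.path.join(root, name)
                rel = os.path.relpath(full, base_dir)
                utts.append(rel.replace(os.sep, "/"))
    return sorted(utts)


# Assumed argument values, for illustration only.
args = Namespace(DB_vox2="DB/VoxCeleb2/", dev_wav="wav/")

# Plain string concatenation, as in args.DB_vox2 + args.dev_wav.
base_dir = args.DB_vox2 + args.dev_wav  # "DB/VoxCeleb2/wav/"
```

Because the pieces are joined by string concatenation in this sketch, a missing trailing slash or a repeated "wav/" component in either argument would change the resulting path, so it is worth printing `base_dir` before training.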

turian commented 4 years ago

So there are still some details that I am missing, and unfortunately I am having difficulty understanding the code.

It's more than just putting all WAV files under DB/VoxCeleb2/wav for training and DB/VoxCeleb1/eval_wav/ for evaluation; it seems like you need one subdirectory per speaker? (I'm not doing speaker identification.) What goes in DB/VoxCeleb1/veri_test.txt and DB/VoxCeleb1/val_trial.txt?

Here's what I tried, just to create a rough directory structure and get the code running:

!mkdir -p DB/VoxCeleb2/wav/everyone
!mkdir -p DB/VoxCeleb1/eval_wav/everyone
!cp -R train-small/* DB/VoxCeleb2/wav/everyone
!cp -R train-small/* DB/VoxCeleb1/eval_wav/everyone
!find DB/VoxCeleb1/eval_wav/ -name \*.wav > DB/VoxCeleb1/val_trial.txt
!find DB/VoxCeleb1/eval_wav/ -name \*.wav > DB/VoxCeleb1/veri_test.txt

But that still doesn't work; it crashes with:

  File "/usr/local/lib/python3.6/dist-packages/soundfile.py", line 1357, in _error_check
    raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
RuntimeError: Error opening 'DB/VoxCeleb2/wav/wav/everyone/soundsofsocrates - leads - lead-e VARIATION-002-019.wav': System error.

So it's constructing the directory paths wrong somehow.

I set up a little Google Colab notebook.

Or if you could run:

find DB

to show your directory structure and also the contents of DB/VoxCeleb1/veri_test.txt and DB/VoxCeleb1/val_trial.txt that would be great.

turian commented 4 years ago

What would be most helpful is a simple Google Colab notebook that demonstrates how to set up the data and run the code :) From my reading, RawNet2 is very cool work. I am just struggling with the code because I want to try it on a different dataset, and also because VoxCeleb is hard to get and its directory structure is not well documented, either on their webpage or in this code :(

Jungjee commented 4 years ago

I have added file trees. You can discard val_trial.txt, as it is unofficial (I just used it for model validation), and veri_test.txt is in the "trials" folder.

Jungjee commented 4 years ago

I agree that adding documentation and making the code work with other datasets will definitely make it more accessible to researchers from other domains :) However, I'm not sure I can do it right now. I'll update the code ASAP :)

Jungjee commented 4 years ago

In the meantime, I suppose you can train on your own dataset by revising get_utt_list in utils.py, the Dataset class's self.base_dir, and y = self.labels[ID.split('/')[0]] (line 52 of dataloaders.py) :)
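The label lookup quoted above, `ID.split('/')[0]`, suggests that the first path component of each utterance ID is treated as the speaker (class) name. The sketch below illustrates that idea under this assumption; `build_labels` and `label_for` are hypothetical names, not functions from the repository.

```python
def build_labels(utt_ids):
    """Map each top-level directory name to an integer class index.

    Assumes utterance IDs look like '<speaker>/<...>/<file>.wav',
    relative to the dataset's base directory.
    """
    speakers = sorted({utt_id.split("/")[0] for utt_id in utt_ids})
    return {spk: idx for idx, spk in enumerate(speakers)}


def label_for(utt_id, labels):
    """Mirror the lookup y = labels[ID.split('/')[0]]."""
    return labels[utt_id.split("/")[0]]


# Illustrative utterance IDs (made up for this example).
utt_ids = ["spkA/session1/001.wav", "spkB/002.wav", "spkA/003.wav"]
labels = build_labels(utt_ids)
y = label_for("spkB/002.wav", labels)
```

Under this reading, placing every file in a single subdirectory such as "everyone" would give all utterances the same label, so a dataset with other classes would need one top-level directory per class, or a revised lookup.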