Closed WoodieDudy closed 1 month ago
I made a guide on how to prepare the dataset and train the model:
https://github.com/WoodieDudy/open-source-stuff/tree/main/DeepFilerNet/prepare_dataset
Hello @Rikorose,
I'm working on reproducing the training results for DeepFilterNet3 and have some questions about the dataset configuration in the `dataset.cfg` file. The paper (https://arxiv.org/abs/2305.08227) mentions the use of the DNS4 dataset, along with oversampled PTDB and VCTK, for training. I found the `scripts/download_process_dns4.sh` script for downloading DNS4 and, after running it, ended up with a set of .hdf5 files, so I need to fill `dataset.cfg` with them.

From my understanding, all files with the suffix "_TRAIN" should be placed in the "train" section of `dataset.cfg`. Should the `SLR26_TRAIN.hdf5` and `SLR28_TRAIN.hdf5` files be placed in the RIR category? Additionally, there is only one "_VALID" file; is this expected, and should it be placed in the "noise" section? Where can I find "speech" and "rir" files for the "valid" key in `dataset.cfg`?

I could not find scripts for downloading PTDB and VCTK, so I downloaded them manually: PTDB, VCTK.
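To make the question about `dataset.cfg` concrete, here is my current guess at its structure — I'm assuming each entry is a `[hdf5_file, sampling_factor]` pair grouped under "train"/"valid"/"test" keys, and every file name except the two SLR ones is a placeholder I made up:

```json
{
  "train": [
    ["DNS4_SPEECH_TRAIN.hdf5", 1.0],
    ["VCTK_SPEECH_TRAIN.hdf5", 2.0],
    ["DNS4_NOISE_TRAIN.hdf5", 1.0],
    ["SLR26_TRAIN.hdf5", 1.0],
    ["SLR28_TRAIN.hdf5", 1.0]
  ],
  "valid": [],
  "test": []
}
```

If that schema is right, I assume the oversampling of PTDB/VCTK mentioned in the paper would be expressed through a sampling factor greater than 1.0 — but please correct me if the format is different.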
The PTDB dataset includes, alongside the microphone recordings, laryngograph signal files that sound like noise; should these be excluded, so that only the clean speech recordings are used?
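In case it helps to show what I mean, this is how I'm currently separating the two signal types — assuming the `mic_*.wav` / `lar_*.wav` naming scheme I see in my PTDB download:

```python
from pathlib import Path


def collect_ptdb_speech(root: str) -> list[Path]:
    """Collect only the microphone recordings from a PTDB download.

    Skips the laryngograph files (lar_*.wav), keeping mic_*.wav only.
    Assumes the mic_/lar_ file-name prefixes; adjust if your copy differs.
    """
    wavs = sorted(Path(root).rglob("*.wav"))
    # Keep microphone signals; drop laryngograph signals.
    return [p for p in wavs if p.name.startswith("mic_")]
```

My plan was to then feed only this list into the speech .hdf5 — is that the right approach?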
The paper also mentions a VCTK/DEMAND test set. I found that it contains both noisy and clean audio. How should these be passed to the `prepare_data.py` script to produce the .hdf5 files for `dataset.cfg`?

Thank you for your guidance.
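For context, this is my current attempt at splitting the VCTK/DEMAND test set before converting it — I'm assuming the `clean_testset_wav` / `noisy_testset_wav` directory names from my download, and whether both halves (or only the clean one) should be converted is exactly what I'm unsure about:

```python
from pathlib import Path


def split_vctk_demand(root: str) -> tuple[list[Path], list[Path]]:
    """Return (clean, noisy) wav lists from a VCTK/DEMAND test-set download.

    Assumes the clean_testset_wav / noisy_testset_wav layout;
    adjust the directory names if your copy is organized differently.
    """
    clean = sorted((Path(root) / "clean_testset_wav").glob("*.wav"))
    noisy = sorted((Path(root) / "noisy_testset_wav").glob("*.wav"))
    return clean, noisy
```

If I read the repo correctly, each list would then be converted with something like `python df/scripts/prepare_data.py --sr 48000 speech clean_files.txt TEST_SET_SPEECH.hdf5` — but I may be misreading the intended usage for a paired noisy/clean test set.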