Rikorose / DeepFilterNet

Noise suppression using deep filtering
https://huggingface.co/spaces/hshr/DeepFilterNet2

DF dataset error: NoDatasetFoundError("Could not found any noise datasets.") #518

Closed — viki347 closed this 5 days ago

viki347 commented 4 months ago

Thanks for your awesome work. Apologies for being a novice in the field of deep learning; I'm not yet comfortable adapting the code, so I'm following the README instructions and mimicking them step by step. This is my first time reproducing the code, and I've run into an issue related to the dataset.

I've configured my environment according to the README, pip-installed the required Python packages, run prepare_data.py, and successfully generated the .hdf5 files, so my environment should be fine. The issue appears when running train.py. To verify the installation first, the help command runs without problems:

$ python df/train.py -h
usage: train.py [-h] [--host-batchsize-config HOST_BATCHSIZE_CONFIG] [--no-resume] [--log-level LOG_LEVEL] [--debug] [--no-debug] data_config_file data_dir base_dir

positional arguments:
  data_config_file      Path to a dataset config file.
  data_dir              Path to the dataset directory containing .hdf5 files.
  base_dir              Directory to store logs, summaries, checkpoints, etc.

options:
  -h, --help            show this help message and exit
  --host-batchsize-config HOST_BATCHSIZE_CONFIG, -b HOST_BATCHSIZE_CONFIG
                        Path to a host specific batch size config.
  --no-resume
  --log-level LOG_LEVEL
                        Logger verbosity. Can be one of (trace, debug, info, error, none)
  --debug
  --no-debug

So the script itself starts fine.

However, I run into a problem when running python df/train.py train-data/dataset.cfg train-data/testDataset/ train-data/test_base_dir/.

My dataset.cfg looks like this (a quick existence check for these entries follows the config):

{
  "train": [
    [
      "TRAIN_SET_SPEECH.hdf5",
      1.0
    ],
    [
      "TRAIN_SET_NOISE.hdf5",
      1.0
    ],
    [
      "TRAIN_SET_RIR.hdf5",
      1.0
    ]
  ],
  "valid": [
    [
      "VALID_SET_SPEECH.hdf5",
      1.0
    ],
    [
      "VALID_SET_NOISE.hdf5",
      1.0
    ],
    [
      "VALID_SET_RIR.hdf5",
      1.0
    ]
  ],
  "test": [
    [
      "TEST_SET_SPEECH.hdf5",
      1.0
    ],
    [
      "TEST_SET_NOISE.hdf5",
      1.0
    ],
    [
      "TEST_SET_RIR.hdf5",
      1.0
    ]
  ]
}
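
As a quick sanity check of the config, here is a small sketch of my own (not part of DeepFilterNet) that just verifies every file listed in dataset.cfg is present in the data directory passed to train.py; the paths are the ones from my setup above:

# My own sanity-check sketch (not part of DeepFilterNet): verify that every
# file listed in dataset.cfg actually exists inside the data directory that
# is passed to train.py.
import json
from pathlib import Path

cfg_path = Path("train-data/dataset.cfg")   # config path from my setup
data_dir = Path("train-data/testDataset")   # data_dir passed to train.py

cfg = json.loads(cfg_path.read_text())
for split, entries in cfg.items():          # "train", "valid", "test"
    for fname, factor in entries:           # [file name, sampling factor]
        f = data_dir / fname
        status = "ok" if f.is_file() else "MISSING"
        print(f"{split:5s} {fname:25s} factor={factor} -> {status}")

All files are reported as "ok" for me, so the config and the directory seem to match.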

Here are the path and contents of my dataset:

[train-data/testDataset/]
   -TEST_SET_NOISE.hdf5
   -TEST_SET_RIR.hdf5
   -TEST_SET_SPEECH.hdf5
   -TRAIN_SET_NOISE.hdf5
   -TRAIN_SET_RIR.hdf5
   -TRAIN_SET_SPEECH.hdf5
   -VALID_SET_NOISE.hdf5
   -VALID_SET_RIR.hdf5
   -VALID_SET_SPEECH.hdf5
$ python df/train.py train-data/dataset.cfg train-data/testDataset/ train-data/test_base_dir/
2024-02-29 15:07:38 | INFO     | DF | Running on torch 2.2.0+cu121
2024-02-29 15:07:38 | INFO     | DF | Running on host workstation
Unknown option: -C
usage: git [--version] [--help] [-c name=value]
           [--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]
           [-p|--paginate|--no-pager] [--no-replace-objects] [--bare]
           [--git-dir=<path>] [--work-tree=<path>] [--namespace=<name>]
           <command> [<args>]
2024-02-29 15:07:38 | INFO     | DF | Loading model settings of test_base_dir
2024-02-29 15:07:38 | INFO     | DF | Running on device cpu
2024-02-29 15:07:38 | INFO     | DF | Initializing model `deepfilternet3`
2024-02-29 15:07:38 | INFO     | DF | Initializing dataloader with data directory train-data/testDataset/
train-data/testDataset/ /n 111111
2024-02-29 15:07:38 | ERROR    | DF | An error has been caught in function '<module>', process 'MainProcess' (53886), thread 'MainThread' (22468762900288):
Traceback (most recent call last):

> File "/home/Project/DeepFilterNet-main/DeepFilterNet/df/train.py", line 625, in <module>
    main()
    └ <function main at 0x146e99f7eb00>

  File "/home/Project/DeepFilterNet-main/DeepFilterNet/df/train.py", line 139, in main
    dataloader = DataLoader(
                 └ <class 'libdfdata.torch_dataloader.PytorchDataLoader'>

  File "/home/Project/DeepFilterNet-main/pyDF-data/libdfdata/torch_dataloader.py", line 105, in __init__
    self.loader = _FdDataLoader(
    │             └ <class 'builtins._FdDataLoader'>
    └ <libdfdata.torch_dataloader.PytorchDataLoader object at 0x146e996f16c0>

RuntimeError: DF dataset error: NoDatasetFoundError("Could not found any noise datasets.")

The error says "Could not found any noise datasets.", but my dataset directory does contain TRAIN_SET_NOISE.hdf5.
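
I believe the loader classifies a file by what is stored inside the HDF5 file rather than by its file name, so I also inspected the files themselves with this sketch of my own (it only assumes the standard h5py API and my paths above):

# My own diagnostic sketch (not part of DeepFilterNet): print the top-level
# groups and attributes of each .hdf5 file to see how each dataset is
# labelled internally (e.g. whether it contains a "speech", "noise" or
# "rir" group, as I assume prepare_data.py writes).
from pathlib import Path
import h5py

data_dir = Path("train-data/testDataset")
for f in sorted(data_dir.glob("*.hdf5")):
    with h5py.File(f, "r") as h5:
        print(f"{f.name}: groups={list(h5.keys())} attrs={dict(h5.attrs)}")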

I'm guessing the problem comes from where or how my data is stored. Could you share the directory layout and format you use for the hdf5 data?

Any suggestions on how to resolve this? I would be grateful for your help.

github-actions[bot] commented 1 week ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.