NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0

TypeError: Error instantiating 'nemo.collections.tts.torch.data.TTSDataset' #3676

Status: Closed (godspirit00 closed this issue 2 years ago)

godspirit00 commented 2 years ago

Describe the bug

When I tried to train FastPitch from scratch using the fastpitch_align_v1.05.yaml config, it spent over an hour loading the dataset and then threw this error:

Traceback (most recent call last):
  File "examples/tts/fastpitch.py", line 28, in main
    model = FastPitchModel(cfg=cfg.model, trainer=trainer)
  File "/root/NeMo/nemo/collections/tts/models/fastpitch.py", line 81, in __init__
    super().__init__(cfg=cfg, trainer=trainer)
  File "/root/NeMo/nemo/core/classes/modelPT.py", line 138, in __init__
    self.setup_training_data(self._cfg.train_ds)
  File "/root/NeMo/nemo/collections/tts/models/fastpitch.py", line 453, in setup_training_data
    self._train_dl = self.__setup_dataloader_from_config(cfg)
  File "/root/NeMo/nemo/collections/tts/models/fastpitch.py", line 440, in __setup_dataloader_from_config
    dataset = instantiate(
  File "/root/miniconda3/lib/python3.8/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 180, in instantiate
    return instantiate_node(config, *args, recursive=_recursive_, convert=_convert_)
  File "/root/miniconda3/lib/python3.8/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 249, in instantiate_node
    return _call_target(_target_, *args, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 64, in _call_target
    raise type(e)(
  File "/root/miniconda3/lib/python3.8/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 62, in _call_target
    return _target_(*args, **kwargs)
  File "/root/NeMo/nemo/collections/tts/torch/data.py", line 228, in __init__
    librosa.filters.mel(
TypeError: Error instantiating 'nemo.collections.tts.torch.data.TTSDataset' : mel() takes 0 positional arguments but 2 positional arguments (and 3 keyword-only arguments) were given
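The final TypeError is the message Python produces when a function whose parameters are all keyword-only is called with positional arguments. One plausible reading, which the thread itself does not confirm, is that the installed librosa declares `mel` with keyword-only parameters while line 228 of `data.py` still passes the first two positionally. The sketch below reproduces the same message with a hypothetical stand-in function (not librosa itself); the numeric values are illustrative only.

```python
# Hypothetical stand-in mimicking a keyword-only signature such as
# mel(*, sr, n_fft, n_mels=..., fmin=..., fmax=...). This is NOT librosa.
def mel(*, sr, n_fft, n_mels=128, fmin=0.0, fmax=None):
    return {"sr": sr, "n_fft": n_fft, "n_mels": n_mels, "fmin": fmin, "fmax": fmax}


def call_positionally():
    """Call the stand-in with two positional arguments and return the error text."""
    try:
        mel(22050, 1024, n_mels=80, fmin=0, fmax=8000)
    except TypeError as err:
        return str(err)
    return None


def call_with_keywords():
    """Pass every argument by name, which a keyword-only signature accepts."""
    return mel(sr=22050, n_fft=1024, n_mels=80, fmin=0, fmax=8000)


positional_error = call_positionally()
keyword_result = call_with_keywords()
print(positional_error)
print(keyword_result["n_mels"])
```

If that reading is right, passing the arguments by name at the call site, or installing a librosa release whose `mel` still accepts positional arguments, should avoid the error.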

Steps/Code to reproduce bug

python examples/tts/fastpitch.py --config-name=fastpitch_align_v1.05.yaml train_dataset=/root/autodl-tmp/nancy_train.json validation_datasets=/root/autodl-tmp/nancy_val.json sup_data_path=/root/autodl-tmp/nemo-training/nancy_22k/sup_data exp_manager.exp_dir=/root/autodl-tmp/nemo-training/nancy_22k exp_manager.resume_if_exists=True exp_manager.resume_ignore_no_checkpoint=True model.train_ds.dataloader_params.batch_size=24 pitch_mean=199.37802124023438 pitch_std=55.59949493408203

Oktai15 commented 2 years ago

after it spending over 1 hour trying to Loading dataset

@godspirit00, you can pre-normalize your text and add it to your manifests as a "normalized_text" field on every JSON line. Loading will then be much faster (see how we did it for LJSpeech: https://github.com/NVIDIA/NeMo/blob/main/scripts/dataset_processing/tts/ljspeech/get_data.py#L101)
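A rough sketch of that pre-normalization step, assuming the manifest is a JSON-lines file whose entries keep the transcript under a "text" key (an assumption; check your own manifest). The `simple_normalize` function is a hypothetical placeholder that only collapses whitespace; in practice you would substitute a real text normalizer. The function and parameter names here are illustrative and not taken from NeMo.

```python
import json


def simple_normalize(text):
    """Placeholder normalizer (assumption): collapses runs of whitespace only."""
    return " ".join(text.split())


def add_normalized_text(in_path, out_path, normalize=simple_normalize):
    """Copy a JSON-lines manifest, adding "normalized_text" to every entry."""
    count = 0
    with open(in_path, encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            entry = json.loads(line)
            # Assumes each entry stores its transcript under "text".
            entry["normalized_text"] = normalize(entry["text"])
            dst.write(json.dumps(entry, ensure_ascii=False) + "\n")
            count += 1
    return count
```

Running this once over the training and validation manifests and pointing the training command at the new files would move the normalization cost out of dataset loading.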

godspirit00 commented 2 years ago

@Oktai15 Thank you so much! I will give it a try right away.