open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
https://openhlt.github.io/amphion/
MIT License
4.45k stars, 381 forks

[BUG]: Unable to run stage 1 with FastSpeech2 #111

Closed: caigun closed this issue 8 months ago

caigun commented 8 months ago

Describe the bug

I followed the tutorial for the FastSpeech2 example recipe and could not get past the first stage. The same problem also occurs on my Windows laptop.

How To Reproduce

Steps to reproduce the behavior:

  1. Config/File changes: Only the local path of the dataset
  2. Run command: sh egs/tts/FastSpeech2/run.sh --stage 1

Expected behavior

Data preparation should complete. Instead, it failed and was interrupted.

Screenshots

See error:

(Amphion) harrywang@Harrys-MacBook-Air Amphion % sh egs/tts/FastSpeech2/run.sh --stage 1
/Users/harrywang/Amphion/mfa
Exprimental Configuration File: /Users/harrywang/Amphion/egs/tts/FastSpeech2/exp_config.json
Preprocess LJSpeech...
Prepare alignment LJSpeech...
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "/Users/harrywang/Amphion/bins/tts/preprocess.py", line 244, in <module>
    main()
  File "/Users/harrywang/Amphion/bins/tts/preprocess.py", line 240, in main
    preprocess(cfg, args)
  File "/Users/harrywang/Amphion/bins/tts/preprocess.py", line 112, in preprocess
    prepare_align(
  File "/Users/harrywang/Amphion/preprocessors/processor.py", line 104, in prepare_align
    ljspeech.prepare_align(dataset, dataset_path, cfg, output_path)
  File "/Users/harrywang/Amphion/preprocessors/ljspeech.py", line 139, in prepare_align
    wav, _ = librosa.load(wav_path, sampling_rate)
TypeError: load() takes 1 positional argument but 2 were given

Environment Information

lmxue commented 8 months ago

Hi, what version of librosa are you using?

caigun commented 8 months ago

0.10.1

lmxue commented 8 months ago

Could you please log the values of wav_path and sampling_rate from your experiment?

caigun commented 8 months ago

Thanks~ Problem solved: in Amphion/preprocessors/ljspeech.py, line 139, in prepare_align, add "sr=" before "sampling_rate": wav, _ = librosa.load(wav_path, sr=sampling_rate)
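
For anyone hitting the same TypeError: it is consistent with librosa 0.10 accepting only the path positionally in librosa.load, so the sampling rate has to be passed as sr=. The sketch below reproduces the failure and the fix without needing librosa or an audio file. fake_load is a hypothetical stand-in that only mimics the keyword-only shape of the signature, and the file name is illustrative.

```python
# Hypothetical stand-in for librosa.load as of 0.10.x: only `path` may be
# passed positionally; everything after the bare `*` is keyword-only.
# It does not read audio and returns dummy samples.
def fake_load(path, *, sr=22050):
    samples = [0.0, 0.0, 0.0, 0.0]
    return samples, sr


def call_positional(path, sampling_rate):
    """Old call style from ljspeech.py line 139; returns the error text."""
    try:
        fake_load(path, sampling_rate)
    except TypeError as err:
        return str(err)
    return None


def call_keyword(path, sampling_rate):
    """Fixed call style: pass the sampling rate as sr=."""
    wav, _ = fake_load(path, sr=sampling_rate)
    return wav


msg = call_positional("LJ001-0001.wav", 22050)  # illustrative file name
wav = call_keyword("LJ001-0001.wav", 22050)
print(msg)
print(len(wav))
```

The keyword form also works on older librosa releases, where sr was an ordinary parameter, so the fix should not require pinning a version.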