Unable to use self trained vocoder for synthesize

marctessier commented 10 months ago

Synthesis fails with an error when using a self-trained vocoder on the latest "main" branch.

Note: I am able to synthesize when using our 2.5M checkpoint; see below for that example.

everyvoice synthesize text-to-wav -t "This is a test" --vocoder-path ./logs_and_checkpoints/VocoderExperiment/base/checkpoints/last.ckpt logs_and_checkpoints/FeaturePredictionExperiment/base/checkpoints/last.ckpt

/home/tes001/u/TxT2SPEECH/miniconda3_u20/envs/EveryVoice/lib/python3.9/site-packages/torch/cuda/__init__.py:611: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
2023-12-05 10:08:27.156 | INFO | everyvoice.model.feature_prediction.FastSpeech2_lightning.fs2.cli:synthesize:400 - Loading checkpoint from logs_and_checkpoints/FeaturePredictionExperiment/base/checkpoints/last.ckpt
2023-12-05 10:08:28.869 | INFO | everyvoice.model.feature_prediction.FastSpeech2_lightning.fs2.cli:synthesize:418 - Processing text 'This is a test'
2023-12-05 10:08:28.870 | INFO | everyvoice.model.feature_prediction.FastSpeech2_lightning.fs2.cli:synthesize:423 - Creating batch
2023-12-05 10:08:28.870 | INFO | everyvoice.model.feature_prediction.FastSpeech2_lightning.fs2.cli:synthesize:438 - Predicting spectral features
2023-12-05 10:08:29.496 | INFO | everyvoice.model.feature_prediction.FastSpeech2_lightning.fs2.cli:synthesize:473 - Loading Vocoder from logs_and_checkpoints/VocoderExperiment/base/checkpoints/last.ckpt
2023-12-05 10:08:30.318 | INFO | everyvoice.model.feature_prediction.FastSpeech2_lightning.fs2.cli:synthesize:477 - Generating waveform...
/home/tes001/u/TxT2SPEECH/miniconda3_u20/envs/EveryVoice/lib/python3.9/site-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")

Traceback (most recent call last):

/gpfs/fs5/nrc/nrc-fs1/ict/others/u/tes001/TxT2SPEECH/EveryVoice/everyvoice/model/feature_prediction/FastSpeech2_lightning/fs2/cli.py:478 in synthesize

    475                 )
    476                 ckpt = torch.load(model.config.training.vocoder_path, map_location=devic
    477                 logger.info("Generating waveform...")
  ❱ 478                 wav, sr = synthesize_data(spec, ckpt)
    479                 logger.info(f"Writing file {data_path}")
    480                 write(f"{data_path}.wav", sr, wav)
    481         if "npy" in output_type:

/gpfs/fs5/nrc/nrc-fs1/ict/others/u/tes001/TxT2SPEECH/EveryVoice/everyvoice/model/vocoder/HiFiGAN_iSTFT_lightning/hfgl/utils.py:26 in synthesize_data

    23     model.load_state_dict(generator_ckpt["state_dict"])
    24     model.generator.eval()
    25     model.generator.remove_weight_norm()
  ❱ 26     if config.model.istft_layer:
    27         inverse_spectral_transform = get_spectral_transform(
    28             "istft",
    29             model.generator.post_n_fft,

AttributeError: 'dict' object has no attribute 'model'

For comparison, the same command works when using the 2.5M vocoder:

everyvoice synthesize text-to-wav -t "This is a test" --vocoder-path ../../MODELS/2.5M.ckpt logs_and_checkpoints/FeaturePredictionExperiment/base/checkpoints/last.ckpt

/home/tes001/u/TxT2SPEECH/miniconda3_u20/envs/EveryVoice/lib/python3.9/site-packages/torch/cuda/__init__.py:611: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
2023-12-05 10:10:42.191 | INFO | everyvoice.model.feature_prediction.FastSpeech2_lightning.fs2.cli:synthesize:400 - Loading checkpoint from logs_and_checkpoints/FeaturePredictionExperiment/base/checkpoints/last.ckpt
2023-12-05 10:10:43.937 | INFO | everyvoice.model.feature_prediction.FastSpeech2_lightning.fs2.cli:synthesize:418 - Processing text 'This is a test'
2023-12-05 10:10:43.938 | INFO | everyvoice.model.feature_prediction.FastSpeech2_lightning.fs2.cli:synthesize:423 - Creating batch
2023-12-05 10:10:43.938 | INFO | everyvoice.model.feature_prediction.FastSpeech2_lightning.fs2.cli:synthesize:438 - Predicting spectral features
2023-12-05 10:10:44.596 | INFO | everyvoice.model.feature_prediction.FastSpeech2_lightning.fs2.cli:synthesize:473 - Loading Vocoder from ../../MODELS/2.5M.ckpt
2023-12-05 10:10:45.306 | INFO | everyvoice.model.feature_prediction.FastSpeech2_lightning.fs2.cli:synthesize:477 - Generating waveform...
/home/tes001/u/TxT2SPEECH/miniconda3_u20/envs/EveryVoice/lib/python3.9/site-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
2023-12-05 10:10:50.496 | INFO | everyvoice.model.feature_prediction.FastSpeech2_lightning.fs2.cli:synthesize:479 - Writing file synthesis_output/Thisisates
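
The AttributeError in the failing run says that `config` is a plain dict at the point where `config.model.istft_layer` is evaluated, so attribute-style access fails. The sketch below reproduces that failure mode in isolation and shows one possible way to normalize a nested dict so attribute access works. It is only an illustration under assumptions: `dict_config` is a hypothetical stand-in for whatever config the self-trained checkpoint carries, `to_namespace` is a helper invented here, and none of this is EveryVoice's actual code or the fix that was eventually applied.

```python
from types import SimpleNamespace


def to_namespace(obj):
    """Recursively convert nested dicts into attribute-style namespaces.

    Hypothetical helper for illustration; not part of EveryVoice.
    """
    if isinstance(obj, dict):
        return SimpleNamespace(**{k: to_namespace(v) for k, v in obj.items()})
    if isinstance(obj, list):
        return [to_namespace(v) for v in obj]
    return obj


# Assumed shape: a config stored as a plain dict instead of a config object.
dict_config = {"model": {"istft_layer": True}}

# Same access pattern as the failing line in the traceback.
try:
    dict_config.model
except AttributeError as err:
    error_message = str(err)

# After normalization, attribute access succeeds.
config = to_namespace(dict_config)
istft_enabled = config.model.istft_layer
```

In a real project, validating the dict against the project's own config class would likely be preferable to a generic namespace, since it also checks field names and types.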

roedoejet commented 9 months ago

@SamuelLarkin - can you look into this while you're working on the synthesize command please? Thanks!

roedoejet commented 8 months ago

This has been fixed for some time now. Sorry I can't be more specific about when.

EveryVoiceTTS / EveryVoice

Unable to use self trained vocoder for synthesize #190