Closed shigabeev closed 2 years ago
Can you check this @Edresson ?
Hi @shigabeev,
Looks like you are trying to do inference providing d-vectors (speaker embeddings) on a model that wasn't trained with external d-vectors.
Your training config needs to have use_d_vector_file = True, and you should provide the "d_vector_file" as well.
Here is a short tutorial on how to generate the "d_vector_file": https://github.com/Edresson/YourTTS#reproducibility
I'm closing this until there is further follow-up.
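As a rough illustration of the requirement above, here is a minimal sketch of a pre-flight check on the model arguments. Only the field names use_d_vector_file and d_vector_file come from this thread; the helper name, the flat dict layout, and the example values are assumptions for illustration, not part of the library.

```python
def check_d_vector_config(model_args: dict) -> list:
    """Return a list of problems that would block inference with external d-vectors.

    `model_args` is assumed to be a plain dict holding the model arguments;
    where these fields live in a real config file may differ.
    """
    problems = []
    if not model_args.get("use_d_vector_file", False):
        problems.append("use_d_vector_file must be True")
    if not model_args.get("d_vector_file"):
        problems.append("d_vector_file must point to the embeddings file")
    return problems


# Hypothetical values, for illustration only.
trained_without_d_vectors = {"use_d_vector_file": False, "d_vector_file": None}
trained_with_d_vectors = {"use_d_vector_file": True, "d_vector_file": "speakers.json"}

print(check_d_vector_config(trained_without_d_vectors))
print(check_d_vector_config(trained_with_d_vectors))
```

An empty list means both settings are present. It does not prove the checkpoint itself was trained with external d-vectors.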
Describe the bug
Hey! I'm trying to run models with speaker consistency loss and the inference doesn't run:
It returns the same error as in #1457
AttributeError: 'StochasticDurationPredictor' object has no attribute 'cond'
Fun fact: defining d_vector_dim doesn't help anymore.
When I change the config to:
it becomes possible to partially load the model weights with strict=False:
model.load_state_dict(model_weights, strict=False)
But the inference quality is understandably terrible, since some layers didn't load and contain pure noise.
Another option is to change the config with:
This way, the weights load without raising an error, but since there's no trace of the speaker encoder left in the model, it raises the same error as in #1457
AttributeError: 'StochasticDurationPredictor' object has no attribute 'cond'
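To see which layers a strict=False load would skip, it may help to compare the parameter names of the model with those in the checkpoint. The sketch below does this on plain lists of keys. The helper name and the example key names are hypothetical; with a real model the inputs would be model.state_dict().keys() and the keys of the loaded weights.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Compare parameter names of a model and a checkpoint.

    Returns (missing, unexpected):
      missing    - names the model expects but the checkpoint lacks
                   (these layers keep their random initialization)
      unexpected - names in the checkpoint that the model does not have
    """
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Hypothetical key names, for illustration only.
model_keys = ["duration_predictor.cond.weight", "text_encoder.emb.weight"]
checkpoint_keys = ["text_encoder.emb.weight", "speaker_encoder.fc.weight"]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

PyTorch reports the same information directly: load_state_dict(..., strict=False) returns an object with missing_keys and unexpected_keys attributes, so printing that return value shows which layers were left uninitialized.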
Related Issues:
#2059 - if use_speaker_encoder_as_loss is set to True, the model is able either to train or to run inference, but I have never seen a config that allows both, not even the original YourTTS one.
To Reproduce
Steps to reproduce:
Expected behavior
Inference runs without any modifications to the config
Logs
No response
Environment
Additional context
My model config: