coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
http://coqui.ai
Mozilla Public License 2.0

[Bug] Synthesis from monolingual local models fails #3449

Closed eginhard closed 7 months ago

eginhard commented 8 months ago

Describe the bug

Synthesis from monolingual local models fails because of missing config attribute.

To Reproduce

import os

from TTS.api import TTS
from TTS.utils.generic_utils import get_user_data_dir

cloud = TTS(model_name="tts_models/de/thorsten/vits")  # just to download the model
cloud.tts("test")  # this works fine

# Load the same model from the files downloaded above
model_dir = os.path.join(get_user_data_dir("tts"), "tts_models--de--thorsten--vits")
model = os.path.join(model_dir, "model_file.pth")
config = os.path.join(model_dir, "config.json")
local = TTS(model_path=model, config_path=config)
_ = local.tts("test")  # raises AttributeError

The last line results in the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../TTS/api.py", line 357, in tts
    self._check_arguments(
  File ".../TTS/api.py", line 253, in _check_arguments
    if self.is_multi_lingual and language is None:
  File ".../torch/nn/modules/module.py", line 1695, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'TTS' object has no attribute 'is_multi_lingual'

Expected behavior

No response

Logs

No response

Environment

{
    "CUDA": {
        "GPU": [],
        "available": false,
        "version": "12.1"
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "2.1.1+cu121",
        "TTS": "0.21.3",
        "numpy": "1.22.0"
    },
    "System": {
        "OS": "Linux",
        "architecture": [
            "64bit",
            "ELF"
        ],
        "processor": "x86_64",
        "python": "3.10.11",
        "version": "#38~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov  2 18:01:13 UTC 2"
    }
}

Additional context

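The error message is somewhat confusing because is_multi_lingual is defined as a property. The actual failure happens inside that property, where one of the variables it reads is None in certain cases; the resulting AttributeError is then reported against the property name itself. More details: https://github.com/pytorch/pytorch/issues/13981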
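The masking effect can be reproduced without TTS or PyTorch. The following is a minimal sketch, not the library's code: `Base` is a hypothetical stand-in for a class that defines a `__getattr__` fallback (as torch.nn.Module does), and `Api` and its `manager` attribute are made-up names used only for illustration.

```python
class Base:
    """Stand-in for torch.nn.Module: defines a __getattr__ fallback."""

    def __getattr__(self, name):
        # Python calls this only after normal lookup raised AttributeError,
        # including an AttributeError raised *inside* a property getter.
        raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")


class Api(Base):
    """Hypothetical stand-in for the TTS wrapper class."""

    def __init__(self):
        self.manager = None  # assumption: a dependency that was never set up

    @property
    def is_multi_lingual(self):
        # The real failure: None has no attribute 'num_languages'.
        return self.manager.num_languages > 1


def describe_failure():
    """Return the message of the AttributeError seen by the caller."""
    try:
        Api().is_multi_lingual
    except AttributeError as err:
        return str(err)
    return None


message = describe_failure()
print(message)
```

The printed message names `is_multi_lingual` rather than `num_languages`, so the original cause is hidden from the caller. This matches the shape of the traceback above, where the last frame is in `__getattr__` of torch/nn/modules/module.py.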

stale[bot] commented 7 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

chibiskuld commented 6 months ago

This is still happening in v0.22.0.

self.config.languages appears to be resolved in a circular fashion, so the len(self.config.languages) > 1 check on line 102 fails.

The method should be:

    @property
    def is_multi_lingual(self):
        # Not sure what sets this to None, but applied a fix to prevent crashing.
        if hasattr(self.synthesizer.tts_model, "language_manager") and self.synthesizer.tts_model.language_manager:
            return self.synthesizer.tts_model.language_manager.num_languages > 1
        if (
            isinstance(self.model_name, str)
            and "xtts" in self.model_name
            or self.config
            and "xtts" in self.config.model
        ):
            return True
        return False

eginhard commented 4 months ago

It's fixed in our fork: https://github.com/idiap/coqui-ai-TTS