idiap / coqui-ai-TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
https://coqui-tts.readthedocs.io
Mozilla Public License 2.0

[Bug] Potential vulnerability in TTS model loading due to torch.load default behavior #71

Closed ColeDrain closed 3 days ago

ColeDrain commented 1 month ago

Describe the bug

When using the TTS library, the following warning is displayed:

python3.11/site-packages/TTS/utils/io.py:54: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via torch.serialization.add_safe_globals. We recommend you start setting weights_only=True for any use case where you don't have full control of the loaded file.

This warning indicates a potential security vulnerability when loading models, as arbitrary code execution could occur during unpickling. I looked at the code, and it seems weights_only is left at its default value of False. Are there any implications of setting it to True?
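To make the risk concrete, here is a minimal standard-library sketch of the mechanism the warning describes. It does not use PyTorch or TTS code; `Payload`, `AllowlistUnpickler`, and `restricted_loads` are hypothetical names for illustration. It only assumes that an allowlist-based unpickler is roughly the idea behind weights_only=True, not that PyTorch implements it this way.

```python
import io
import pickle
from collections import OrderedDict


class Payload:
    """Stand-in for a malicious object (hypothetical, for illustration only)."""

    def __reduce__(self):
        # On unpickling, pickle calls the returned callable with these arguments.
        # A harmless print is used here; an attacker could name any importable
        # function instead.
        return (print, ("code ran during unpickling",))


class AllowlistUnpickler(pickle.Unpickler):
    """Refuses every global that is not explicitly allowlisted."""

    ALLOWED = {("collections", "OrderedDict")}

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data: bytes):
    """Unpickle data while permitting only the allowlisted globals."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


if __name__ == "__main__":
    # Plain containers pass through the allowlist.
    print(restricted_loads(pickle.dumps(OrderedDict(a=1))))
    # The payload is rejected before its callable can run.
    try:
        restricted_loads(pickle.dumps(Payload()))
    except pickle.UnpicklingError as err:
        print("rejected:", err)
```

With plain `pickle.loads`, the same payload bytes would invoke the callable immediately, which is the behavior the FutureWarning is warning about.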

To Reproduce

  1. Import and use the TTS library to load a model.
  2. Observe the warning message in the console output.

Expected behavior

The TTS library should use torch.load with weights_only=True by default to prevent potential security risks.
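A rough sketch of what this could look like, assuming a small wrapper around torch.load. The name `load_checkpoint` is hypothetical and is not the actual function in TTS/utils/io.py; only `torch.load` and its `map_location` and `weights_only` parameters are real PyTorch API.

```python
import torch


def load_checkpoint(path, map_location="cpu"):
    """Hypothetical helper: load a checkpoint with restricted unpickling."""
    # weights_only=True limits unpickling to tensors, primitive types and
    # standard containers, plus anything explicitly allowlisted through
    # torch.serialization.add_safe_globals.
    return torch.load(path, map_location=map_location, weights_only=True)


if __name__ == "__main__":
    import os
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        ckpt_path = os.path.join(tmp, "ckpt.pth")
        # A state-dict-style checkpoint containing only tensors loads fine.
        torch.save({"weight": torch.ones(2)}, ckpt_path)
        state = load_checkpoint(ckpt_path)
        print(state["weight"])
```

Checkpoints that store arbitrary Python objects (for example, config classes) may fail to load this way until those classes are allowlisted, so existing model files would need to be checked before changing the default.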

Logs

python3.11/site-packages/TTS/utils/io.py:54: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via torch.serialization.add_safe_globals. We recommend you start setting weights_only=True for any use case where you don't have full control of the loaded file.

Environment

{
    "CUDA": {
        "GPU": [],
        "available": false,
        "version": "12.1"
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "2.4.0+cu121",
        "TTS": "0.24.1",
        "numpy": "1.26.4"
    },
    "System": {
        "OS": "Linux",
        "architecture": [
            "64bit",
            "ELF"
        ],
        "processor": "x86_64",
        "python": "3.11.9",
        "version": "#1 SMP Fri Mar 29 23:14:13 UTC 2024"
    }
}

Additional context

This issue is related to a known security concern in PyTorch. More details can be found in the PyTorch security documentation: https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models

eginhard commented 3 weeks ago

Thank you for opening this! I also noticed this and will look at it eventually, but anybody is very welcome to submit a PR for it in the meantime. It should be a fairly simple change; just note that 2 occurrences of torch.load in the Trainer will need to be updated as well.

In the meantime it shouldn't be a huge issue: the list of included default models is fixed, and custom models that users load will mostly be their own fine-tuned ones. For anything else, the link shared above is useful for understanding the risks of untrusted models: https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models