Camb-ai / MARS5-TTS

MARS5 speech model (TTS) from CAMB.AI
https://www.camb.ai
GNU Affero General Public License v3.0
1.37k stars, 95 forks

release weights as safetensors #18

Closed by keturn 1 week ago

keturn commented 2 weeks ago

safetensors is a file format for tensor storage that is faster to load than PyTorch pickles and avoids the security risks of loading pickles.

It's true that if the code and the weights come from the same repo, one is just as safe as the other. But if there's any chance that you're going to end up with an ecosystem where various different models or fine-tunes are floating around, I think it's best to set a precedent for using safetensors from the start.
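To make the pickle risk concrete, here is a minimal sketch using only the standard library. The `Payload` class is hypothetical, and `print` stands in for whatever callable a malicious checkpoint might choose; the point is only that unpickling can invoke an arbitrary callable, which a pure tensor format like safetensors has no mechanism for.

```python
import pickle


class Payload:
    """Hypothetical object whose pickle asks the loader to call a function."""

    def __reduce__(self):
        # On load, pickle calls the returned callable with these arguments.
        # A hostile file could name any importable callable here.
        return (print, ("code ran during unpickling",))


blob = pickle.dumps(Payload())

# Loading does not rebuild a Payload; it calls print(...) and returns its result.
result = pickle.loads(blob)
```

Here `result` is whatever the callable returned (`None` for `print`), not a `Payload` instance.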

keturn commented 2 weeks ago

Implementation Notes

I'm not so familiar with torch.hub, but I guess it just runs whatever is in https://github.com/Camb-ai/MARS5-TTS/blob/master/hubconf.py?

I don't think https://github.com/huggingface/safetensors comes with a load_from_url function, but it should be straightforward to use torch.hub.download_url_to_file followed by safetensors.torch.load_file or safetensors.torch.load_model.

RF5 commented 1 week ago

Hi @keturn, we're working on it and should have a safetensors release out within the next few days. Thanks for the suggestion!

RF5 commented 1 week ago

Hi @keturn, we've implemented the changes. In the hub.load() call, you can now specify which kind of checkpoint you prefer to load (safetensors or .pt PyTorch format). By default, torch.hub.load() now loads the safetensors version. Both versions of the checkpoints are available under the releases tab.

Whether you prefer .safetensors or .pt checkpoints, being able to pick the format should hopefully make this work for everyone. We've also updated the README to reflect this.