yl4579 / StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
MIT License
4.78k stars, 391 forks

feat: Improve model checkpoint loading #253

Open 5Hyeons opened 3 months ago

5Hyeons commented 3 months ago

Summary

This PR fixes a checkpoint loading issue in the second stage of training when using a single GPU. The second stage adds a 'module.' prefix to every parameter name, so the keys it expects no longer match the parameter names saved in the first-stage checkpoint.
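To illustrate the kind of key mismatch described above, here is a minimal sketch of aligning checkpoint keys with the names a model expects. The helper `align_state_dict_keys` and the example parameter names are hypothetical and are not taken from this PR's diff; the sketch uses plain dictionaries so it runs without PyTorch.

```python
from collections import OrderedDict

PREFIX = "module."


def align_state_dict_keys(state_dict, target_keys):
    """Return a copy of state_dict with 'module.' added or stripped per key
    so that keys match target_keys wherever a match is possible.

    Hypothetical helper for illustration only, not the PR's actual code.
    """
    target_keys = set(target_keys)
    aligned = OrderedDict()
    for key, value in state_dict.items():
        if key in target_keys:
            # Key already matches the expected name.
            new_key = key
        elif key.startswith(PREFIX) and key[len(PREFIX):] in target_keys:
            # Checkpoint has the prefix, the model does not: strip it.
            new_key = key[len(PREFIX):]
        elif PREFIX + key in target_keys:
            # Model expects the prefix, the checkpoint lacks it: add it.
            new_key = PREFIX + key
        else:
            # No match either way: keep the key so the loader can report it.
            new_key = key
        aligned[new_key] = value
    return aligned


# Assumed example: first-stage keys without the prefix, second-stage names with it.
first_stage = OrderedDict([("encoder.weight", 1.0), ("encoder.bias", 2.0)])
second_stage_keys = ["module.encoder.weight", "module.encoder.bias"]

aligned = align_state_dict_keys(first_stage, second_stage_keys)
stripped = align_state_dict_keys(aligned, first_stage.keys())
```

With PyTorch, the same idea would presumably apply by passing `model.state_dict().keys()` as `target_keys` and handing the result to `model.load_state_dict(...)`. Matching in both directions keeps the helper usable whether the prefix is on the checkpoint side or the model side.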

Changes

Notes

Related Issue