Thanks for bringing up better diffusion modeling techniques. Recently, Xue et al. proposed Multi-GradSpeech, which uses a Consistent Diffusion Model as the generative network and is reported to outperform Grad-TTS in both single- and multi-speaker scenarios. I believe this advantage can carry over to the SVC task. If you'd like to try replacing Grad-TTS with Multi-GradSpeech, I'd be happy to share the code.