Open cushycrux opened 3 months ago
Could be interesting. I've tried several fast voice cloning models (trained on a few minutes of audio), and none were very good.
Also, Piper is just a wrapper around VITS, and I'm not sure I like that level of abstraction. I was thinking of a more minimal wrapper around VITS, like the ones I have around whisper and llama.
@cushycrux have you tried that approach?
P.S. Piper training is Linux-only, but it works on Windows 10/11 under WSL (https://learn.microsoft.com/en-us/windows/wsl/install). In PowerShell:
wsl --install
Thoughts?