Open majo711 opened 1 year ago
Yes, it is possible!
However, it requires a lot of manual effort to obtain the duration of each phoneme, check its pitch, and shift it.
We are currently working on automating this process using the Yingram reconstructed by the Yingram decoder.
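For anyone who wants to try the manual route in the meantime, here is a rough sketch of the "get each phoneme's duration, check its pitch, shift it" step. It is not the method used in this repo: it works on a plain F0 contour (one value in Hz per frame, 0 for unvoiced frames) rather than on a Yingram, and the function name `shift_f0_by_phoneme`, the `(start_s, end_s, semitones)` segment format, and the 10 ms frame period are all assumptions made for illustration. Phoneme boundaries would have to come from a forced aligner or manual labelling, and feeding the shifted contour back into a synthesizer is not covered.

```python
from typing import List, Sequence, Tuple

# Hypothetical segment format: (start time in s, end time in s, shift in semitones)
Segment = Tuple[float, float, float]


def shift_f0_by_phoneme(
    f0_hz: Sequence[float],
    segments: Sequence[Segment],
    frame_period_s: float = 0.01,  # assumed frame hop of 10 ms
) -> List[float]:
    """Return a copy of f0_hz with each phoneme segment shifted in pitch.

    Unvoiced frames (F0 <= 0) are left untouched. If segments overlap,
    the later segment in the list decides the value of the shared frames.
    """
    shifted = list(f0_hz)
    for start_s, end_s, semitones in segments:
        # Convert phoneme boundaries from seconds to frame indices.
        first = max(0, int(round(start_s / frame_period_s)))
        last = min(len(shifted), int(round(end_s / frame_period_s)))
        # A shift of n semitones multiplies the frequency by 2^(n/12).
        ratio = 2.0 ** (semitones / 12.0)
        for i in range(first, last):
            if f0_hz[i] > 0:
                shifted[i] = f0_hz[i] * ratio
    return shifted


if __name__ == "__main__":
    contour = [100.0, 100.0, 0.0, 200.0, 200.0]
    # Raise the first phoneme by an octave, lower the last one by an octave.
    edits = [(0.00, 0.02, 12.0), (0.03, 0.05, -12.0)]
    print(shift_f0_by_phoneme(contour, edits))
```

The tedious part is producing the `segments` list by hand for every utterance, which is presumably what the automation mentioned above is meant to remove.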
Thanks for sharing! If it is possible, how can I control it?