Closed by rppravin 3 years ago
Thanks for the code!
In the default settings of the code, training uses ~2 sec segments (with 1:16 downsampling at the bottleneck layer).
Is it possible to modify the code to get voice conversion working for ~120 ms segments? Would zero padding work?
Thanks in advance, Pravin
In this case, you don't need to modify the code. Just zero-pad each segment up to the next multiple of 16 frames, so the length stays compatible with the 1:16 downsampling.
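For anyone landing here, a minimal sketch of that padding step. It assumes the segment is a NumPy array of shape (num_frames, num_features); the helper name `pad_to_multiple` and the 10 x 80 example shape are made up for illustration and are not taken from the repo.

```python
import numpy as np


def pad_to_multiple(frames, base=16):
    """Zero-pad a (num_frames, num_features) array along the time axis
    so that num_frames becomes a multiple of `base`.

    Hypothetical helper for illustration; not part of the original code.
    Returns the padded array and the number of frames that were added.
    """
    num_frames = frames.shape[0]
    # Frames needed to reach the next multiple of `base` (0 if already aligned).
    pad_len = (-num_frames) % base
    # Pad only at the end of the time axis; leave the feature axis untouched.
    padded = np.pad(frames, ((0, pad_len), (0, 0)), mode="constant")
    return padded, pad_len


# Placeholder short segment: 10 frames, 80 features per frame (assumed values).
segment = np.ones((10, 80), dtype=np.float32)
padded, pad_len = pad_to_multiple(segment)
```

After conversion you would probably want to drop the last `pad_len` frames of the output (only when `pad_len > 0`) so it lines up with the original segment length.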