GoArsenal closed this 1 year ago
With batch size 12 and about 42 minutes of utterance samples (silence not included), I got clear results at epoch 4, 330K steps.
@SODAsoo07, hi! Have you trained on the VCTK dataset? Can you try generating a wav for "capital" (any voice)? Was the result good? Does it sound out the whole word?
Number of utterances = 10000, batch size = 16
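For reference, a rough sketch of how many optimizer steps one epoch takes with those numbers. It assumes each utterance is one training sample and that a final partial batch still counts as a step; `steps_per_epoch` is a hypothetical helper name, not something from the training code.

```python
import math


def steps_per_epoch(num_utterances: int, batch_size: int) -> int:
    """Return the number of batches needed to see every utterance once.

    Assumption: one utterance is one sample, and a trailing partial
    batch is kept rather than dropped.
    """
    if num_utterances < 0 or batch_size <= 0:
        raise ValueError("num_utterances must be >= 0 and batch_size must be > 0")
    return math.ceil(num_utterances / batch_size)


# Numbers quoted in the comment above: 10000 utterances, batch size 16.
print(steps_per_epoch(10000, 16))
```

If the trainer drops the last incomplete batch instead, `math.floor` would be the matching choice; for 10000 and 16 the division is exact, so both give the same value.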