Open CasonTsai opened 1 month ago
You may use the attn output from the forward method as the phoneme-audio alignment result.
Thanks for replying. I will try it out during inference.
Hello, I printed the attn output while running inference with the model, but I don't know how the attn output maps to the duration of each phoneme in the text. Thank you for your reply.
Amazing work! Excuse me, how can I extract the phoneme durations of a text from StochasticDurationPredictor or DurationPredictor? I want to get the duration of the phoneme corresponding to each piece of the text.
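For anyone with the same question, here is a rough sketch of how attn could be turned into per-phoneme durations. It assumes attn is a hard (0/1) monotonic alignment that has already been reduced to a 2D [frames][phonemes] layout; the real tensor probably has extra batch dimensions and may use a different axis order, so check its shape first. The function name phoneme_durations and the hop_length and sampling_rate defaults are placeholders of mine, not taken from this repo; use the values from your own config.

```python
def phoneme_durations(attn, hop_length=256, sampling_rate=22050):
    """Sum a hard alignment over the frame axis to get per-phoneme durations.

    attn: 2D list laid out as [n_frames][n_phonemes] with 0/1 entries
          (assumed layout; verify against the real attn output).
    hop_length, sampling_rate: placeholder values, read yours from the config.
    Returns (frames_per_phoneme, seconds_per_phoneme).
    """
    n_phonemes = len(attn[0]) if attn else 0
    frames = [0] * n_phonemes
    for row in attn:                      # one row per output frame
        for j, value in enumerate(row):   # one column per phoneme
            frames[j] += int(round(value))
    seconds_per_frame = hop_length / sampling_rate
    seconds = [f * seconds_per_frame for f in frames]
    return frames, seconds


# Toy alignment: 5 frames, 3 phonemes.
attn = [
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
]
frames, seconds = phoneme_durations(attn)
print(frames)   # number of frames assigned to each phoneme
print(seconds)  # the same durations converted to seconds
```

If attn is a tensor, the same idea is a sum over the frame axis followed by multiplying by hop_length / sampling_rate.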