InnuendoB closed this issue 3 years ago
Thank you for your question.
The variable x_length is the length of the whole input speech signal, not the length of a single frame. Dio() and Harvest() both begin by filtering the signal, and that filtering is a convolution. The convolution can be computed quickly with an FFT, but the FFT then has to cover the entire signal, which is why x_length is needed.
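To illustrate why the FFT has to be at least as long as the whole signal, here is a minimal sketch of FFT-based convolution in plain Python. It is not the WORLD code: the helper names (next_power_of_two, fft, fft_convolve, direct_convolve) are made up for this example, and the way fft_size is chosen here (the smallest power of two that fits the full convolution result) is an assumption, so it may differ from what GetSuitableFFTSize returns.

```python
import cmath


def next_power_of_two(n):
    """Smallest power of two that is >= n (hypothetical helper)."""
    size = 1
    while size < n:
        size *= 2
    return size


def fft(values, inverse=False):
    """Recursive radix-2 FFT. len(values) must be a power of two.

    The inverse transform is left unnormalized; the caller divides by N.
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft_convolve(x, h):
    """Linear convolution of the whole signal x with the filter h via FFT."""
    x_length = len(x)
    out_length = x_length + len(h) - 1
    # Assumption: the FFT must hold the full result, so it is longer than x.
    fft_size = next_power_of_two(out_length)
    # Zero-pad both sequences to fft_size to avoid circular wrap-around.
    x_padded = list(x) + [0.0] * (fft_size - x_length)
    h_padded = list(h) + [0.0] * (fft_size - len(h))
    spectrum = [a * b for a, b in zip(fft(x_padded), fft(h_padded))]
    y = fft(spectrum, inverse=True)
    return [v.real / fft_size for v in y[:out_length]]


def direct_convolve(x, h):
    """Reference time-domain convolution, used only to check the result."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y
```

For a five-sample signal and a three-tap filter, the result has seven samples, so this sketch uses an FFT of size 8, which is already larger than the signal itself. The same reasoning applied to a full utterance gives an fft_size larger than x_length.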
Thank you for your reply!
Hi, I am confused about the meaning of x_length. The fft_size used by Dio and Harvest is computed by GetSuitableFFTSize and is larger than x_length, so I think x_length should be the length of each frame? Otherwise fft_size would be extremely large.