Closed Sleepwalking closed 6 years ago
Interesting, did you notice a perceivable difference in the vocoded speech too?
@m-toman Absolute discrimination? No. Differential discrimination? Barely able to tell. This is really just a 0.67 dB gain on the unvoiced part. For some reason WORLD vocoder makes speech sound more breathy to me and I started looking for why. In the process of doing so this bug was found but it seems only tangentially relevant to the breathy voice problem. My current bet is on the rectangular-windowed STFT for noise synthesis. Nevertheless bug fixes are always good to have, for sanity's sake.
Thank you for your comment and discussion. I'm on a business trip, so I will check it in a few days.
perceivable difference
I also perceived that difference, but I cannot judge whether it is caused by the bug. In particular, the difference was perceived in plosives. I had thought that the difference was caused by the random seed.
I checked your request and accepted it. Thank you for your cooperation.
Since I am going to close this thread, please make a new thread if needed. I think that noise perception is an interesting but difficult topic.
https://github.com/mmorise/World/blob/master/src/matlabfunctions.cpp#L242

The `randn` function implements an iterated xorshift random number generator that generates an approximately normally distributed random variable (RV) by summing 12 uniform RVs in [0, 2^32 / 16]. Due to an incomplete initial iteration, it was empirically found that the values for w at the 2nd and 4th iterations will be nearly identical, breaking the i.i.d. assumption. The output RV will have a variance of (10 + 4) / 12 = 1.1667 instead of 1: the ten independent terms contribute 1/12 each, while the two nearly identical terms act as a single term of doubled amplitude and contribute 4/12. This can be verified by feeding the vocoder Gaussian white noise and measuring the variance of the output.
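The variance figure can be checked with a small idealized simulation. The sketch below is not WORLD code: it uses `std::mt19937` in place of xorshift, and the names `SumOfTwelveVariance` and `VarianceToDecibel` are made up for illustration. It assumes the correlated pair is exactly identical, which is a simplification of "nearly identical".

```cpp
#include <cmath>
#include <cstddef>
#include <random>

// Sample variance of (sum of 12 uniform [0, 1) terms) - 6.
// When duplicate_term is true, the 4th term reuses the 2nd term's value,
// an idealized stand-in for the correlation described above.
double SumOfTwelveVariance(bool duplicate_term, std::size_t num_samples) {
  std::mt19937 gen(12345u);  // fixed seed so the run is repeatable
  std::uniform_real_distribution<double> uniform(0.0, 1.0);
  double sum = 0.0;
  double sum_sq = 0.0;
  for (std::size_t n = 0; n < num_samples; ++n) {
    double u[12];
    for (double &v : u) v = uniform(gen);
    if (duplicate_term) u[3] = u[1];  // 2nd and 4th terms become identical
    double s = -6.0;                  // subtract the mean of the sum
    for (double v : u) s += v;
    sum += s;
    sum_sq += s * s;
  }
  const double mean = sum / static_cast<double>(num_samples);
  return sum_sq / static_cast<double>(num_samples) - mean * mean;
}

// Power ratio expressed in decibels.
double VarianceToDecibel(double variance) {
  return 10.0 * std::log10(variance);
}
```

With a few hundred thousand samples, the independent case should come out close to 1 and the duplicated case close to 14/12. `VarianceToDecibel(14.0 / 12.0)` is about 0.67, which matches the 0.67 dB gain mentioned in the discussion.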
The bug has been fixed by putting back the initial iteration on w. In addition, all occurrences of `unsigned int` have been replaced by `uint32_t` for better portability.