Closed renerocksai closed 4 years ago
Sorry, I can't check the program until Thursday. I will look into this over the weekend. Since it is a little strange that nan/-nan values are observed, I'd like to find the cause.
It might help to know that I used the analysis example tool from Merlin. Feeding it a 50s WAV of synthesized speech containing "stuttering" caused analysis to dump core.
Further analysis (no pun intended) revealed that, with the current analysis tool from examples/analysis_synthesis, no core is dumped with or without isnan().
However, whenever I run the current analysis tool on the bespoke WAV, SPTK's x2x tool complains when converting the .lf0 file:
x2x : error: input data is over the range of type 'float'!
I don't know what to make of that, but one could argue that this is a Merlin-specific problem, too, since using analysis in combination with x2x is part of Merlin's scripts: extract_features_for_merlin.sh.
So, potentially, -nan values will not occur with the current version; however, that "over the range of type 'float'" error raises suspicion.
I conclude that the isnan() check belongs in Merlin's repository, which contains an old copy of WORLD.
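For illustration, here is a minimal sketch of the kind of isnan() guard discussed above. The helper name SanitizeF0Contour is hypothetical and is not part of WORLD or Merlin, and the choice of 0.0 as the replacement value is an assumption (it is used here to mean "unvoiced"). std::isfinite is used so that infinities are rejected along with nan/-nan.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper (not part of WORLD or Merlin): replaces every
// non-finite F0 value (nan, -nan, +/-inf) with 0.0 and returns how many
// values were replaced. Treating 0.0 as "unvoiced" is an assumption.
inline int SanitizeF0Contour(std::vector<double>* f0) {
  int replaced = 0;
  for (double& value : *f0) {
    if (!std::isfinite(value)) {
      value = 0.0;
      ++replaced;
    }
  }
  return replaced;
}
```

A non-zero return value would also be a cheap way to log that something upstream produced invalid candidates, rather than silently hiding it.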
Thank you for your information. I think that the duration of the waveform does not cause the error.
We can observe the error in x2x when the absolute amplitude is over FLT_MAX. However, if this is the cause of the error, we should check the amplitude of the waveform before analyzing it. The latest version has another error checker, so the isnan() check seems to be redundant. If you have an example .wav file that causes the error, please share it with me. I'll check it and debug the program with your idea if needed.
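A pre-analysis range check along these lines could be sketched as follows. The function name FirstOutOfFloatRange is hypothetical, and treating non-finite values as out of range is an assumption of this sketch, not a description of what x2x does internally.

```cpp
#include <cfloat>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the index of the first value that cannot be
// stored safely as a float (non-finite, or magnitude above FLT_MAX),
// or -1 if every value is in range.
inline long FirstOutOfFloatRange(const std::vector<double>& samples) {
  for (std::size_t i = 0; i < samples.size(); ++i) {
    if (!std::isfinite(samples[i]) || std::fabs(samples[i]) > FLT_MAX) {
      return static_cast<long>(i);
    }
  }
  return -1;
}
```

Running such a check on the waveform before analysis, and again on the extracted values before they are converted to float, would show at which stage an out-of-range value first appears.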
Thanks for looking into this and for the additional information. I didn't mean the duration per se; I just wanted to give an indication of some properties of the waveform. More precisely, it is one German sentence from a children's book, rendered as synthesized speech, where the last part of the sentence unintentionally gets repeated many times until the file is 50s long. Just from listening to it, I haven't noticed anything odd about the amplitude.
I have uploaded the file, you can download it from here: 0013.zip
However, plotting the waveform revealed (I used TwistedWave Online Audio Editor since I am on a chromebook) that it is maxed out at 7397.xx ms:
This suggests that the amplitude exceeds FLT_MAX, and hence an amplitude check would be a good idea.
Thank you for your information and the attached example.
I used the test program test.cpp in the latest version, but I could not reproduce the error. If you changed any options, please tell me the settings you used so that I can trace the error.
Thus rejecting nan/-nan tentative_f0 values. Got these on large (50s) WAVs of synthesized speech.