vaxt closed this issue 5 years ago
Vocoder-based systems in general, including WORLD, cannot analyze a singing voice that is mixed with music. They require clean singing without accompaniment or noise.
Thanks for replying. The output for audio without music is better, but even a clean voice sometimes gives a bad result. Since the pitch is high, I raised f0_ceil in Harvest to 1100, which makes the output a little better. Is there anything else I can do to improve the result? I have uploaded a new sample file below; I think it should be clean enough.
new sample file: wa.zip
I confirmed that the F0 contour cannot be estimated from the speech. You can solve this problem by using Dio() and StoneMask() with appropriate options: in Dio(), please set f0_ceil to 1200 and f0_floor to 400.
Harvest() cannot solve this problem because the speech has low periodicity. Perhaps part of a voiced section has been wrongly identified as unvoiced. The parameters in Harvest() are difficult to tune, so I recommend using Dio() instead of Harvest().
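For reference, the Dio() + StoneMask() suggestion could look roughly like the sketch below when using the pyworld Python wrapper. This is an assumption, since the thread does not say which binding is in use. The helper names `dio_kwargs` and `analyze_synthesize` are made up for illustration, and the 5 ms frame period is simply pyworld's default, not something recommended above.

```python
# F0 search range suggested in this thread for the high-pitched sample.
F0_FLOOR_HZ = 400.0
F0_CEIL_HZ = 1200.0


def dio_kwargs(f0_floor=F0_FLOOR_HZ, f0_ceil=F0_CEIL_HZ, frame_period=5.0):
    """Build keyword arguments for pyworld.dio (hypothetical helper)."""
    if not 0 < f0_floor < f0_ceil:
        raise ValueError("need 0 < f0_floor < f0_ceil")
    return {
        "f0_floor": float(f0_floor),
        "f0_ceil": float(f0_ceil),
        "frame_period": float(frame_period),  # in milliseconds
    }


def analyze_synthesize(x, fs, **overrides):
    """Estimate F0 with Dio + StoneMask, then resynthesize with WORLD.

    x is a mono signal and fs its sampling rate in Hz.
    numpy and pyworld are imported lazily so that the helper above
    can be used without them installed.
    """
    import numpy as np
    import pyworld as pw

    # pyworld expects a C-contiguous float64 array.
    x = np.ascontiguousarray(x, dtype=np.float64)
    opts = dio_kwargs(**overrides)

    f0_raw, t = pw.dio(x, fs, **opts)    # coarse F0 contour
    f0 = pw.stonemask(x, f0_raw, t, fs)  # refine the Dio estimate
    sp = pw.cheaptrick(x, f0, t, fs)     # spectral envelope
    ap = pw.d4c(x, f0, t, fs)            # aperiodicity
    y = pw.synthesize(f0, sp, ap, fs, opts["frame_period"])
    return f0, y
```

To try it, load the clean mono recording with any WAV reader and call `analyze_synthesize(x, fs)`. If voiced frames still come out as unvoiced (F0 equal to 0), the floor and ceiling can be adjusted through the keyword overrides.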
I am trying to process a high-pitched audio file, such as a song, but the output has too much noise. The problem seems to be in the F0 calculation or somewhere related.
Any help will be appreciated.
sample file: hh_high.zip