mmorise / World

A high-quality speech analysis, manipulation and synthesis system
http://www.kisc.meiji.ac.jp/~mmorise/world/english

Too much noise when processing high-pitched audio #83

Closed by vaxt 5 years ago

vaxt commented 5 years ago

I am trying to process a high-pitched audio file, such as a song, but the output has too much noise. The problem seems to be in the F0 calculation or somewhere related.

Any help will be appreciated.

sample file: hh_high.zip

mmorise commented 5 years ago

General vocoder-based systems, including WORLD, cannot analyze a singing voice mixed with music. They require clean singing without music or noise.

vaxt commented 5 years ago

Thanks for replying. The output for audio without music is better, but even a clean voice sometimes gives a bad result. Since the pitch is high, I raised f0_ceil in Harvest to 1100, which makes the output a little better. Is there anything else I can do to improve the result? I have uploaded a new sample file below; I think it should be clean enough.

new sample file: wa.zip

mmorise commented 5 years ago

I confirmed the problem: the F0 contour cannot be estimated from this speech. You can solve it by using Dio() and StoneMask() with appropriate options. In Dio(), please set f0_ceil to 1200 and f0_floor to 400.
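
A minimal sketch of this suggestion, assuming the Python wrapper `pyworld` is used (the thread does not say which binding is in use; the C++ API sets the same fields on `DioOption`). The function name `estimate_f0_with_dio` is hypothetical, and `x` is assumed to be a mono waveform stored as a 1-D float64 NumPy array.

```python
def estimate_f0_with_dio(x, fs, f0_floor=400.0, f0_ceil=1200.0, frame_period=5.0):
    """Estimate F0 with Dio, then refine it with StoneMask.

    x  : mono waveform as a 1-D float64 NumPy array (assumed input format)
    fs : sampling rate in Hz
    """
    # Imported lazily so this helper can be defined without pyworld installed.
    import pyworld as pw

    # Coarse F0 estimate, restricted to the search range suggested above.
    f0, t = pw.dio(
        x, fs, f0_floor=f0_floor, f0_ceil=f0_ceil, frame_period=frame_period
    )
    # StoneMask refines the coarse Dio estimate at the same frame positions.
    refined_f0 = pw.stonemask(x, f0, t, fs)
    return refined_f0, t
```

The returned contour and frame times can then be passed to the remaining analysis steps, for example `pw.cheaptrick(x, refined_f0, t, fs)` and `pw.d4c(x, refined_f0, t, fs)`.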

Harvest() cannot solve this problem because the speech has low periodicity. Perhaps part of the voiced section has been wrongly identified as unvoiced. The parameters of Harvest() are difficult to tune, so I recommend using Dio() instead of Harvest().
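
One way to check whether voiced frames are being marked as unvoiced is to inspect the estimated contour directly, since WORLD's F0 estimators report unvoiced frames as F0 = 0. The sketch below uses only the standard library; the helper names `voiced_ratio` and `unvoiced_runs` are hypothetical, and the 5 ms default frame period is an assumption that should match the value used during analysis.

```python
def voiced_ratio(f0):
    """Return the fraction of frames with non-zero F0 (treated as voiced)."""
    frames = list(f0)
    if not frames:
        return 0.0
    voiced = sum(1 for value in frames if value > 0.0)
    return voiced / len(frames)


def unvoiced_runs(f0, frame_period_ms=5.0):
    """Return (start_ms, end_ms) spans of consecutive frames with F0 == 0."""
    frames = list(f0)
    runs = []
    start = None
    for i, value in enumerate(frames):
        if value <= 0.0 and start is None:
            start = i  # an unvoiced run begins at this frame
        elif value > 0.0 and start is not None:
            runs.append((start * frame_period_ms, i * frame_period_ms))
            start = None
    if start is not None:
        # The contour ends while still inside an unvoiced run.
        runs.append((start * frame_period_ms, len(frames) * frame_period_ms))
    return runs
```

Comparing these values for the Harvest() and Dio() contours of the same file shows where one estimator drops frames that the other keeps; gaps in the middle of a sustained note are the likely misclassifications.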