Some questions about the subjective evaluation (MOS chart)

kan-bayashi / PytorchWaveNetVocoder

WaveNet-Vocoder implementation with pytorch.

Apache License 2.0


Hi @unilight. Thank you for your question! You can listen to the samples here. As you will probably hear, the wnv samples are almost the same as raw speech. In the subjective evaluation, the STRAIGHT samples sound clearly different from the wnv samples, so subjects tend to give them low scores. Furthermore, because we want to compare performance as a vocoder, the feature extraction settings are the same for both STRAIGHT and the WaveNet vocoder (5 ms shift, 24th-order mcep). This degrades the performance of STRAIGHT. If we used a shorter shift and the full spectrum for STRAIGHT, its performance would improve.
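To make the point about the shared feature settings concrete, here is a rough sketch of how much spectral detail each frame carries under the two setups. The 5 ms shift and 24th-order mcep come from the comment above. The 16 kHz sampling rate and the 1024-point FFT are assumptions chosen only for illustration, and the helper names are hypothetical, not part of this repository.

```python
# Rough sketch: per-frame spectral parameter count for mcep vs. full spectrum.
# ASSUMPTIONS (not from the thread): 16 kHz sampling rate, 1024-point FFT.


def frame_shift_samples(sample_rate_hz: int, shift_ms: float) -> int:
    """Convert a frame shift in milliseconds to a hop size in samples."""
    return int(round(sample_rate_hz * shift_ms / 1000.0))


def frames_per_second(shift_ms: float) -> float:
    """Number of analysis frames produced per second of audio."""
    return 1000.0 / shift_ms


def mcep_dim(order: int) -> int:
    """An order-M mel-cepstrum has M + 1 coefficients (c0 through cM)."""
    return order + 1


def full_spectrum_dim(fft_size: int) -> int:
    """A one-sided spectral envelope has fft_size // 2 + 1 bins."""
    return fft_size // 2 + 1


if __name__ == "__main__":
    sample_rate_hz = 16000  # assumed, not stated in the thread
    fft_size = 1024         # assumed, not stated in the thread
    shift_ms = 5.0          # from the comment above
    order = 24              # from the comment above

    print("hop size (samples):", frame_shift_samples(sample_rate_hz, shift_ms))
    print("frames per second:", frames_per_second(shift_ms))
    print("mcep values per frame:", mcep_dim(order))
    print("full-spectrum bins per frame:", full_spectrum_dim(fft_size))
```

Under these assumptions, the mcep representation keeps far fewer values per frame than the full spectral envelope, which is consistent with the explanation that STRAIGHT is handicapped by the shared low-dimensional setting.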

Some questions about the subjective evaluation (MOS chart) #38