r9y9 / nnmnkwii

Library to build speech synthesis systems designed for easy and fast prototyping.
https://r9y9.github.io/nnmnkwii/latest/

Bug in parameter generation #95

Closed hyama5 closed 4 years ago

hyama5 commented 4 years ago

Hello, I found a bug in paramgen.mlpg. Specifically, the generated parameters at the beginning and end of an utterance become small even when the static mean is large. The following Google Colab notebook shows an example of the strange MLPG behavior: https://colab.research.google.com/drive/1C5TzPjaDRwDKuOV_XmeCmMAnX_QxEH3P

This might be caused by using the distributions of the dynamic features at the first (t=0) and final (t=T-1) frames in MLPG, even though these distributions cannot be defined without the values of frames t=-1 and t=T. Merlin works around this problem by assigning a very large value (100000000000) to the variance of the first and final frames: https://github.com/CSTR-Edinburgh/merlin/blob/master/src/frontend/mlpg_fast.py
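To make the boundary effect concrete, here is a minimal NumPy sketch of one-dimensional MLPG, solving (WᵀPW)c = WᵀPμ with dense matrices. It is not the nnmnkwii or Merlin implementation: the function names, the `boundary_variance` parameter, the window coefficients, and the choice to inflate only the dynamic-feature variances at t=0 and t=T-1 are all assumptions made for illustration.

```python
import numpy as np


def build_window_matrix(T, windows):
    """Stack one (T x T) block per window into a (K*T, T) matrix.

    Frames outside [0, T-1] are implicitly treated as zero, which is
    what makes the dynamic features ill-defined at the boundaries.
    """
    blocks = []
    for win in windows:
        half = len(win) // 2
        block = np.zeros((T, T))
        for t in range(T):
            for k, coef in enumerate(win):
                tau = t + k - half
                if 0 <= tau < T:
                    block[t, tau] = coef
        blocks.append(block)
    return np.vstack(blocks)


def mlpg_1d(means, variances, windows, boundary_variance=None):
    """Hypothetical 1-D MLPG. means/variances: (T, K), column 0 is static.

    If boundary_variance is given, the variances of the dynamic features
    at the first and last frames are replaced by it (illustrative
    workaround, assumed here to mirror the idea described above).
    """
    means = np.asarray(means, dtype=float)
    variances = np.array(variances, dtype=float)  # copy, input is not modified
    if boundary_variance is not None:
        variances[0, 1:] = boundary_variance
        variances[-1, 1:] = boundary_variance
    W = build_window_matrix(means.shape[0], windows)
    mu = means.T.reshape(-1)               # window-major, matches the vstack order
    prec = 1.0 / variances.T.reshape(-1)   # diagonal of the precision matrix P
    WtP = W.T * prec                       # W^T P without forming P explicitly
    return np.linalg.solve(WtP @ W, WtP @ mu)


# Toy input: constant static mean of 5.0, zero-mean dynamic features.
T = 10
windows = [[1.0], [-0.5, 0.0, 0.5], [1.0, -2.0, 1.0]]
means = np.zeros((T, 3))
means[:, 0] = 5.0
variances = np.ones((T, 3))

naive = mlpg_1d(means, variances, windows)
fixed = mlpg_1d(means, variances, windows, boundary_variance=1e11)
print(naive)
print(fixed)
```

In this toy setup, `naive` should drift away from the static mean near the edges, while `fixed` should stay essentially flat, because the undefined boundary constraints no longer carry any meaningful weight.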

If you don't mind, I'll make a PR to fix this issue.

r9y9 commented 4 years ago

Thank you for the detailed report! Sure, I’d appreciate it if you make a PR for fixing the issue.

hyama5 commented 4 years ago

I fixed MLPG by changing the precision of the first and last frames. The code assumes that the first window is the static-feature window. I'm not sure whether it works with MGE training and other modules.
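The precision-based approach described above could look roughly like the following sketch. It is not the code from the PR: the helper name `boundary_masked_precisions`, the `static_dim` parameter, and the feature layout (static dimensions first, followed by the dynamic-window dimensions) are assumptions for illustration.

```python
import numpy as np


def boundary_masked_precisions(variances, static_dim):
    """Return per-frame precisions with boundary dynamic features disabled.

    variances: (T, static_dim * num_windows), assumed to be laid out as
    [static | delta | delta-delta | ...]. The static window is assumed
    to come first, so only columns from static_dim onward are masked.
    """
    precisions = 1.0 / np.asarray(variances, dtype=float)  # new array
    # A zero precision removes the constraint from the MLPG objective.
    precisions[[0, -1], static_dim:] = 0.0
    return precisions


# Toy example: T=5 frames, static_dim=2, three windows (6 columns).
variances = np.full((5, 6), 0.5)
precisions = boundary_masked_precisions(variances, static_dim=2)
```

Setting the precision to zero has the same intent as assigning a very large variance, but avoids picking an arbitrary large constant. It only makes sense if the static window really is the first one, which is the assumption mentioned in the comment.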

r9y9 commented 4 years ago

fixed by #96

r9y9 commented 4 years ago

I want to include the fixes in a release as soon as possible, so I'm going to make a release after #98 is merged.