r9y9 / pysptk

A python wrapper for Speech Signal Processing Toolkit (SPTK).
http://pysptk.readthedocs.io/en/latest/

pysptk.mgc2sp vs sptk mgc2sp: different results? #57

Closed cveaux closed 5 years ago

cveaux commented 6 years ago

Hi, this is something of a continuation of the issue I raised with pyworld. I built a vocoder module based on pyworld / pysptk and noticed that the resynthesised speech (simple copy synthesis) sounded as if it had been low-pass filtered (in addition to some artefacts introduced by pyworld). I traced this to the conversion of the mcep coefficients back to the spectral envelope. Comparing sptk and pysptk, I found that the spectral envelope resynthesised by pysptk.mgc2sp is lower in the high frequencies than both the original spectral envelope and the envelope resynthesised with sptk. Here's the comparison:

[Figure: demo_mcep_encoding]

r9y9 commented 6 years ago

Hi, sorry for the late reply. I added a regression test (#58) and it seems OK to me. What about

sp_resyn_pysptk = np.exp(log_sp_resyn_pysptk * 2)

instead of

sp_resyn_pysptk = np.power(10, log_sp_resyn_pysptk)

?
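To make the difference between the two conversions above concrete, here is a minimal sketch using only the standard library. It assumes (consistent with the suggested formula, but not verified against the pysptk source here) that the real part of the mgc2sp output is a natural-log amplitude spectrum, so the power spectrum is exp(2 * x). The envelope values and the helper names `power_from_log_amplitude` and `power_from_pow10` are hypothetical and exist only for illustration; with NumPy arrays the same two expressions would be `np.exp(x * 2)` and `np.power(10, x)`.

```python
import math


def power_from_log_amplitude(log_amp):
    """Convert natural-log amplitude values to power: |H|^2 = exp(2 * ln|H|)."""
    return [math.exp(2.0 * v) for v in log_amp]


def power_from_pow10(log_amp):
    """The pow10 conversion from the original report, kept for comparison."""
    return [10.0 ** v for v in log_amp]


# Hypothetical power spectral envelope that decays toward high frequencies.
true_power = [1.0, 1e-1, 1e-2, 1e-4, 1e-6]

# Stand-in for the real part of the mgc2sp output.
# Assumption: it holds natural-log amplitude, i.e. 0.5 * ln(power).
log_amp = [0.5 * math.log(p) for p in true_power]

recovered = power_from_log_amplitude(log_amp)
mismatched = power_from_pow10(log_amp)

for p, r, m in zip(true_power, recovered, mismatched):
    print(f"true={p:.3e}  exp(2x)={r:.3e}  10**x={m:.3e}")
```

Under that assumption, 10**x equals exp(ln(10) * x), and since ln(10) is roughly 2.30 rather than 2, every bin with power below 1 comes out lower than it should. The effect grows as the energy drops, which would be consistent with an envelope that looks attenuated in the high frequencies.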

Side note: please post copy-and-pastable code, not a screenshot. It's hard to debug otherwise :(

cveaux commented 6 years ago

Hi, thanks for your reply. No worries about the delay; we all have our time constraints, and I'm late in replying too.

Indeed, the difference I got is almost certainly due to the pow10 used to convert the output of mgc2sp. I had a hard time figuring out what value the pysptk version of mgc2sp returns, as it's not the same as the standard sptk version. The pow10 did look very suspicious to me, but it seemed to give the best match to the spectral envelope. The conversion formula that you're using makes more sense.

Sorry for the screenshots; I should have posted only the figures and pasted the code.

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.