why -4

zceng / LVCNet

LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation

Apache License 2.0

79 stars 16 forks

Closed hdmjdp closed 3 years ago

hdmjdp commented 3 years ago

if I pad 2

zceng commented 3 years ago

It comes from the length relationship between the input waveform and the mel-spectrum.

As shown above, the length of the input waveform (audio) equals the length of the mel-spectrum minus 4, multiplied by hop_length: audio length = (mel length - 4) * hop_length.
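The relation can be sketched as a small check. This is only an illustration, not code from the LVCNet repository: the function name `expected_audio_length` and the parameter `context_frames` are hypothetical, and it assumes the 4 comes from padding 2 extra mel frames on each side of the conditioning window.

```python
def expected_audio_length(cond_length: int, hop_length: int,
                          context_frames: int = 2) -> int:
    """Return the number of waveform samples that match a mel-spectrum.

    cond_length    -- number of mel frames, including any context padding
    hop_length     -- waveform samples per mel frame
    context_frames -- extra frames on EACH side (assumption: 2, giving the "-4")
    """
    padded = 2 * context_frames  # total extra frames, left plus right
    if cond_length <= padded:
        raise ValueError("mel-spectrum is shorter than its context padding")
    return (cond_length - padded) * hop_length


# 104 frames with 2 context frames per side: (104 - 4) * 256 samples
print(expected_audio_length(104, 256))
# No context padding in preprocessing: nothing to subtract
print(expected_audio_length(100, 256, context_frames=0))
```

Under that assumption, a pipeline that does not pad the mel-spectrum would use `context_frames=0`, and the subtraction disappears.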

zceng commented 3 years ago

A similar process can be found in Parallel WaveGAN.

hdmjdp commented 3 years ago

OK. In my data processing I did not subtract 4, so I think my version does not need the "(cond_length - 4)".