swasun / VQ-VAE-Speech

PyTorch implementation of VQ-VAE + WaveNet by [Chorowski et al., 2019] and VQ-VAE on speech signals by [van den Oord et al., 2017]
MIT License

VQ-VAE-encoder + WaveNet decoder usage #5

Closed (shiva1393 closed 5 years ago)

shiva1393 commented 5 years ago

In the README I didn't find any steps for using the VQ-VAE encoder + WaveNet decoder. Can anyone please help me proceed further?

roberthoenig commented 5 years ago

From what I can tell, the VQ-VAE-encoder + WaveNet decoder setup is a work in progress. Looking at the code, the WaveNet decoder itself seems to be already implemented, but not yet hooked into a PipelineFactory for training. The last activity in this repository was two months ago, so it might be unmaintained. Perhaps @swasun can shed some light?

swasun commented 5 years ago

I had to move on to other projects and this one was stopped prematurely. Unfortunately I don't have the time to update it for now, though it's almost done. I may give it a shot later.

roberthoenig commented 5 years ago

@swasun I need a working implementation of this model with a WaveNet decoder, so I'd be happy to give the remaining implementation a shot. Is there anything else to be done, other than updating the pipeline factory? Also, has the WaveNet decoder been tested yet?

swasun commented 5 years ago

@roberthoenig The WaveNet decoder code comes from https://github.com/r9y9/wavenet_vocoder, which seems to be a good implementation. I didn't test it in my own repository, because I worked on another private repository where the WaveNet was already implemented, and I left this one open in case someone needs it.

shiva1393 commented 5 years ago

Hi @swasun, I combined WaveNet and the VQ-VAE (WaveNet from r9y9/wavenet_vocoder). However, the VQ loss goes toward zero within a few steps and only a few indices of the embedding table are activated (for k=128, only around 30 indices are used), so the WaveNet loss is not decreasing. Do you have any suggestions for overcoming this problem? Here are my loss functions:

    e_latent_loss = torch.mean((quantized.detach() - inputs)**2)
    q_latent_loss = torch.mean((quantized - inputs.detach())**2)
    commitment_loss = self._commitment_cost * e_latent_loss
    vq_loss = q_latent_loss + commitment_loss
    wavenet_loss = criterion(y_hat[:, :, :-1], y[:, 1:, :], mask=mask)
    loss2 = vq_loss + wavenet_loss
    loss2.backward()
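One way to put a number on the "around 30 of 128 indices" observation is to log codebook usage and perplexity per batch. The following is a minimal sketch, assuming the quantizer exposes the integer code indices it selected; `codebook_usage` is a hypothetical helper written for illustration and is not part of this repository or of wavenet_vocoder. It uses only the Python standard library, so indices held in a tensor would first need converting to a flat list (for example with `.flatten().tolist()`).

```python
import math
from collections import Counter


def codebook_usage(indices, num_embeddings):
    """Return (distinct codes used, perplexity) for a flat list of code indices.

    Perplexity is exp(entropy) of the empirical code distribution. It equals
    num_embeddings when all codes are used equally often and drops toward 1
    as the quantizer collapses onto a few codes.
    """
    if not indices:
        raise ValueError("indices must be non-empty")
    counts = Counter(indices)
    if any(i < 0 or i >= num_embeddings for i in counts):
        raise ValueError("index outside the range [0, num_embeddings)")
    total = len(indices)
    # Shannon entropy (in nats) of the empirical distribution over codes.
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return len(counts), math.exp(entropy)


# Illustrative data only: one batch using all 128 codes evenly,
# and one batch that only ever selects 30 of them.
healthy = list(range(128)) * 4
collapsed = [i % 30 for i in range(512)]

used_h, ppl_h = codebook_usage(healthy, 128)
used_c, ppl_c = codebook_usage(collapsed, 128)
print(f"healthy:   {used_h} codes used, perplexity {ppl_h:.1f}")
print(f"collapsed: {used_c} codes used, perplexity {ppl_c:.1f}")
```

Tracking these two values alongside `vq_loss` and `wavenet_loss` would show whether usage shrinks at the same time the VQ loss approaches zero, which is the pattern described above.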

swasun commented 5 years ago

@shiva1393, @HenryZhou7 Hello guys. Sorry, I haven't had time to check on that yet; I have too much to do in my new position. Here's a repo where someone is actively working on that: https://github.com/hrbigelow/ae-wavenet