-
First of all, thanks for the implementation!
My question is about proper masking inside the model.
1. shift_down and shift_right at the beginning of the PixelSnail module have already taken care of masking…
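For context, here is a minimal sketch of what I understand those shift ops to do (the standard PixelCNN-style shifts; the exact padding/cropping details are my assumption, not copied from this repo):

```
import torch.nn.functional as F

def shift_down(x):
    # pad one row of zeros at the top and drop the bottom row,
    # so each output row depends only on rows strictly above it
    return F.pad(x, [0, 0, 1, 0])[:, :, :-1, :]

def shift_right(x):
    # pad one column of zeros on the left and drop the rightmost column,
    # so each output pixel depends only on pixels strictly to its left
    return F.pad(x, [1, 0, 0, 0])[:, :, :, :-1]
```

Combined with causally masked first convolutions, these shifts are what keep each pixel from seeing itself or future pixels.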
-
Hi,
Cool project!
When I was trying VQ-VAE, I found that using a moving average, as described in the appendix, trained a lot faster and gave better results! There is a [zalandoresearch](https://gi…
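For anyone curious, a rough sketch of the EMA codebook update from the paper's appendix (variable names are mine, not from either repo):

```
import torch

@torch.no_grad()
def ema_codebook_update(embed, cluster_size, embed_avg, flat_z, onehot,
                        decay=0.99, eps=1e-5):
    # flat_z: (N, D) encoder outputs; onehot: (N, K) nearest-code assignments
    # update running counts and running sums per code
    cluster_size.mul_(decay).add_(onehot.sum(0), alpha=1 - decay)
    embed_avg.mul_(decay).add_(onehot.t() @ flat_z, alpha=1 - decay)
    # Laplace smoothing keeps rarely-used codes from dividing by ~zero
    n = cluster_size.sum()
    smoothed = (cluster_size + eps) / (n + cluster_size.numel() * eps) * n
    embed.copy_(embed_avg / smoothed.unsqueeze(1))  # embed: (K, D)
```

The nice property is that the codebook no longer needs a gradient-based loss term of its own, which is likely why it converges faster.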
-
Hello, thank you very much for your code and videos!
I'm using this code repository to train on the flowers dataset with a batch size of 32 for 200 epochs, but the reconstructed images still only hav…
-
Thank you for your work.
Which feature is used as the input?
-
I am training from scratch on the full FFHQ dataset (70k images), with the `base learning rate` set to 1.0e-06, and I use the `scale_lr=True` parameter.
1. But the training process seems very unstable and oscillates a lot. With the s…
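For reference, my understanding of how `scale_lr` derives the effective learning rate, based on the usual Lightning-style setup (the exact formula in this repo may differ):

```
# with scale_lr=True the base LR is typically scaled by the total batch size
base_lr = 1.0e-06
ngpu, batch_size, accumulate_grad_batches = 1, 4, 1  # my values, for illustration
learning_rate = accumulate_grad_batches * ngpu * batch_size * base_lr
# e.g. 1 * 1 * 4 * 1e-6 = 4e-6
```

So the LR actually used can be much larger than the configured base value, which might be related to the oscillation.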
-
Hello! May I ask whether you used a perceptual loss similar to VQ-16's, or just a reconstruction loss and KL divergence, when training the KL-VAE?
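In other words, I'd like to know whether the objective looked roughly like this (a sketch; the function name, weights, and the use of LPIPS are my assumptions):

```
import torch
import lpips  # perceptual metric; whether it was used is exactly my question

def klvae_loss(x, x_rec, posterior_kl, perc_net,
               perceptual_weight=1.0, kl_weight=1e-6):
    # perc_net = lpips.LPIPS(net='vgg')
    rec_loss = torch.abs(x - x_rec).mean()            # L1 reconstruction
    p_loss = perceptual_weight * perc_net(x, x_rec).mean()
    kl_loss = kl_weight * posterior_kl.mean()         # KL to the unit Gaussian
    return rec_loss + p_loss + kl_loss
```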
-
It would be great if `ContinuousTransformerWrapper` supported `return_mems` in the forward pass.
Thank you for the awesome repo!
Remarkably, it all works with `torch.onnx.export()`!
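For context, the API I'm imagining, mirroring how `TransformerWrapper` already exposes `return_mems` (the last line is the requested behavior, not the current one):

```
import torch
from x_transformers import ContinuousTransformerWrapper, Decoder

model = ContinuousTransformerWrapper(
    dim_in=32,
    dim_out=32,
    max_seq_len=1024,
    attn_layers=Decoder(dim=512, depth=6, heads=8),
)

x = torch.randn(1, 256, 32)
out = model(x)                            # works today
# out, mems = model(x, return_mems=True)  # requested: also return layer memories
```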
-
I tried an A100 (40 GB SXM4) with 30 vCPUs, 200 GiB RAM, and a 512 GiB SSD, but immediately hit CUDA out of memory.
Which card / config should I use? 8x A100 80GB? 1x H100 80GB? 8x H100 80GB?
torch.cuda.OutOfMe…
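In case it matters, here is roughly what I have already tried to cut memory before renting a bigger card (standard PyTorch knobs, nothing repo-specific; `model`, `optimizer`, and `loader` stand in for the repo's training script):

```
import torch

# smaller per-step batches + gradient accumulation instead of one big batch
accum_steps = 8

# mixed precision roughly halves activation memory
scaler = torch.cuda.amp.GradScaler()
for step, (x, y) in enumerate(loader):       # loader: assumed training DataLoader
    with torch.cuda.amp.autocast():
        loss = model(x, y) / accum_steps     # model/loss call is illustrative
    scaler.scale(loss).backward()
    if (step + 1) % accum_steps == 0:
        scaler.step(optimizer)
        scaler.update()
        optimizer.zero_grad(set_to_none=True)
```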
-
The paper emphasizes joint training of the sparse encoder and the dense VQ-VAE to optimize the codebook and improve generalization, but joint training has not been done in this code, right? Is there any reason…
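To make the question concrete, by "joint training" I mean something like a single backward pass through both modules (a sketch with made-up names, not code from this repo):

```
import torch

# one optimizer over both modules, so the codebook also receives
# gradients that flow back from the sparse encoder's output
params = list(sparse_encoder.parameters()) + list(vqvae.parameters())
optimizer = torch.optim.Adam(params, lr=1e-4)

z = sparse_encoder(x)                 # sparse_encoder, vqvae: hypothetical names
x_rec, vq_loss = vqvae(z)
loss = torch.nn.functional.mse_loss(x_rec, x) + vq_loss
optimizer.zero_grad()
loss.backward()
optimizer.step()
```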
-
Is there any detailed information about all the parameters in the config files and how they affect the audio?
```
conf/mlfb_vqvae.yml
```
I left everything at the defaults and trained 2…