-
Hi @vvvm23,
I saw your code today; it is novel and inspiring, and I benefited a lot from it.
I'm sorry to bother you, but could you upload the .pt checkpoint you trained with PixelSnail?
Since I want to…
-
In the original code from taming-transformers and latent diffusion models, the weight for the discriminator's adversarial loss is defined by the ratio between the gradient norms of the nll_loss and the g_loss (htt…
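For context, that adaptive weight can be sketched as below. This is a minimal PyTorch sketch following the gradient-ratio formula described above; the helper name, the toy model, and the clamp bounds are illustrative assumptions, not the exact repo code:

```python
import torch
import torch.nn as nn

def calculate_adaptive_weight(nll_loss, g_loss, last_layer_weight, eps=1e-4):
    # Gradient of each loss w.r.t. the decoder's last-layer weight
    nll_grads = torch.autograd.grad(nll_loss, last_layer_weight, retain_graph=True)[0]
    g_grads = torch.autograd.grad(g_loss, last_layer_weight, retain_graph=True)[0]
    # Weight = ratio of gradient norms, clamped and detached from the graph
    d_weight = torch.norm(nll_grads) / (torch.norm(g_grads) + eps)
    return torch.clamp(d_weight, 0.0, 1e4).detach()

# Toy usage with a stand-in "last layer" (illustrative only)
layer = nn.Linear(4, 4)
out = layer(torch.randn(2, 4))
nll = out.pow(2).mean()   # stand-in reconstruction loss
g = out.abs().mean()      # stand-in generator/adversarial loss
w = calculate_adaptive_weight(nll, g, layer.weight)
```

The detach matters: the weight scales the adversarial term but should not itself receive gradients.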
-
Hi @krennic999
Thank you for your excellent work!
I am very interested in the high-quality and detailed human face images you demonstrated in your paper, like the ones below:
In this [iss…
-
I downloaded some FFHQ images from
https://drive.google.com/drive/folders/1tZUcXDBeOibC6jcMCtgRRz67pzrAHeHL
Then, I ran
python /people/kimd999/script/python/cryoEM/vq-vae-2-pytorch/train_vqvae.py…
kimdn updated
4 years ago
-
Thank you for your work.
Which feature is used as the input?
-
- https://arxiv.org/abs/1711.00937
- 2017
Learning useful representations without supervision remains a key challenge in machine learning.
In this paper, we propose a simple yet powerful generative model that learns such discrete representations.
Our model, the VQ-VAE (Vector Quantised Variational AutoEncoder), differs from VAEs in two key ways…
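The quantisation step at the heart of the abstract can be sketched as follows; a minimal PyTorch sketch under the usual VQ-VAE formulation (nearest codebook entry plus a straight-through estimator for the encoder gradient), with illustrative names:

```python
import torch

def vector_quantize(z, codebook):
    # z: (N, D) encoder outputs; codebook: (K, D) learned embeddings
    d = torch.cdist(z, codebook)      # (N, K) pairwise distances
    idx = d.argmin(dim=1)             # nearest codebook entry per vector
    z_q = codebook[idx]               # (N, D) quantised vectors
    # Straight-through estimator: pass decoder gradients to the encoder
    z_q = z + (z_q - z).detach()
    return z_q, idx

# Toy usage: two codes, two inputs near each code
codebook = torch.tensor([[0.0, 0.0], [1.0, 1.0]])
z = torch.tensor([[0.1, 0.1], [0.9, 0.9]])
z_q, idx = vector_quantize(z, codebook)
```

The commitment and codebook losses from the paper would be added on top of this forward pass.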
e4exp updated
3 years ago
-
The paper mentions a codebook size of 4096 for all models with 128/64/32 tokens for 256x256 and 128/64 tokens for 512x512.
I was wondering why the example configuration in `README.md` and `titok.py` …
-
Really cool results on [the project website](https://wilson1yan.github.io/videogpt/index.html)! I wanted to give it a shot, and in this repo's readme it says:
> the VideoGPT model can be sampled us…
-
Excellent work!
Sorry, I'm not sure I understand this clearly. Since we trained a VAE model to map speech into discrete codes, we then train a decoder-only autoregressive transformer with text-pr…
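If I follow the setup, generation would then be autoregressive decoding of speech-code tokens conditioned on a prompt prefix. A minimal sketch, assuming a decoder-only model that maps token ids to next-token logits; greedy decoding is used here for simplicity (a real system likely samples), and all names are illustrative:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def greedy_decode(model, prefix_ids, n_codes, eos_id=None):
    # prefix_ids: (1, T) prompt tokens; model(ids) returns logits (1, T, V)
    ids = prefix_ids
    for _ in range(n_codes):
        logits = model(ids)
        next_id = logits[:, -1].argmax(-1, keepdim=True)  # greedy pick
        ids = torch.cat([ids, next_id], dim=1)
        if eos_id is not None and next_id.item() == eos_id:
            break
    return ids

# Toy usage with a dummy model that always predicts token 3
model = lambda ids: F.one_hot(torch.full_like(ids, 3), 5).float()
out = greedy_decode(model, torch.tensor([[1, 2]]), n_codes=4)
```

The generated code tokens would then be fed to the VAE decoder to recover the waveform.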
-
Is it possible to update the example for vqvae to also include how to use it on raw audio and/or video data?
Thank you