-
If we tokenise frames of a video with a VQGAN, we can autoregressively predict the next token using our current language model. More specifically, using our current context of 2 million tokens, we cou…
-
Hi,
I ran the code as you suggested and completed all 420 epochs for both hier top and bottom, but the generated results are not good, as you can see below. Please suggest what I should do to generate go…
-
In the VQ-VAE model, you use both n_codes and d_latent. May I ask what the difference between the two is?
https://github.com/dvruette/figaro/blob/1c6262308c8d4cf4a7657112af20ae8040d267c0/src/models/va…
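
For readers with the same question, a minimal sketch of how these two parameters usually relate in a generic vector-quantisation layer may help. This is not the code from the linked repository. It assumes that n_codes is the number of entries in the codebook and d_latent is the dimensionality of each entry, and the helper names `make_codebook` and `quantize` are hypothetical.

```python
import random


def make_codebook(n_codes, d_latent, seed=0):
    """Build a toy codebook: n_codes vectors, each with d_latent components.

    Assumption: n_codes = number of discrete codes, d_latent = size of each
    code vector. Real models learn these vectors; here they are random.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(d_latent)]
            for _ in range(n_codes)]


def quantize(z, codebook):
    """Return (index, vector) of the codebook entry nearest to z.

    Nearest is measured by squared Euclidean distance.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    idx = min(range(len(codebook)), key=lambda k: sq_dist(z, codebook[k]))
    return idx, codebook[idx]


# A codebook with 8 codes, each a 4-dimensional vector.
codebook = make_codebook(n_codes=8, d_latent=4)

# An encoder output must have d_latent components to be comparable.
z = [0.1, -0.3, 0.7, 0.0]
idx, z_q = quantize(z, codebook)
```

Under these assumptions, `idx` is the discrete token (an integer below n_codes) and `z_q` is the d_latent-dimensional vector passed on to the decoder, so n_codes controls the vocabulary size while d_latent controls how much each code can represent.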
-
Hello, does your code support conditional sampling?
-
Hi,
I have just successfully run train_pixelsnail.py on the top level (size: 8×8) of my own dataset, which consists of 300,000 encoded outputs from images (size: 64×64) of different sizes, rotati…
-
Hello, I have a question regarding the method used to compute the number of motions. While investigating your Motiondataset and dataloader, I noticed that **the motion numbers during training are coun…
-
### Checklist
- [X] The issue exists after disabling all extensions
- [ ] The issue exists on a clean installation of webui
- [ ] The issue is caused by an extension, but I believe it is caused by a …
-
Hello and thank you for this repo.
I was wondering whether there is a reason to use a 3-dim embedding instead of a 2-dim codebook.
Is the idea to achieve some form of multi-head Gumbel sampling? T…
-
I tried generating output from the data provided with the paper and found consistent jumps between frames. Most of the time they occur every 32 frames, but they also occur at interval o…
-
### Checklist
- [ ] The issue exists after disabling all extensions
- [ ] The issue exists on a clean installation of webui
- [ ] The issue is caused by an extension, but I believe it is caused by a …