-
Thanks for your excellent work on diffusion-based image captioning.
However, I am confused about the alpha scheduler and the q_pred function. I did not find any useful paragraph introducing this proc…
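For context, here is a toy sketch of what an alpha schedule and a `q_pred`-style forward-corruption step typically look like in this family of discrete diffusion models. All names, the linear schedule, and the mask/uniform split are my assumptions for illustration, not the repo's actual code:

```python
import random

MASK = -1  # hypothetical sentinel id for the [MASK] token


def alpha_schedule(num_steps, alpha_first=0.99, alpha_last=0.01):
    """Linearly interpolated cumulative keep-probabilities.
    This is a guess at the schedule's shape; the real one may differ."""
    return [alpha_first + (alpha_last - alpha_first) * t / (num_steps - 1)
            for t in range(num_steps)]


def q_pred(x0, t, alphas, gammas, vocab_size, rng=random):
    """Sample x_t ~ q(x_t | x_0): keep each token with prob alphas[t],
    absorb it into [MASK] with prob gammas[t], otherwise replace it
    with a uniformly random token."""
    out = []
    for tok in x0:
        u = rng.random()
        if u < alphas[t]:
            out.append(tok)                        # token survives
        elif u < alphas[t] + gammas[t]:
            out.append(MASK)                       # absorbed into [MASK]
        else:
            out.append(rng.randrange(vocab_size))  # uniform noise
    return out
```

With `alphas[t] = 1.0` every token survives; with `gammas[t] = 1.0` the whole sequence is masked, which is the degenerate end of the schedule.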
-
In #101 a model file was introduced.
We should document how to regenerate it and remove it from the repository.
-
Create a CLIP-like embedding + tokenizer for our use case that can be hooked up to the Stable Diffusion pipeline
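A minimal sketch of the interface such a component might expose, using plain Python lists in place of tensors. The class names are hypothetical; the padded length (77) and hidden size (768) are the shapes Stable Diffusion's stock CLIP text encoder produces:

```python
import random


class ToyTokenizer:
    """Whitespace tokenizer with a CLIP-like interface: pad/truncate
    to model_max_length and return input_ids. Vocab is built lazily."""
    def __init__(self, model_max_length=77):
        self.model_max_length = model_max_length
        self.vocab = {"<|pad|>": 0}

    def __call__(self, text):
        ids = [self.vocab.setdefault(w, len(self.vocab))
               for w in text.lower().split()]
        ids = ids[: self.model_max_length]
        ids += [0] * (self.model_max_length - len(ids))
        return {"input_ids": ids}


class ToyTextEncoder:
    """Maps input_ids to per-token embeddings, standing in for the
    text encoder's last_hidden_state of shape (seq_len, hidden)."""
    def __init__(self, hidden_size=768, seed=0):
        self.hidden_size = hidden_size
        self.rng = random.Random(seed)
        self.table = {}  # lazily materialized embedding rows

    def embed(self, input_ids):
        out = []
        for i in input_ids:
            if i not in self.table:
                self.table[i] = [self.rng.uniform(-1, 1)
                                 for _ in range(self.hidden_size)]
            out.append(self.table[i])
        return out
```

A real implementation would subclass the trained embedding model, but keeping this two-part tokenizer/encoder split mirrors how the pipeline consumes them.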
-
Hi,
First of all, thanks for the code! I was wondering whether you were able to make it converge on 256x256 images. Specifically, I am using ImageNet (a smaller subset of it with 9469 samples). I cannot …
-
Add a TransformerInferer class that can perform the sampling process, the forward pass, and the get-probability methods for autoregressive transformers. Together with a VQVAE or a VQGAN, it shoul…
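A rough sketch of the shape such an inferer could take, with a pluggable `model(prefix)` callable that is assumed to return next-token probabilities (a list of length `vocab_size`). Everything here is illustrative, not the final API:

```python
import math
import random


class TransformerInferer:
    """Wraps an autoregressive model to provide sampling and
    sequence log-likelihood over discrete (e.g. VQVAE) tokens."""
    def __init__(self, model, vocab_size):
        self.model = model          # model(prefix) -> next-token probs
        self.vocab_size = vocab_size

    def sample(self, length, rng=random):
        """Ancestral sampling: draw tokens one at a time."""
        seq = []
        for _ in range(length):
            probs = self.model(seq)
            u, acc = rng.random(), 0.0
            for tok, p in enumerate(probs):
                acc += p
                if u < acc:
                    seq.append(tok)
                    break
            else:  # guard against floating-point shortfall
                seq.append(self.vocab_size - 1)
        return seq

    def get_likelihood(self, seq):
        """Log-probability of a full token sequence under the model,
        accumulated via the chain rule."""
        return sum(math.log(self.model(seq[:i])[tok])
                   for i, tok in enumerate(seq))
```

The forward pass itself stays inside `model`; the inferer only orchestrates decoding and scoring, which keeps it reusable across transformer backbones.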
-
### Describe the bug
When trying to load a pretrained `VQDiffusionPipeline`, it says that `learned_classifier_free_sampling_embeddings` is not passed. I also looked at the cache directory, and the co…
-
The only difference is that one approach uses masked tokens while the other uses noised tokens:
https://arxiv.org/pdf/2211.07292.pdf
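For illustration, the two corruption styles can be sketched side by side. The function name, the `[MASK]` sentinel, and the uniform-replacement choice for "noised" are my assumptions:

```python
import random

MASK = -1  # hypothetical [MASK] token id


def corrupt(tokens, rate, mode, vocab_size, rng=random):
    """Corrupt a fraction `rate` of positions.
    'mask'  -> replace with the [MASK] token (absorbing state);
    'noise' -> replace with a uniformly random vocabulary token."""
    out = list(tokens)
    for i in range(len(out)):
        if rng.random() < rate:
            out[i] = MASK if mode == "mask" else rng.randrange(vocab_size)
    return out
```

Masking is recoverable only from context, whereas noising leaves a plausible token in place, which changes what the denoiser has to learn.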
-
Thank you for releasing the code for this excellent work!
Regarding the function below, I couldn't figure out the usage of the q_pred function at L215 and L237.
https://github.com/cientgu/VQ-Diff…
-
Hello, I am a little confused about your implementation [here](https://github.com/buxiangzhiren/DDCap/blob/main/train.py#L359).
```python
def forward(self, tokens: torch.Tensor, mask_tokens: torch…