-
Thanks for your creative and valuable work. While retraining your COCO model, I found a difference between the paper and the code for training.
According to appendix-A-Training Detail, for the implementa…
-
Hi, can you explain the difference between the subband and duration experiments and share which you've found to perform better? Also, what have you discovered about the use of classifier free guidance…
-
[CFG++](https://github.com/CFGpp-diffusion/CFGpp), like [CFG Rescale](https://github.com/invoke-ai/InvokeAI/pull/4335), is an attempt to address the way the linear Classifier-Free Guidance function is…
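
For reference, the linear guidance function under discussion is commonly written as `uncond + scale * (cond - uncond)`. The sketch below illustrates only that standard combination on plain Python lists; it is not the CFG++ or CFG Rescale method, and the name `cfg_combine` is hypothetical.

```python
from typing import List


def cfg_combine(uncond: List[float], cond: List[float], scale: float) -> List[float]:
    """Standard linear classifier-free guidance on flat lists of predictions.

    scale = 0 returns the unconditional prediction, scale = 1 returns the
    conditional one, and scale > 1 extrapolates past the conditional one.
    """
    if len(uncond) != len(cond):
        raise ValueError("uncond and cond must have the same length")
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


if __name__ == "__main__":
    uncond = [0.0, 1.0]
    cond = [1.0, 3.0]
    for scale in (0.0, 1.0, 2.0):
        print(scale, cfg_combine(uncond, cond, scale))
```

With `scale > 1` the output moves linearly away from the unconditional prediction and can leave the range spanned by the two inputs.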
-
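Hello authors, thank you for open-sourcing the code. When I evaluate the released pretrained weights (LaDiC.bin) directly on the test set with 30 steps, most evaluation metrics differ only slightly from the results in the paper, but the gap in the CIDEr score is fairly noticeable. Is the problem in my config.py (currently unchanged from the config.py in the repository), or is something else wrong?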
-
I implemented the CatVTON approach with SDXL Inpainting as the base model, including DREAM. The loss curve looks good and drops to ~0.001 after several epochs. However, the resulting images are just…
-
### Motivation
Many `lmdeploy` counterparts (vllm, transformers, exllamav2, ...) provide `logits_processors` that allow users to modify the logits before softmax. This enables many useful features like …
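
To make the request concrete, here is a minimal sketch of what a logits processor does, written in plain Python with no inference library. It assumes the `(token_ids, logits) -> logits` callable convention; the names `ban_tokens` and `apply_processors` are hypothetical and are not `lmdeploy` APIs.

```python
import math
from typing import Callable, List, Sequence

# Assumed convention: a processor sees the tokens generated so far and the
# raw logits for the next token, and returns the modified logits.
LogitsProcessor = Callable[[Sequence[int], List[float]], List[float]]


def ban_tokens(banned: Sequence[int]) -> LogitsProcessor:
    """Build a processor that makes the given token ids impossible to sample."""
    banned_set = set(banned)

    def processor(token_ids: Sequence[int], logits: List[float]) -> List[float]:
        return [-math.inf if i in banned_set else x for i, x in enumerate(logits)]

    return processor


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax; assumes at least one finite logit."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def apply_processors(
    token_ids: Sequence[int],
    logits: List[float],
    processors: Sequence[LogitsProcessor],
) -> List[float]:
    """Run every processor in order, then convert the logits to probabilities."""
    for process in processors:
        logits = process(token_ids, logits)
    return softmax(logits)


if __name__ == "__main__":
    probs = apply_processors([5, 7], [1.0, 2.0, 3.0], [ban_tokens([2])])
    print(probs)  # token 2 gets probability 0.0
```

Because the processors run before softmax, the remaining probabilities are renormalized automatically.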
-
I was reading the paper [Common Diffusion Noise Schedules and Sample Steps are Flawed](https://arxiv.org/abs/2305.08891) and found it pretty interesting. It proposes a few simple changes that could be…
-
Hi authors,
Thanks for sharing this interesting work.
I am curious about the relation between rFID and gFID reported in the recent works MaskBit and RAR. I noticed that the gFID can be signific…
-
Line 180 here fails in stage 1 training.
https://github.com/fudan-generative-vision/champ/blob/02a9a24a9183727dcbb8eb432b46b3a19302bcb8/models/mutual_self_attention.py#L166-L186
I had to do
…
-
Thanks for open-sourcing this! I've noticed that in `v_express_pipeline.py` you apply classifier-free guidance to the audio embeddings; however, the technical report doesn't seem to mention the audio embeddin…