Open AmericanPresidentJimmyCarter opened 2 months ago
i was working on this support but after seeing the results of the model i'm not sure it's ready to be added yet:
there's a lot of residual noise - like, a lot.. it reminds me of Pixart Sigma's similar issues
I ran into a similar problem when using the EDM training method. Do you have any idea how the noise arises?
@JincanDeng how are you doing caption dropout? zeroes, or an empty "" prompt encoded by both TEs?
@bghira I use the "" prompt, with 0.1 probability for dropout.
SDXL relies on zeroing uncond space, try using torch.zeros_like()
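A rough sketch of what zeroed caption dropout could look like, as opposed to encoding "" through both TEs. The function name `apply_caption_dropout`, the tensor shapes, and the 0.1 default are assumptions for illustration and are not taken from any trainer's actual code:

```python
import torch


def apply_caption_dropout(prompt_embeds, pooled_embeds, drop_prob=0.1, generator=None):
    """Zero the text conditioning for a random subset of samples (hypothetical helper).

    Assumed shapes (SDXL-like):
      prompt_embeds: (batch, seq_len, dim)  concatenated output of both text encoders
      pooled_embeds: (batch, pooled_dim)    pooled output of the second text encoder
    """
    batch_size = prompt_embeds.shape[0]
    # One draw per sample: True means "drop this caption".
    drop = torch.rand(batch_size, generator=generator) < drop_prob
    drop = drop.to(prompt_embeds.device)
    # Reshape the per-sample mask so it broadcasts over the remaining dims.
    seq_mask = drop.view(batch_size, 1, 1)
    pooled_mask = drop.view(batch_size, 1)
    # Replace dropped samples with zeros instead of the encoded "" prompt.
    prompt_embeds = torch.where(seq_mask, torch.zeros_like(prompt_embeds), prompt_embeds)
    pooled_embeds = torch.where(pooled_mask, torch.zeros_like(pooled_embeds), pooled_embeds)
    return prompt_embeds, pooled_embeds, drop
```

The mask is returned as well so it can be logged to check that the effective dropout rate is close to the intended one.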
Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers
https://huggingface.co/Alpha-VLLM/Lumina-T2I/blob/main/README.md
https://arxiv.org/pdf/2405.05945