Open yuanzhi-zhu opened 6 months ago
I also found that the sample.py script always gives the same result image for the same label. Looking at the DiTBlock workflow, I noticed there is no cross-attention, so I wonder whether sample variation is a challenge for DiT?
Hi @tanghengjian, I don't know whether your question is related to the CIFAR experiment, but did you change the seed in the sample.py script? https://github.com/facebookresearch/DiT/blob/ed81ce2229091fd4ecc9a223645f95cf379d582b/sample.py#L23
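To illustrate why the seed matters here, below is a minimal sketch of the general principle, not the actual sample.py code: a diffusion sampler starts from random noise, so a fixed seed and a fixed label reproduce the same starting point and hence the same image. The `initial_noise` helper is hypothetical and uses Python's standard `random` module as a stand-in for the framework's random number generator.

```python
import random

def initial_noise(seed, n=4):
    # Hypothetical stand-in for the latent noise a diffusion sampler starts from.
    # A fresh generator seeded with `seed` always yields the same sequence.
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

same_a = initial_noise(seed=0)
same_b = initial_noise(seed=0)
other = initial_noise(seed=1)

print(same_a == same_b)  # same seed: identical starting noise
print(same_a == other)   # different seed: different starting noise
```

If the script is run with its default seed every time, identical outputs per label would be expected under this reasoning; passing a different seed should give a different sample.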
I ran it with the default value. By the way, I noticed that the CIFAR10 dataset is only 32x32 pixels with 10 classes, which means the y condition ranges from 0 to 9. Have you tested the DiT model on the MSCOCO dataset with label conditioning?
How do you link CIFAR10 classes to the ImageNet 1k classes?
Have you tried running DiT on the CIFAR10 dataset? I ran some simple experiments and found that DiT does not work well on CIFAR10.