facebookresearch / DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Performance for patch size = 1 #86

Open NrealWJX opened 6 months ago

NrealWJX commented 6 months ago

Thank you for your excellent work!

I was wondering about the performance for patch size p = 1, which is not reported in your paper. Could you please explain why this setting was not tested? Was it due to memory constraints on a single TPU-v3 with a batch size of 1?
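For context, here is a rough sketch of why I suspect memory is the issue. It assumes the 32×32 latent grid used for 256×256 images (8× VAE downsampling) and uses the squared token count as a crude proxy for per-head, per-layer attention cost. The helper `num_tokens` and the constant `LATENT_SIZE` are my own names for illustration, not something from the repo, and this is not a measurement of actual memory use.

```python
def num_tokens(latent_size: int, patch_size: int) -> int:
    """Number of tokens when a square latent is split into square patches."""
    if latent_size % patch_size != 0:
        raise ValueError("latent_size must be divisible by patch_size")
    return (latent_size // patch_size) ** 2


# Assumption: 256x256 images encoded to a 32x32 latent grid.
LATENT_SIZE = 32

for p in (8, 4, 2, 1):
    t = num_tokens(LATENT_SIZE, p)
    # t * t is the size of one attention matrix (per head, per layer),
    # used here only as a rough proxy for how cost scales.
    print(f"p={p}: tokens={t}, attention matrix entries={t * t}")
```

Under these assumptions, going from p = 2 to p = 1 quadruples the sequence length (256 to 1024 tokens), so each attention matrix would be 16× larger.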

Looking forward to your reply! :)