openai / guided-diffusion

MIT License

Any plan to release "optimized version" according to Appendix A? #6

Closed AranKomat closed 3 years ago

AranKomat commented 3 years ago

I believe the current release is the "naive" implementation, as the throughput I measured in benchmarking is close to the number reported for it.

I tried FusedAdam from Apex, but it didn't improve the throughput much, so either that's not what you used, or the fused GroupNorm-Swish kernel provides the bigger benefit.

Do you have any plan to release the optimized version, or any code snippet one could use to speed up the code?

unixpickle commented 3 years ago

We currently do not have plans to release the optimized version.

Note that a significant amount of the speedup comes from using larger per-GPU batch sizes, which this codebase should be capable of given enough GPU memory. One subtle corollary is that fused ops and more-efficient Adam will use less memory, allowing a larger batch size to fit on the same GPU.
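For reference, the GroupNorm-Swish sequence being discussed can be sketched in NumPy. This is only a numerical illustration of the math, not the fused CUDA kernel itself; the tensor shapes, group count, and function name are arbitrary assumptions. The point of a fused kernel is to compute the whole sequence in one pass instead of materializing the intermediate normalized tensor, which is where the memory saving comes from:

```python
import numpy as np

def groupnorm_swish(x, gamma, beta, num_groups, eps=1e-5):
    # x: (N, C, H, W). Normalize each group of channels, apply the affine
    # scale/shift, then Swish (y * sigmoid(y)). A fused GPU kernel would do
    # all of this without writing the intermediate normalized tensor to
    # memory; here the steps are just composed for clarity.
    n, c, h, w = x.shape
    g = x.reshape(n, num_groups, c // num_groups, h, w)
    mean = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    y = ((g - mean) / np.sqrt(var + eps)).reshape(n, c, h, w)
    y = y * gamma.reshape(1, c, 1, 1) + beta.reshape(1, c, 1, 1)
    return y * (1.0 / (1.0 + np.exp(-y)))  # Swish / SiLU

x = np.random.randn(2, 8, 4, 4)
out = groupnorm_swish(x, np.ones(8), np.zeros(8), num_groups=4)
print(out.shape)  # (2, 8, 4, 4)
```

Fusing the three steps removes one full-size intermediate activation per block; across the many residual blocks of the UNet, that freed memory is what lets a larger per-GPU batch fit.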