Closed · cuttle-fish-my closed this issue 1 year ago

Hi! Thanks for this fantastic work!

I am a little confused about the `forward` function of `AttentionBlock` in `unet.py`. The corresponding code is shown below:
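(A sketch of the code being asked about, reconstructed from the description in this issue; the `checkpoint` helper signature and the `_forward` name are assumptions about this file's layout:)

```python
def forward(self, x):
    # The final positional argument is the `flag` parameter of the
    # checkpoint() helper; it is hardcoded to True rather than
    # passing self.use_checkpoint.
    return checkpoint(self._forward, (x,), self.parameters(), True)
```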
I just wonder why we hardcode the `flag` parameter to `True` instead of using `self.use_checkpoint`?

Thanks!

---

I think it is because this is a pixel-wise self-attention implementation, which treats each pixel as one token. This implementation becomes extremely memory-consuming when the input resolution is relatively high (e.g. 256^2), so checkpointing is presumably meant to be always on here.
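As a rough back-of-the-envelope illustration of that claim (illustrative arithmetic only, not code from the repo):

```python
# Pixel-wise self-attention turns an H x W feature map into H*W tokens,
# so the attention score matrix alone has (H*W)^2 entries.
H = W = 256
tokens = H * W                   # 65,536 tokens, one per pixel
entries = tokens ** 2            # ~4.3e9 score-matrix entries per head
gigabytes = entries * 4 / 1e9    # fp32: roughly 17 GB for one attention map
print(f"tokens={tokens}, attention matrix ~ {gigabytes:.1f} GB (fp32, per head)")
```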