It's an interesting idea to pre-compute the keys and values of the next encoder part. But does that give any significant memory gains?
It saves about 600 MiB of RAM (Float16; less for an 8-bit model). The computation saving is negligible: it only comes from not having to recompute these on every iteration.
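To illustrate the idea, here is a minimal sketch in plain Python. It assumes the keys and values depend only on a fixed conditioning input and not on the per-iteration state; that is my reading of the question above, not something confirmed in this thread. All names (`context`, `w_k`, `w_v`, `k_cached`, `v_cached`) and sizes are hypothetical and are not taken from the actual code.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Scaled dot-product attention: softmax(q k^T / sqrt(d)) v."""
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v)

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d_ctx, d_model = 6, 4                     # toy sizes (assumption)
context = rand_matrix(3, d_ctx, rng)      # fixed conditioning, 3 tokens (hypothetical)
w_k = rand_matrix(d_ctx, d_model, rng)    # key projection weights (hypothetical)
w_v = rand_matrix(d_ctx, d_model, rng)    # value projection weights (hypothetical)

# Pre-compute once, before the iteration loop.
k_cached = matmul(context, w_k)
v_cached = matmul(context, w_v)

max_diff = 0.0
for step in range(5):
    q = rand_matrix(2, d_model, rng)      # queries change on every iteration
    # Naive path: project the same context again on each iteration.
    naive = attention(q, matmul(context, w_k), matmul(context, w_v))
    # Cached path: reuse the pre-computed keys and values.
    cached = attention(q, k_cached, v_cached)
    for row_n, row_c in zip(naive, cached):
        for a, b in zip(row_n, row_c):
            max_diff = max(max_diff, abs(a - b))

print(max_diff)  # largest difference between the two paths
```

In this sketch the cached path never touches `w_k` or `w_v` inside the loop, so those weights would not need to stay loaded during iteration. That may be where the memory saving mentioned above comes from, but treat that as a guess.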
Why are there both a UNet and a Unet_fixed? Is that some optimization to save memory?