Closed by harpavatkeerti 6 months ago
Hi @harpavatkeerti
Thanks for your interest in our paper!
`cache_block_id` corresponds to the n-th residual path in each layer. After selecting the layer with `cache_layer_id`, we choose the n-th residual path within that layer with `cache_block_id`. For Stable Diffusion 1.5, the maximum `cache_block_id` for each layer is 2. Because the residual paths are generally split along different block structures, we named it `cache_block_id`.
The name `cache_block_id` seems to be hard to understand, so we will consider renaming it. Thanks for the good question!
Thanks for the clarification, @horseee. So it's a more granular level of control for caching the output of a particular residual path in a layer. If it's the last one, then it's the output of the whole layer, which is detailed in the paper. Else, the cache is retrieved from that particular block instead of the previous layer. Is this understanding correct?
Hi @harpavatkeerti
> it's a more granular level of control for caching the output of a particular residual path in a layer

Correct

> if it's the last one, then it's the output of the whole layer, which is detailed in the paper

If the U-Net has middle-layer blocks, those blocks are still not included when the last path is chosen.

> Else the cache is retrieved from that particular block instead of the previous layer.

Correct!
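
For readers following along, the two-level indexing discussed above can be sketched roughly as follows. This is only an illustrative toy, not the repository's implementation: the names `DOWN_LAYERS`, `MID_BLOCK`, `select_cache_point`, and `all_cache_points` are hypothetical, and the layout assumes `cache_block_id` is zero-based with a maximum of 2, so three residual paths per layer.

```python
from typing import List, Tuple

# Hypothetical toy layout: each U-Net layer is modelled as a list of
# residual paths. Names and counts are illustrative assumptions only.
DOWN_LAYERS: List[List[str]] = [
    ["path_0", "path_1", "path_2"],  # layer 0
    ["path_0", "path_1", "path_2"],  # layer 1
    ["path_0", "path_1", "path_2"],  # layer 2
]

# Middle-layer blocks sit outside the per-layer lists, so no
# (cache_layer_id, cache_block_id) pair in this sketch can address them.
MID_BLOCK = "mid_block"


def select_cache_point(
    layers: List[List[str]], cache_layer_id: int, cache_block_id: int
) -> Tuple[int, int, str]:
    """Pick the layer first, then the residual path inside that layer."""
    if not 0 <= cache_layer_id < len(layers):
        raise IndexError(f"cache_layer_id {cache_layer_id} is out of range")
    paths = layers[cache_layer_id]
    if not 0 <= cache_block_id < len(paths):
        raise IndexError(f"cache_block_id {cache_block_id} is out of range")
    return cache_layer_id, cache_block_id, paths[cache_block_id]


def all_cache_points(layers: List[List[str]]) -> List[Tuple[int, int]]:
    """Enumerate every valid (cache_layer_id, cache_block_id) pair."""
    return [
        (layer_id, block_id)
        for layer_id, paths in enumerate(layers)
        for block_id in range(len(paths))
    ]
```

In this sketch, `select_cache_point(DOWN_LAYERS, 0, 2)` picks the last residual path of the first layer, while the separately stored `MID_BLOCK` mirrors the point that middle-layer blocks are not covered even by the last path.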
Ok, thanks a lot!
Hi, the work looks very interesting.
I was going through the code and understood most of it, except `cache_block_id`. I couldn't work out its usage from the paper, and according to the code, it seems to cache the output of a particular attention block within a UNet layer. Could you please provide some insight on this? Thanks!