Closed wangxdgg closed 1 year ago
I have some questions about the memory compression section: is the intermediate-layer (activation) memory being compressed here, or the weight memory? How did you achieve such a high compression ratio?

It's activation compression; no weight memory is involved here. Once an activation tensor has been consumed by all of its consumers, its memory can be freed and reused for the next layer's activation tensor.

So can it be interpreted as peak activation memory usage?

You can say it's the peak activation memory usage with dynamic activation allocation.

OK, I got it. Thanks a lot!
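The freeing rule described above can be sketched with a toy reference-counting planner. This is not the project's actual code, just an illustration under the stated assumption: each layer's output is freed as soon as every layer that reads it has run, so the reported number is the peak of *live* activations rather than the sum of all activations.

```python
# Hypothetical sketch of dynamic activation allocation:
# an activation is freed once all its consumer layers have run,
# so peak memory = max total size of simultaneously live activations.

def peak_activation_memory(layers):
    """layers: list of (name, out_bytes, consumed_inputs), where
    consumed_inputs names earlier layers whose outputs this layer reads."""
    remaining = {}  # producer name -> number of consumers not yet run
    for _, _, inputs in layers:
        for src in inputs:
            remaining[src] = remaining.get(src, 0) + 1

    live, cur, peak = {}, 0, 0
    for name, out_bytes, inputs in layers:
        cur += out_bytes                 # allocate this layer's output
        live[name] = out_bytes
        peak = max(peak, cur)
        for src in inputs:               # this layer has consumed src
            remaining[src] -= 1
            if remaining[src] == 0:      # all consumers done: free it
                cur -= live.pop(src)
    return peak

# Linear chain of three 100-byte activations: each tensor is freed right
# after its single consumer runs, so only two are ever live at once.
chain = [("a", 100, []), ("b", 100, ["a"]), ("c", 100, ["b"])]
print(peak_activation_memory(chain))  # 200, not 300
```

For a straight chain the peak is just the two largest adjacent activations; branchy graphs (skip connections) keep a tensor alive until its last consumer runs, which is why the achievable compression ratio depends on the network topology.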