Stonesjtu / pytorch_memlab

Profiling and inspecting memory in pytorch
MIT License

Memory differs due to the matrix alignment or invisible gradient buffer tensors #19

Closed david-macleod closed 2 years ago

david-macleod commented 3 years ago

Hi,

I was just wondering what this message in the MemReporter output means:

Total Tensors: 266979334        Used Memory: 924.71M
The allocated memory on cuda:0: 1.31G
Memory differs due to the matrix alignment or invisible gradient buffer tensors

Also, what is the difference between "Used Memory" and "allocated memory"?

Many thanks

Stonesjtu commented 3 years ago

The gap between used memory and allocated memory comes from two aspects:

  1. Memory layout alignment, e.g. a tensor of size 1024 x 125 may actually occupy 1024 x 128 elements for cache and speed optimization.
  2. Some tensors introduced by autograd are not visible in Python's garbage collector view, so I cannot count them without modifying the PyTorch source code. (This usually accounts for most of the gap.)

Hope it helps.