Open Ethan-yt opened 2 years ago
Good observation. I would appreciate any PRs addressing this issue!
It looks like the estimate neglects the intermediate tensors stored for backward propagation.
Estimated: Forward/backward pass size (MB): 83.89 = 10240 * 1024 (input shape) * 4 (bytes) * 2 (grad) = 83,886,080 bytes
torch reported: 4044.00390625 MB = 4.003 MB (W and b) + 40 MB (input) + 40 MB (intermediate) * 100 (iterations)
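To make the arithmetic in the comment above easy to check, here is a rough sketch in plain Python. It assumes float32 tensors, an input of shape (10240, 1024), and a single 1024 x 1024 linear layer with bias applied 100 times; the layer shape is inferred from the "4.003MB (W and b)" figure and is not confirmed by the thread. All names in it are hypothetical.

```python
# Back-of-the-envelope check of the numbers quoted in this thread.
# Assumptions (not confirmed by the thread): float32 everywhere and one
# 1024 x 1024 linear layer with bias that is reused on every iteration.
BYTES_PER_FLOAT32 = 4
MIB = 1024 ** 2  # the torch figure appears to be in units of 1024**2 bytes

batch, features = 10240, 1024

param_bytes = (features * features + features) * BYTES_PER_FLOAT32  # W and b
input_bytes = batch * features * BYTES_PER_FLOAT32
activation_bytes = batch * features * BYTES_PER_FLOAT32  # one saved output


def expected_mib(iterations: int) -> float:
    """Params + input + one saved intermediate tensor per iteration."""
    total = param_bytes + input_bytes + iterations * activation_bytes
    return total / MIB


# The summary counts the layer output once (times 2 for grad), in 10**6 bytes.
summary_estimate_mb = batch * features * BYTES_PER_FLOAT32 * 2 / 1e6

print(f"summary forward/backward estimate: {summary_estimate_mb:.2f} MB")
print(f"expected with 100 iterations:      {expected_mib(100)} MiB")
```

Under these assumptions the second figure comes out to 4044.00390625, the same value torch reported, which is consistent with one intermediate tensor being kept per iteration rather than once per layer.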
This issue should be resolved in #181, but I'll wait for the release and a verification before closing
Describe the bug
Memory estimates are inconsistent with actual GPU usage for recursive models.
To Reproduce Example code:
torch reported: 4044.00390625
Input size (MB): 41.94
Forward/backward pass size (MB): 83.89
Params size (MB): 4.20
Estimated Total Size (MB): 130.03
If I change 100 to 1, all looks good:
torch reported: 92.01171875
Input size (MB): 41.94
Forward/backward pass size (MB): 83.89
Params size (MB): 4.20
Estimated Total Size (MB): 130.03