mlcommons / inference_results_v2.0

This repository contains the results and code for the MLPerf™ Inference v2.0 benchmark.
https://mlcommons.org/en/inference-datacenter-20/
Apache License 2.0

GPU batch size of 3d-unet for RTX 3080 #9

Closed mahmoodn closed 2 years ago

mahmoodn commented 2 years ago

With an RTX 3080 and a batch size of 2 or 4 for 3d-unet, I get these messages:

[0xc62a32d0]:151 :ScratchObject in storeCachedObject: at optimizer/gpu/cudnn/convolutionBuilder.cpp: 170 idx: 33716 time: 7e-08
[0xc61fba40]:151 :ScratchObject in storeCachedObject: at optimizer/gpu/cudnn/convolutionBuilder.cpp: 170 idx: 27179 time: 1.5e-07
-------------- The current device memory allocations dump as below --------------
[0]:8589934592 :HybridGlobWriter in reserveMemory: at optimizer/common/globWriter.cpp: 399 idx: 1610 time: 0.0090752
[0x302000000]:4296015872 :HybridGlobWriter in reserveMemory: at optimizer/common/globWriter.cpp: 377 idx: 391 time: 0.00240856
[0x7fed34000000]:134217728 :DeviceActivationSize in reserveNetworkTensorMemory: at optimizer/common/tactic/optimizer.cpp: 4664 idx: 8 time: 0.000153513
[07/18/2022-17:03:01] [TRT] [W] Requested amount of GPU memory (8589934592 bytes) could not be allocated. There may not be enough free memory for allocation to succeed.
[07/18/2022-17:03:01] [TRT] [W] Skipping tactic 5 due to insufficient memory on requested size of 8589934592 detected for tactic 61.
Try decreasing the workspace size with IBuilderConfig::setMemoryPoolLimit().
[07/18/2022-17:03:04] [TRT] [I] Some tactics do not have sufficient workspace memory to run. Increasing workspace size will enable more tactics, please check verbose output for requested sizes.

The run continues, so I am not sure whether these messages are important. If I set the batch size to 1, there is no such memory message, but again I am not sure whether that is the correct setting. Any thoughts on this?
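For what it's worth, the failed request in the log is exactly 8 GiB, and the dump shows another ~4 GiB already reserved. A minimal sketch of that arithmetic (assuming the 10 GB variant of the RTX 3080) shows why the allocation cannot succeed:

```python
# Sizes taken directly from the TRT log above.
requested = 8589934592   # bytes: the workspace allocation that failed
reserved = 4296015872    # bytes: the second entry in the memory dump
gib = 1024 ** 3

print(requested / gib)                # 8.0 GiB exactly
print((requested + reserved) / gib)   # ~12 GiB total, over the card's 10 GB
```

So with batch size > 1 the tactic simply needs more memory than the card has, and TensorRT skips it; the run continues with tactics that do fit.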

nv-ananjappa commented 2 years ago

@nv-jinhosuh Any idea about this?

nv-jinhosuh commented 2 years ago

@mahmoodn For 3D-UNet, only batchsize == 1 is supported, because the samples have non-uniform sizes/shapes. Supporting batchsize > 1 for 'sliding window inference' is a possible improvement, but it is not supported in the v2.0 codebase.
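To illustrate the constraint, here is a minimal NumPy sketch (the shapes below are made up, not the benchmark's actual sample dimensions): volumes with differing spatial extents cannot be stacked into a single dense batch tensor, which is why batch size is pinned to 1.

```python
import numpy as np

# Two hypothetical 3D input volumes with different spatial shapes,
# standing in for the non-uniform samples in the 3D-UNet workload.
sample_a = np.zeros((1, 128, 160, 160), dtype=np.float32)
sample_b = np.zeros((1, 192, 160, 224), dtype=np.float32)

try:
    batch = np.stack([sample_a, sample_b])  # attempt a batch of 2
except ValueError as err:
    # Mismatched shapes cannot form one dense tensor -> batch size stays 1.
    print("cannot batch:", err)
```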

mahmoodn commented 2 years ago

Thanks a lot. I got it.