zylo117 / Yet-Another-EfficientDet-Pytorch

A PyTorch re-implementation of the official EfficientDet with SOTA real-time performance and pretrained weights.
GNU Lesser General Public License v3.0
5.2k stars 1.27k forks

Inquiry about training memory on GTX 2080 Ti (D0~D4) #673

Open Ronald-Kray opened 3 years ago

Ronald-Kray commented 3 years ago

@zylo117 Hi, I'm testing the training memory required on a GTX 2080 Ti. Could you check the MiB usage for D0~D4 at batch size 1, please?

| Model | Model size | Training memory (batch size 1) |
| --- | --- | --- |
| EfficientDet-D0 | 15.5 MB | 1,326 MiB |
| EfficientDet-D1 | 26.4 MB | 2,068 MiB |
| EfficientDet-D2 | 32.2 MB | 2,840 MiB |
| EfficientDet-D3 | 47.7 MB | 4,824 MiB |
| EfficientDet-D4 | 81.9 MB | 7,844 MiB |

zylo117 commented 3 years ago

Training takes roughly 3 times as much memory as inference.

Ronald-Kray commented 3 years ago

@zylo117 Those numbers aren't exactly 3 times the inference memory. How can I calculate how much memory training will consume?
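For a rough answer, you can do a back-of-the-envelope estimate rather than an exact calculation: training has to hold the weights, one gradient per weight, the optimizer state (Adam/AdamW keep two extra FP32 buffers per weight), plus the forward activations retained for backprop. A minimal sketch (the function name and the activation figure below are hypothetical, for illustration only):

```python
def estimate_training_mib(num_params, activation_mib, optimizer="adamw"):
    """Rough FP32 training-memory estimate in MiB.

    num_params     : number of model parameters
    activation_mib : activation memory observed at inference (MiB);
                     training keeps these tensors for the backward pass
    """
    bytes_per_param = 4                              # FP32
    weights = num_params * bytes_per_param
    grads = weights                                  # one gradient per weight
    # Adam/AdamW hold two extra FP32 buffers (exp_avg, exp_avg_sq) per weight
    opt_state = 2 * weights if optimizer in ("adam", "adamw") else weights
    fixed_mib = (weights + grads + opt_state) / 2**20
    # Activations roughly double during training: forward tensors are
    # retained, and backward allocates comparable intermediates.
    return fixed_mib + 2 * activation_mib

# Example: EfficientDet-D0 has roughly 3.9M parameters; assume inference
# activations take ~400 MiB (a made-up figure for illustration).
print(round(estimate_training_mib(3.9e6, 400)))
```

This ignores CUDA context overhead, cuDNN workspaces, and allocator fragmentation, which is why measured numbers rarely land on a clean 3x multiple.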

Ronald-Kray commented 3 years ago

@zylo117 Hey! I have a question about memory consumption. I'm training YOLOv5 and EfficientDet on 3x GTX 2080 Ti 11 GB. Any idea why YOLOv5 can fit a larger batch size than EfficientDet, and why EfficientDet requires so much memory for training? It would help if I could calculate the allocated memory directly.

| YOLOv5 (size, FLOPs) | 1 batch (1 GPU), max batch size (1 GPU) | EfficientDet (size, FLOPs) | 1 batch (1 GPU), max batch size (1 GPU) |
| --- | --- | --- | --- |
| Yolov5s (14 MB, 17B FLOPs) | 1 batch: 1,686 MiB, max batch size: 56 (10,054 MiB) | EfficientDet-D0 (15.5 MB, 2.5B FLOPs) | 1 batch: 1,326 MiB, max batch size: 17 (10,132 MiB) |
| Yolov5m (41 MB, 51.3B FLOPs) | 1 batch: 2,010 MiB, max batch size: 29 (10,228 MiB) | EfficientDet-D1 (26.4 MB, 6B FLOPs) | 1 batch: 2,068 MiB, max batch size: 7 (9,768 MiB) |
| Yolov5l (90 MB, 115.4B FLOPs) | 1 batch: 2,528 MiB, max batch size: 17 (10,170 MiB) | EfficientDet-D2 (32.2 MB, 11B FLOPs) | 1 batch: 2,840 MiB, max batch size: 4 (9,002 MiB) |
| Yolov5x (168 MB, 218.8B FLOPs) | 1 batch: 3,168 MiB, max batch size: 11 (10,254 MiB) | EfficientDet-D3 (47.7 MB, 25B FLOPs) | 1 batch: 4,824 MiB, max batch size: 2 (8,738 MiB) |
zylo117 commented 3 years ago

No idea, but it probably has something to do with the CUDA implementation of depthwise convolution and Swish. In practice, EfficientDet consumes only about 100 MB in a naive C++ implementation.
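One way Swish inflates training memory in an autograd framework: written naively as `x * sigmoid(x)`, both the sigmoid output and the multiply inputs get saved for the backward pass. A custom backward can instead keep only `x` and recompute everything from the closed-form gradient, trading a little compute for memory. A NumPy sketch of just the math (not this repo's code):

```python
import numpy as np

# Swish: f(x) = x * sigmoid(x)
# Closed-form gradient used by memory-efficient implementations:
#   f'(x) = s(x) * (1 + x * (1 - s(x))),  where s = sigmoid
# Backward needs only x; no sigmoid/product tensors have to be stored.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def swish(x):
    return x * sigmoid(x)

def swish_grad(x):
    # Recomputed entirely from x during the backward pass.
    s = sigmoid(x)
    return s * (1.0 + x * (1.0 - s))

# Sanity-check the closed form against a central finite difference.
x = np.linspace(-4, 4, 9)
eps = 1e-5
numeric = (swish(x + eps) - swish(x - eps)) / (2 * eps)
print(np.max(np.abs(numeric - swish_grad(x))))  # close to zero
```

In PyTorch this trick is typically wrapped in a custom `torch.autograd.Function` that saves only the input tensor in `forward` and applies the formula above in `backward`.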