Ronald-Kray opened this issue 3 years ago
Training takes 3 times as much memory as inference
@zylo117 In my case it's not exactly 3 times the memory of inference. How can I calculate how much memory is consumed during training?
@zylo117 Hey! I have a question about memory consumption. I'm training Yolov5 and EfficientDet on 3 × GTX 2080 Ti 11G. Any idea why Yolov5 can fit a larger batch size than EfficientDet, and why EfficientDet requires so much memory for training? It would be clearer if I could calculate the allocated memory directly.
No idea, but it probably has something to do with the CUDA implementation of depthwise convolution and Swish. In practice, EfficientDet consumes only ~100 MB in a naive C++ implementation.
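One way to get at the "how do I calculate it" question is to simply count the tensors PyTorch keeps alive during a training step: parameters, their gradients, and the optimizer state. A minimal sketch (the tiny model and the `tensor_bytes` helper are hypothetical stand-ins, not EfficientDet; assumes fp32 and Adam):

```python
import torch
import torch.nn as nn

# Hypothetical tiny model standing in for EfficientDet; the bookkeeping
# below works the same for any nn.Module.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(16 * 8 * 8, 10),
)
opt = torch.optim.Adam(model.parameters())

# One training step so gradients and Adam state actually exist.
x = torch.randn(1, 3, 8, 8)
loss = model(x).sum()
loss.backward()
opt.step()

def tensor_bytes(t):
    # bytes occupied by a tensor's data
    return t.numel() * t.element_size()

param_b = sum(tensor_bytes(p) for p in model.parameters())
grad_b = sum(tensor_bytes(p.grad) for p in model.parameters()
             if p.grad is not None)
opt_b = sum(tensor_bytes(v) for state in opt.state.values()
            for v in state.values() if torch.is_tensor(v))

# Gradients mirror the parameters 1:1; Adam keeps two extra buffers
# (exp_avg, exp_avg_sq) per parameter, so roughly 2x the parameter size.
print(f"params={param_b}B grads={grad_b}B optimizer={opt_b}B")
```

On top of these come the activations saved for the backward pass and the cuDNN workspace, which are what usually dominate at detection-sized inputs, and which is why training takes a multiple of inference memory in the first place.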
@zylo117 Hi, I'm testing the required training memory on a GTX 2080 Ti. Could you check the MiB usage for D0–D4 at batch size 1, please?
| Model | Checkpoint size | Training memory (batch size 1) |
| --- | --- | --- |
| EfficientDet-D0 | 15.5 MB | 1,326 MiB |
| EfficientDet-D1 | 26.4 MB | 2,068 MiB |
| EfficientDet-D2 | 32.2 MB | 2,840 MiB |
| EfficientDet-D3 | 47.7 MB | 4,824 MiB |
| EfficientDet-D4 | 81.9 MB | 7,844 MiB |
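For context, these measurements can be split into a rough budget. Assuming fp32 weights and Adam, gradients add ~1× and optimizer state ~2× the checkpoint size, so parameters + gradients + optimizer state come to roughly 4× the checkpoint; everything left over is activations, cuDNN workspace, and CUDA context. A back-of-envelope sketch (the 4× multiplier and the split are assumptions, not measurements):

```python
# Measured numbers from the table above: name -> (checkpoint MB, training MiB at batch 1).
models = {
    "D0": (15.5, 1326),
    "D1": (26.4, 2068),
    "D2": (32.2, 2840),
    "D3": (47.7, 4824),
    "D4": (81.9, 7844),
}

budget = {}
for name, (ckpt_mb, total_mib) in models.items():
    # Assumption: fp32 + Adam => weights (1x) + grads (1x) + m,v (2x) = 4x checkpoint.
    states = 4 * ckpt_mb
    # Remainder: activations saved for backward, workspace, CUDA context.
    rest = total_mib - states
    budget[name] = (states, rest)
    print(f"{name}: ~{states:.0f} MiB states, ~{rest:.0f} MiB activations/overhead")
```

The takeaway is that even for D4 the model states account for only a few hundred MiB; the bulk of the 7,844 MiB is activation memory, which grows with the input resolution each compound-scaled variant uses.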