ceccocats / tkDNN

Deep neural network library and toolkit to do high performance inference on NVIDIA Jetson platforms
GNU General Public License v2.0
718 stars 208 forks

Cuda failure: out of memory #303

Open ckurtz22 opened 1 year ago

ckurtz22 commented 1 year ago

Getting an error when running test_yolo4 on my laptop, which has an RTX 3060 GPU:

Cuda failure: out of memory
/home/connor/tkDNN/src/Conv2d.cpp:176

Running on commit d4f7b4ad8b21f1af78e1bada3e0368c0c1304ad9

ckurtz22 commented 1 year ago

Could be related: in FP16 mode, the wrong size is used when allocating WgsLayers: https://github.com/ceccocats/tkDNN/blob/master/src/LayerWgs.cpp#L76 (it should use b_size, but uses w_size instead).
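
To give a rough sense of how much a size mix-up like this could cost, here is a minimal sketch. It assumes that w_size is the number of weight elements and b_size the number of bias elements of a convolution layer; that reading is a guess from the names, not something checked against LayerWgs.cpp. The helper functions are hypothetical and are not part of tkDNN.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helpers for illustration only; none of these exist in tkDNN.

// Number of weights in a 2D convolution with a square kernel.
constexpr std::size_t conv_weight_count(std::size_t out_ch, std::size_t in_ch,
                                        std::size_t kernel) {
    return out_ch * in_ch * kernel * kernel;
}

// One bias value per output channel.
constexpr std::size_t conv_bias_count(std::size_t out_ch) {
    return out_ch;
}

// Bytes needed for `count` FP16 values (2 bytes each).
constexpr std::size_t fp16_bytes(std::size_t count) {
    return count * sizeof(std::uint16_t);
}

// Extra bytes requested if a bias-sized FP16 buffer is allocated
// with the weight count instead of the bias count.
constexpr std::size_t over_allocation_bytes(std::size_t out_ch, std::size_t in_ch,
                                            std::size_t kernel) {
    return fp16_bytes(conv_weight_count(out_ch, in_ch, kernel)) -
           fp16_bytes(conv_bias_count(out_ch));
}

// Example: a 3x3 convolution with 512 input and 512 output channels.
static_assert(conv_weight_count(512, 512, 3) == 2359296, "weight element count");
static_assert(fp16_bytes(conv_bias_count(512)) == 1024, "bias buffer in FP16");
static_assert(over_allocation_bytes(512, 512, 3) == 4717568, "extra bytes requested");
```

Under these assumptions, a single such layer would request about 4.5 MiB where 1 KiB is needed. Whether that is enough to explain the out-of-memory failure is not established here, and the FP32 report on this issue suggests it may not be the only cause.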

yemuzi commented 10 months ago

Same problem on an RTX 2070 with CUDA 11.8 in FP32 mode. (image attached)