ckurtz22 opened this issue 1 year ago (status: Open)
Could be related: in FP16 mode, the wrong size is used when allocating WgsLayers: https://github.com/ceccocats/tkDNN/blob/master/src/LayerWgs.cpp#L76 (it should use b_size, but it uses w_size instead).
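To make the suspected mismatch concrete, here is a minimal C++ sketch of a buffer sized by the wrong element count. It is not tkDNN's code: `half_t`, `alloc_half`, `LayerSizes`, `alloc_bias_suspect`, and `alloc_bias_fixed` are hypothetical names made up for illustration, and only `w_size` and `b_size` come from the comment above. It assumes the buffer in question holds one element per bias.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a 16-bit half-precision value (not tkDNN's type).
using half_t = std::uint16_t;

// Hypothetical helper: allocate an FP16 buffer holding `count` elements.
std::vector<half_t> alloc_half(std::size_t count) {
    return std::vector<half_t>(count);
}

// Element counts for one layer; the field names mirror the ones in the comment.
struct LayerSizes {
    std::size_t w_size;  // number of weight elements
    std::size_t b_size;  // number of bias elements
};

// Suspected pattern: the bias buffer is sized by the weight count.
std::vector<half_t> alloc_bias_suspect(const LayerSizes& s) {
    return alloc_half(s.w_size);
}

// Proposed pattern: the bias buffer is sized by the bias count.
std::vector<half_t> alloc_bias_fixed(const LayerSizes& s) {
    return alloc_half(s.b_size);
}
```

For a hypothetical 3x3 convolution with 64 input and 128 output channels (`LayerSizes{3 * 3 * 64 * 128, 128}`), the two functions return buffers of very different lengths. Whether this is what causes the reported error is not confirmed here.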
I have the same problem on an RTX 2070 with CUDA 11.8 in FP32 mode.
I get an error when running test_yolo4 on my laptop, which has an RTX 3060 GPU:
Running on commit d4f7b4ad8b21f1af78e1bada3e0368c0c1304ad9