Hey,
First of all, thank you for the great repo!
I'm trying to run inference via C++ LibTorch. On my laptop (RTX 2060, CUDA 11.3) everything works fine, but on the deployment PC (GTX 1650) I get different results depending on the setup:
- yolov5m, CUDA 11.3, libtorch-cxx11-abi-shared-with-deps-1.10.0+cu113: inference runs, but the model can't detect anything
- yolov5m, CUDA 10.2, libtorch-cxx11-abi-shared-with-deps-1.10.1+cu102: inference runs and there are detections, but post-processing takes 160 ms vs 4 ms for inference (`torch::masked_select` takes too long; see the sketch below)
- yolov5m, CUDA 10.2, LibTorch built from source: inference runs and there are detections, but post-processing takes 110 ms vs 10 ms for inference
- yolov5s, CUDA 10.2, LibTorch built from source: everything works fine in real time (the camera produces 30 fps)
**Note:** the Python `.pt` model works fine with any CUDA version.
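For reference, here is a minimal sketch of the kind of post-processing step that is slow on the GTX 1650. The tensor shape (a yolov5m-style `[1, 25200, 85]` output) and the 0.25 confidence threshold are assumptions for illustration, not my exact code:

```cpp
#include <torch/torch.h>
#include <chrono>
#include <iostream>

int main() {
    torch::NoGradGuard no_grad;

    // Stand-in for the raw yolov5m output: [batch, predictions, 85]
    auto pred = torch::rand({1, 25200, 85}, torch::kCUDA);

    auto start = std::chrono::steady_clock::now();

    // Keep rows whose objectness score (column 4) exceeds the threshold.
    auto conf_mask = pred.select(2, 4) > 0.25;                  // [1, 25200]
    auto kept = torch::masked_select(pred, conf_mask.unsqueeze(2))
                    .view({-1, 85});                            // this is the slow call

    torch::cuda::synchronize();  // wait for the GPU so the timing is meaningful
    auto end = std::chrono::steady_clock::now();
    std::cout << "masked_select: "
              << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count()
              << " ms\n";
    return 0;
}
```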
Any ideas what could cause this?