triple-Mu / YOLOv8-TensorRT

YOLOv8 accelerated with TensorRT!
MIT License

YOLOv8 TensorRT C++ classification not working correctly on a Jetson device #216

Closed: khanh0812 closed this issue 4 months ago

khanh0812 commented 4 months ago

Hi @triple-Mu, many thanks for your work! I converted a classification model to ONNX and then to TensorRT, but when deploying it in C++ on a Jetson device, the results did not match those from running in Python. Do you have any recommendations?

triple-Mu commented 4 months ago

> Hi @triple-Mu, many thanks for your work! I converted a classification model to ONNX and then to TensorRT, but when deploying it in C++ on a Jetson device, the results did not match those from running in Python. Do you have any recommendations?

Did you align the preprocessing?

khanh0812 commented 4 months ago

I mean I was using the original .pt YOLOv8 model to classify and then comparing against the result from C++. I followed the code in yolov8-cls.hpp:

```cpp
void YOLOv8Cls::copy_from_Mat(cv::Mat& image, cv::Size& size)
{
    cv::Mat nchw;
    // Convert the HWC BGR image into a normalized NCHW float blob (scale 1/255, swap R/B)
    cv::dnn::blobFromImage(image, nchw, 1 / 255.f, size, cv::Scalar(0, 0, 0), true, false, CV_32F);
    // this->letterbox(image, nchw, size);
    std::cout << "Original image size: " << image.size() << std::endl;
    std::cout << "Blob size: " << nchw.size[0] << " x " << nchw.size[1] << " x " << nchw.size[2] << " x "
              << nchw.size[3] << std::endl;
    // Tell TensorRT the actual input shape for this inference
    this->context->setBindingDimensions(0, nvinfer1::Dims{4, {1, 3, size.height, size.width}});
    // Asynchronously copy the blob into the device input buffer on the inference stream
    CHECK(cudaMemcpyAsync(
        this->device_ptrs[0], nchw.ptr<float>(), nchw.total() * nchw.elemSize(), cudaMemcpyHostToDevice,
        this->stream));
}
```

Do we need a script similar to export-det to export the ONNX model, or should we use the export script provided by Ultralytics?

Many thanks

triple-Mu commented 4 months ago

> I mean I was using the original .pt YOLOv8 model to classify and then comparing against the result from C++. I followed the code in yolov8-cls.hpp:
>
> ```cpp
> void YOLOv8Cls::copy_from_Mat(cv::Mat& image, cv::Size& size)
> {
>     cv::Mat nchw;
>     // Convert the HWC BGR image into a normalized NCHW float blob (scale 1/255, swap R/B)
>     cv::dnn::blobFromImage(image, nchw, 1 / 255.f, size, cv::Scalar(0, 0, 0), true, false, CV_32F);
>     // this->letterbox(image, nchw, size);
>     std::cout << "Original image size: " << image.size() << std::endl;
>     std::cout << "Blob size: " << nchw.size[0] << " x " << nchw.size[1] << " x " << nchw.size[2] << " x "
>               << nchw.size[3] << std::endl;
>     // Tell TensorRT the actual input shape for this inference
>     this->context->setBindingDimensions(0, nvinfer1::Dims{4, {1, 3, size.height, size.width}});
>     // Asynchronously copy the blob into the device input buffer on the inference stream
>     CHECK(cudaMemcpyAsync(
>         this->device_ptrs[0], nchw.ptr<float>(), nchw.total() * nchw.elemSize(), cudaMemcpyHostToDevice,
>         this->stream));
> }
> ```
>
> Do we need a script similar to export-det to export the ONNX model, or should we use the export script provided by Ultralytics?
>
> Many thanks

The Ultralytics classification preprocess includes a center crop, but it is not convenient to reproduce in C++.
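For reference, here is a minimal sketch of such a center crop in OpenCV, assuming the Ultralytics-style pipeline of resizing the shorter side to the target size and then cropping the center. The helper name is illustrative, and the exact scale and normalization constants should be verified against your Python preprocessing:

```cpp
#include <algorithm>
#include <cmath>
#include <opencv2/opencv.hpp>

// Illustrative helper: resize so the shorter side equals `crop`, then take the
// centered crop x crop region (torchvision-style Resize + CenterCrop).
static cv::Mat center_crop(const cv::Mat& image, int crop)
{
    const float scale = static_cast<float>(crop) / std::min(image.cols, image.rows);
    const int   w     = std::max(crop, static_cast<int>(std::round(image.cols * scale)));
    const int   h     = std::max(crop, static_cast<int>(std::round(image.rows * scale)));

    cv::Mat resized;
    cv::resize(image, resized, cv::Size(w, h), 0, 0, cv::INTER_LINEAR);

    const int x = (resized.cols - crop) / 2;
    const int y = (resized.rows - crop) / 2;
    return resized(cv::Rect(x, y, crop, crop)).clone();
}

// Usage in copy_from_Mat, before building the blob:
//   cv::Mat cropped = center_crop(image, size.height);
//   cv::dnn::blobFromImage(cropped, nchw, 1 / 255.f, size, cv::Scalar(0, 0, 0), true, false, CV_32F);
```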

khanh0812 commented 4 months ago

@triple-Mu so I need to add a center crop in C++ as preprocessing, am I right?

triple-Mu commented 4 months ago

> @triple-Mu so I need to add a center crop in C++ as preprocessing, am I right?

Maybe. You can save the model input tensor in both Python and C++ and compare them to make sure they are the same.
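One way to do that is to dump the C++ blob to a raw binary file right after preprocessing and compare it in Python with `np.fromfile(..., dtype=np.float32)` against a tensor saved via `tensor.numpy().tofile(...)`. A minimal sketch, with an illustrative helper name and file path:

```cpp
#include <cstdio>
#include <opencv2/core.hpp>

// Illustrative helper: write an NCHW float blob to a raw binary file so it can
// be diffed element-wise against the Python-side input tensor.
static void dump_blob(const cv::Mat& nchw, const char* path)
{
    if (FILE* f = std::fopen(path, "wb")) {
        std::fwrite(nchw.ptr<float>(), sizeof(float), nchw.total(), f);
        std::fclose(f);
    }
}

// e.g. dump_blob(nchw, "input_cpp.bin"); right before the cudaMemcpyAsync call
```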

khanh0812 commented 4 months ago

I added a CenterCrop preprocess and now it works just like the original Python. Many thanks, I am closing the issue.

triple-Mu commented 4 months ago

> I added a CenterCrop preprocess and now it works just like the original Python. Many thanks, I am closing the issue.

BTW, could you please open a PR for this change? I would be extremely grateful!

khanh0812 commented 4 months ago

Sure, I'll do it soon.