Then I export the model to ONNX:
python3 export_yoloV8.py -w testbest.pt --dynamic
Then I copy the generated ONNX model file and the labels.txt file (if generated) to the DeepStream-Yolo folder.
Then I build the custom parser library:
CUDA_VER=12.1 make -C nvdsinfer_custom_impl_Yolo
Already at this point I don't understand where the file model-engine-file=model_b1_gpu0_int8.engine is supposed to come from; all I have so far is the ONNX model and labels.txt.
I decided to change config_infer_primary_yoloV8.txt as follows
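What I changed were the INT8-related keys. As a sketch (using the standard Gst-nvinfer key names, not my verbatim file), the relevant part of config_infer_primary_yoloV8.txt looks roughly like this:

```ini
[property]
onnx-file=testbest.onnx
labels-file=labels.txt
# nvinfer generates this engine file itself on the first successful build
model-engine-file=model_b1_gpu0_int8.engine
# calibration table required when network-mode=1 (INT8)
int8-calib-file=calib.table
network-mode=1
batch-size=1
```

With this config, launching the app fails with the errors below: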
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:659 INT8 calibration file not specified/accessible. INT8 calibration can be done through setDynamicRange API in 'NvDsInferCreateNetwork' implementation
WARNING: [TRT]: onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
WARNING: [TRT]: onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: Tensor DataType is determined at build time for tensors not marked as input or output.
Building the TensorRT Engine
File does not exist: /opt/nvidia/deepstream/deepstream-6.3/sources/deepstream_python_apps/apps/deepstream-test3/calib/DeepStream-Yolo/calib.table
OpenCV is required to run INT8 calibrator
Failed to build CUDA engine
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:728 Failed to create network using custom network creation function
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:794 Failed to get cuda engine from custom library API
0:00:04.926481923 141577 0x55e8c64cf2c0 ERROR nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2022> [UID = 1]: build engine file failed
0:00:04.937754138 141577 0x55e8c64cf2c0 ERROR nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2108> [UID = 1]: build backend context failed
0:00:04.937771198 141577 0x55e8c64cf2c0 ERROR nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1282> [UID = 1]: generate backend failed, check config file settings
0:00:05.107022653 141577 0x55e8c64cf2c0 WARN nvinfer gstnvinfer.cpp:898:gst_nvinfer_start:<primary_gie> error: Failed to create NvDsInferContext instance
0:00:05.107039197 141577 0x55e8c64cf2c0 WARN nvinfer gstnvinfer.cpp:898:gst_nvinfer_start:<primary_gie> error: Config file path: /opt/nvidia/deepstream/deepstream-6.3/sources/deepstream_python_apps/apps/deepstream-test3/calib/DeepStream-Yolo/config_infer_primary_yoloV8.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
** ERROR: <main:716>: Failed to set pipeline to PAUSED
Quitting
nvstreammux: Successfully handled EOS for source_id=0
ERROR from primary_gie: Failed to create NvDsInferContext instance
Debug info: gstnvinfer.cpp(898): gst_nvinfer_start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInfer:primary_gie:
Config file path: /opt/nvidia/deepstream/deepstream-6.3/sources/deepstream_python_apps/apps/deepstream-test3/calib/DeepStream-Yolo/config_infer_primary_yoloV8.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
App run failed
Obviously, the calib.table file was not found, because it doesn't exist yet.
So I turn to INT8Calibration.md and run:
apt-get install libopencv-dev
then I do a clean, just in case:
CUDA_VER=12.1 OPENCV=1 make -C nvdsinfer_custom_impl_Yolo clean
then
CUDA_VER=12.1 OPENCV=1 make -C nvdsinfer_custom_impl_Yolo
At this stage I still don't have calib.table or model_b1_gpu0_int8.engine.
Next, I create a folder with calibration images and a txt file listing the paths to those images, and set the export variables.
The guide then says to edit the config_infer file; I have not found any file called config_infer, only config_infer_primary.txt, so that's the one I edit.
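Concretely, my setup step was along these lines (the INT8_CALIB_* variable names are the ones INT8Calibration.md uses; the folder layout and image count are illustrative):

```shell
# create a folder for calibration images (a representative sample of the dataset)
mkdir -p calibration
# ...copy a representative set of .jpg frames into calibration/ here...
# write the absolute image paths into a txt file for the calibrator
realpath calibration/*.jpg > calibration.txt 2>/dev/null || true
# variable names as used by DeepStream-Yolo's INT8Calibration.md
export INT8_CALIB_IMG_PATH=$(pwd)/calibration.txt
export INT8_CALIB_BATCH_SIZE=1
```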
After that, calibration starts and outputs the following:
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:659 INT8 calibration file not specified/accessible. INT8 calibration can be done through setDynamicRange API in 'NvDsInferCreateNetwork' implementation
WARNING: [TRT]: onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
WARNING: [TRT]: onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: Tensor DataType is determined at build time for tensors not marked as input or output.
Building the TensorRT Engine
File does not exist: /opt/nvidia/deepstream/deepstream-6.3/sources/deepstream_python_apps/apps/deepstream-test3/calib/DeepStream-Yolo/calib.table
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
Load image: /opt/nvidia/deepstream/deepstream-6.3/sources/deepstream_python_apps/apps/deepstream-test3/calib/DeepStream-Yolo/calibration/frame000042_07_18_2023_03_00_457947_SRC.jpg
Progress: 0.1%
Load image: /opt/nvidia/deepstream/deepstream-6.3/sources/deepstream_python_apps/apps/deepstream-test3/calib/DeepStream-Yolo/calibration/frame000042_07_18_2023_03_00_571936_SRC.jpg
Progress: 0.2%
Load image: /opt/nvidia/deepstream/deepstream-6.3/sources/deepstream_python_apps/apps/deepstream-test3/calib/DeepStream-Yolo/calibration/frame000042_07_18_2023_03_00_780444_SRC.jpg
Note the warnings; they persist throughout the run.
After that, I do get calib.table and model_b1_gpu0_int8.engine.
I run my DeepStream pipeline with the created calibration files; my config looks like this:
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
WARNING: [TRT]: onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
WARNING: [TRT]: onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: Tensor DataType is determined at build time for tensors not marked as input or output.
Building the TensorRT Engine
WARNING: [TRT]: Missing scale and zero-point for tensor /0/model.22/Squeeze_output_0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor /0/model.22/Squeeze_1_output_0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor /0/model.22/Squeeze_2_output_0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 367) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 368) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 378) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 382) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 383) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 393) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor /0/model.22/Reshape_output_0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor /0/model.22/Reshape_1_output_0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 442) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 443) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 453) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 457) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 458) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 468) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor /0/model.22/Reshape_5_output_0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor /0/model.22/Reshape_6_output_0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 517) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 518) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 528) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 532) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 533) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 543) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor /0/model.22/Reshape_10_output_0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor /0/model.22/Reshape_11_output_0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor /0/model.22/dfl/Softmax_output_0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 705) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor /1/ArgMax_output_0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
I get very serious warnings, the pipeline takes a very long time to start, and when it does run, performance is very low, on the order of 2-3 FPS. Moreover, it runs faster without calibration.
My GPU is a GTX 1070, and I'm developing in Docker.
How can I fix this, and what is the problem?