NVIDIA-AI-IOT / sdg_pallet_model

A pallet model trained with SDG, optimized for NVIDIA Jetson.

'NoneType' object has no attribute 'get_binding_index' after converting pallet_model_v1_all.onnx to pallet_model_v1_all.engine #7

Closed: monajalal closed this issue 8 months ago

monajalal commented 8 months ago
(sdgpose) mona@ada:~/sdg_pallet_model/torch2trt$ python3 setup.py develop
running develop
/home/mona/anaconda3/envs/sdgpose/lib/python3.10/site-packages/setuptools/command/develop.py:40: EasyInstallDeprecationWarning: easy_install command is deprecated.
!!

        ********************************************************************************
        Please avoid running ``setup.py`` and ``easy_install``.
        Instead, use pypa/build, pypa/installer or other
        standards-based tools.

        See https://github.com/pypa/setuptools/issues/917 for details.
        ********************************************************************************

!!
  easy_install.initialize_options(self)
/home/mona/anaconda3/envs/sdgpose/lib/python3.10/site-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!

        ********************************************************************************
        Please avoid running ``setup.py`` directly.
        Instead, use pypa/build, pypa/installer or other
        standards-based tools.

        See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
        ********************************************************************************

!!
  self.initialize_options()
running egg_info
writing torch2trt.egg-info/PKG-INFO
writing dependency_links to torch2trt.egg-info/dependency_links.txt
writing top-level names to torch2trt.egg-info/top_level.txt
reading manifest file 'torch2trt.egg-info/SOURCES.txt'
adding license file 'LICENSE.md'
writing manifest file 'torch2trt.egg-info/SOURCES.txt'
running build_ext
Creating /home/mona/anaconda3/envs/sdgpose/lib/python3.10/site-packages/torch2trt.egg-link (link to .)
torch2trt 0.4.0 is already the active version in easy-install.pth

Installed /home/mona/sdg_pallet_model/torch2trt
Processing dependencies for torch2trt==0.4.0
Finished processing dependencies for torch2trt==0.4.0
(sdgpose) mona@ada:~/sdg_pallet_model/torch2trt$ cd ..
(sdgpose) mona@ada:~/sdg_pallet_model$ ./build_trt_fp16.sh pallet_model_v1_all.onnx pallet_model_v1_all.engine
&&&& RUNNING TensorRT.trtexec [TensorRT v8601] # /usr/src/tensorrt/bin/trtexec --onnx=pallet_model_v1_all.onnx --minShapes=input:1x3x192x192 --maxShapes=input:1x3x1536x1536 --optShapes=input:1x3x256x256 --saveEngine=pallet_model_v1_all.engine --fp16
[01/29/2024-16:34:06] [I] === Model Options ===
[01/29/2024-16:34:06] [I] Format: ONNX
[01/29/2024-16:34:06] [I] Model: pallet_model_v1_all.onnx
[01/29/2024-16:34:06] [I] Output:
[01/29/2024-16:34:06] [I] === Build Options ===
[01/29/2024-16:34:06] [I] Max batch: explicit batch
[01/29/2024-16:34:06] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[01/29/2024-16:34:06] [I] minTiming: 1
[01/29/2024-16:34:06] [I] avgTiming: 8
[01/29/2024-16:34:06] [I] Precision: FP32+FP16
[01/29/2024-16:34:06] [I] LayerPrecisions: 
[01/29/2024-16:34:06] [I] Layer Device Types: 
[01/29/2024-16:34:06] [I] Calibration: 
[01/29/2024-16:34:06] [I] Refit: Disabled
[01/29/2024-16:34:06] [I] Version Compatible: Disabled
[01/29/2024-16:34:06] [I] TensorRT runtime: full
[01/29/2024-16:34:06] [I] Lean DLL Path: 
[01/29/2024-16:34:06] [I] Tempfile Controls: { in_memory: allow, temporary: allow }
[01/29/2024-16:34:06] [I] Exclude Lean Runtime: Disabled
[01/29/2024-16:34:06] [I] Sparsity: Disabled
[01/29/2024-16:34:06] [I] Safe mode: Disabled
[01/29/2024-16:34:06] [I] Build DLA standalone loadable: Disabled
[01/29/2024-16:34:06] [I] Allow GPU fallback for DLA: Disabled
[01/29/2024-16:34:06] [I] DirectIO mode: Disabled
[01/29/2024-16:34:06] [I] Restricted mode: Disabled
[01/29/2024-16:34:06] [I] Skip inference: Disabled
[01/29/2024-16:34:06] [I] Save engine: pallet_model_v1_all.engine
[01/29/2024-16:34:06] [I] Load engine: 
[01/29/2024-16:34:06] [I] Profiling verbosity: 0
[01/29/2024-16:34:06] [I] Tactic sources: Using default tactic sources
[01/29/2024-16:34:06] [I] timingCacheMode: local
[01/29/2024-16:34:06] [I] timingCacheFile: 
[01/29/2024-16:34:06] [I] Heuristic: Disabled
[01/29/2024-16:34:06] [I] Preview Features: Use default preview flags.
[01/29/2024-16:34:06] [I] MaxAuxStreams: -1
[01/29/2024-16:34:06] [I] BuilderOptimizationLevel: -1
[01/29/2024-16:34:06] [I] Input(s)s format: fp32:CHW
[01/29/2024-16:34:06] [I] Output(s)s format: fp32:CHW
[01/29/2024-16:34:06] [I] Input build shape: input=1x3x192x192+1x3x256x256+1x3x1536x1536
[01/29/2024-16:34:06] [I] Input calibration shapes: model
[01/29/2024-16:34:06] [I] === System Options ===
[01/29/2024-16:34:06] [I] Device: 0
[01/29/2024-16:34:06] [I] DLACore: 
[01/29/2024-16:34:06] [I] Plugins:
[01/29/2024-16:34:06] [I] setPluginsToSerialize:
[01/29/2024-16:34:06] [I] dynamicPlugins:
[01/29/2024-16:34:06] [I] ignoreParsedPluginLibs: 0
[01/29/2024-16:34:06] [I] 
[01/29/2024-16:34:06] [I] === Inference Options ===
[01/29/2024-16:34:06] [I] Batch: Explicit
[01/29/2024-16:34:06] [I] Input inference shape: input=1x3x256x256
[01/29/2024-16:34:06] [I] Iterations: 10
[01/29/2024-16:34:06] [I] Duration: 3s (+ 200ms warm up)
[01/29/2024-16:34:06] [I] Sleep time: 0ms
[01/29/2024-16:34:06] [I] Idle time: 0ms
[01/29/2024-16:34:06] [I] Inference Streams: 1
[01/29/2024-16:34:06] [I] ExposeDMA: Disabled
[01/29/2024-16:34:06] [I] Data transfers: Enabled
[01/29/2024-16:34:06] [I] Spin-wait: Disabled
[01/29/2024-16:34:06] [I] Multithreading: Disabled
[01/29/2024-16:34:06] [I] CUDA Graph: Disabled
[01/29/2024-16:34:06] [I] Separate profiling: Disabled
[01/29/2024-16:34:06] [I] Time Deserialize: Disabled
[01/29/2024-16:34:06] [I] Time Refit: Disabled
[01/29/2024-16:34:06] [I] NVTX verbosity: 0
[01/29/2024-16:34:06] [I] Persistent Cache Ratio: 0
[01/29/2024-16:34:06] [I] Inputs:
[01/29/2024-16:34:06] [I] === Reporting Options ===
[01/29/2024-16:34:06] [I] Verbose: Disabled
[01/29/2024-16:34:06] [I] Averages: 10 inferences
[01/29/2024-16:34:06] [I] Percentiles: 90,95,99
[01/29/2024-16:34:06] [I] Dump refittable layers:Disabled
[01/29/2024-16:34:06] [I] Dump output: Disabled
[01/29/2024-16:34:06] [I] Profile: Disabled
[01/29/2024-16:34:06] [I] Export timing to JSON file: 
[01/29/2024-16:34:06] [I] Export output to JSON file: 
[01/29/2024-16:34:06] [I] Export profile to JSON file: 
[01/29/2024-16:34:06] [I] 
[01/29/2024-16:34:06] [I] === Device Information ===
[01/29/2024-16:34:06] [I] Selected Device: NVIDIA RTX 6000 Ada Generation
[01/29/2024-16:34:06] [I] Compute Capability: 8.9
[01/29/2024-16:34:06] [I] SMs: 142
[01/29/2024-16:34:06] [I] Device Global Memory: 48624 MiB
[01/29/2024-16:34:06] [I] Shared Memory per SM: 100 KiB
[01/29/2024-16:34:06] [I] Memory Bus Width: 384 bits (ECC disabled)
[01/29/2024-16:34:06] [I] Application Compute Clock Rate: 2.505 GHz
[01/29/2024-16:34:06] [I] Application Memory Clock Rate: 10.001 GHz
[01/29/2024-16:34:06] [I] 
[01/29/2024-16:34:06] [I] Note: The application clock rates do not reflect the actual clock rates that the GPU is currently running at.
[01/29/2024-16:34:06] [I] 
[01/29/2024-16:34:06] [I] TensorRT version: 8.6.1
[01/29/2024-16:34:06] [I] Loading standard plugins
[01/29/2024-16:34:07] [I] [TRT] [MemUsageChange] Init CUDA: CPU +1, GPU +0, now: CPU 18, GPU 6429 (MiB)
[01/29/2024-16:34:11] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +1445, GPU +268, now: CPU 1540, GPU 6677 (MiB)
[01/29/2024-16:34:11] [I] Start parsing network model.
[01/29/2024-16:34:11] [I] [TRT] ----------------------------------------------------------------
[01/29/2024-16:34:11] [I] [TRT] Input filename:   pallet_model_v1_all.onnx
[01/29/2024-16:34:11] [I] [TRT] ONNX IR version:  0.0.7
[01/29/2024-16:34:11] [I] [TRT] Opset version:    14
[01/29/2024-16:34:11] [I] [TRT] Producer name:    pytorch
[01/29/2024-16:34:11] [I] [TRT] Producer version: 2.0.0
[01/29/2024-16:34:11] [I] [TRT] Domain:           
[01/29/2024-16:34:11] [I] [TRT] Model version:    0
[01/29/2024-16:34:11] [I] [TRT] Doc string:       
[01/29/2024-16:34:11] [I] [TRT] ----------------------------------------------------------------
[01/29/2024-16:34:11] [I] Finished parsing network model. Parse time: 0.156648
[01/29/2024-16:34:11] [I] [TRT] Graph optimization time: 0.0394493 seconds.
[01/29/2024-16:34:11] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[01/29/2024-16:34:30] [I] [TRT] Detected 1 inputs and 2 output network tensors.
[01/29/2024-16:34:30] [I] [TRT] Total Host Persistent Memory: 337808
[01/29/2024-16:34:30] [I] [TRT] Total Device Persistent Memory: 0
[01/29/2024-16:34:30] [I] [TRT] Total Scratch Memory: 0
[01/29/2024-16:34:30] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 55 MiB, GPU 576 MiB
[01/29/2024-16:34:30] [I] [TRT] [BlockAssignment] Started assigning block shifts. This will take 86 steps to complete.
[01/29/2024-16:34:30] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 1.32485ms to assign 8 blocks to 86 nodes requiring 450625536 bytes.
[01/29/2024-16:34:30] [I] [TRT] Total Activation Memory: 450625536
[01/29/2024-16:34:30] [W] [TRT] TensorRT encountered issues when converting weights between types and that could affect accuracy.
[01/29/2024-16:34:30] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
[01/29/2024-16:34:30] [W] [TRT] Check verbose logs for the list of affected weights.
[01/29/2024-16:34:30] [W] [TRT] - 63 weights are affected by this issue: Detected subnormal FP16 values.
[01/29/2024-16:34:30] [W] [TRT] - 27 weights are affected by this issue: Detected values less than smallest positive FP16 subnormal value and converted them to the FP16 minimum subnormalized value.
[01/29/2024-16:34:30] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +54, GPU +55, now: CPU 54, GPU 55 (MiB)
[01/29/2024-16:34:30] [I] Engine built in 23.3684 sec.
[01/29/2024-16:34:30] [I] [TRT] Loaded engine size: 55 MiB
[01/29/2024-16:34:30] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +54, now: CPU 0, GPU 54 (MiB)
[01/29/2024-16:34:30] [I] Engine deserialized in 0.0110032 sec.
[01/29/2024-16:34:30] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +430, now: CPU 0, GPU 484 (MiB)
[01/29/2024-16:34:30] [I] Setting persistentCacheLimit to 0 bytes.
[01/29/2024-16:34:30] [I] Using random values for input input
[01/29/2024-16:34:30] [I] Input binding for input with dimensions 1x3x256x256 is created.
[01/29/2024-16:34:30] [I] Output binding for heatmap with dimensions 1x1x256x256 is created.
[01/29/2024-16:34:30] [I] Output binding for vectormap with dimensions 1x16x256x256 is created.
[01/29/2024-16:34:30] [I] Starting inference
[01/29/2024-16:34:33] [I] Warmup completed 355 queries over 200 ms
[01/29/2024-16:34:33] [I] Timing trace has 5429 queries over 3.00161 s
[01/29/2024-16:34:33] [I] 
[01/29/2024-16:34:33] [I] === Trace details ===
[01/29/2024-16:34:33] [I] Trace averages of 10 runs:
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547552 ms - Host latency: 0.757536 ms (enqueue 0.187546 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.548146 ms - Host latency: 0.7582 ms (enqueue 0.184897 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547354 ms - Host latency: 0.75753 ms (enqueue 0.1847 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.54774 ms - Host latency: 0.757652 ms (enqueue 0.199054 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547131 ms - Host latency: 0.75705 ms (enqueue 0.193877 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547876 ms - Host latency: 0.75798 ms (enqueue 0.182571 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547455 ms - Host latency: 0.757665 ms (enqueue 0.183646 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547844 ms - Host latency: 0.758112 ms (enqueue 0.191174 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547565 ms - Host latency: 0.758234 ms (enqueue 0.205444 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547774 ms - Host latency: 0.7578 ms (enqueue 0.183731 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547949 ms - Host latency: 0.75784 ms (enqueue 0.186914 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547552 ms - Host latency: 0.757632 ms (enqueue 0.188785 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547556 ms - Host latency: 0.757397 ms (enqueue 0.198514 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.54754 ms - Host latency: 0.757153 ms (enqueue 0.197452 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547665 ms - Host latency: 0.757666 ms (enqueue 0.189517 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547324 ms - Host latency: 0.757126 ms (enqueue 0.182126 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547748 ms - Host latency: 0.757593 ms (enqueue 0.19024 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547772 ms - Host latency: 0.757434 ms (enqueue 0.195593 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547757 ms - Host latency: 0.757861 ms (enqueue 0.19509 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547946 ms - Host latency: 0.757706 ms (enqueue 0.197748 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547736 ms - Host latency: 0.758508 ms (enqueue 0.196561 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547656 ms - Host latency: 0.757281 ms (enqueue 0.196204 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547989 ms - Host latency: 0.760202 ms (enqueue 0.264929 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547665 ms - Host latency: 0.757971 ms (enqueue 0.212656 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547684 ms - Host latency: 0.757532 ms (enqueue 0.19902 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.54765 ms - Host latency: 0.757883 ms (enqueue 0.197873 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547537 ms - Host latency: 0.757294 ms (enqueue 0.199374 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547986 ms - Host latency: 0.757687 ms (enqueue 0.199512 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.548065 ms - Host latency: 0.758246 ms (enqueue 0.198218 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547363 ms - Host latency: 0.757019 ms (enqueue 0.198846 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547842 ms - Host latency: 0.757852 ms (enqueue 0.19913 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547757 ms - Host latency: 0.757581 ms (enqueue 0.199203 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547861 ms - Host latency: 0.758038 ms (enqueue 0.199722 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.54747 ms - Host latency: 0.757187 ms (enqueue 0.194519 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.577161 ms - Host latency: 0.787268 ms (enqueue 0.189795 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.623227 ms - Host latency: 0.839078 ms (enqueue 0.297061 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547943 ms - Host latency: 0.757877 ms (enqueue 0.227328 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547641 ms - Host latency: 0.757196 ms (enqueue 0.233789 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547989 ms - Host latency: 0.758792 ms (enqueue 0.238837 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547455 ms - Host latency: 0.757523 ms (enqueue 0.23381 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.54776 ms - Host latency: 0.757559 ms (enqueue 0.228494 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547949 ms - Host latency: 0.758142 ms (enqueue 0.227197 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.54776 ms - Host latency: 0.757794 ms (enqueue 0.22504 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547372 ms - Host latency: 0.757223 ms (enqueue 0.2233 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547488 ms - Host latency: 0.757367 ms (enqueue 0.2228 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547632 ms - Host latency: 0.757449 ms (enqueue 0.223288 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.54744 ms - Host latency: 0.757471 ms (enqueue 0.224109 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547641 ms - Host latency: 0.757263 ms (enqueue 0.224704 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547635 ms - Host latency: 0.757568 ms (enqueue 0.225565 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.569369 ms - Host latency: 0.781287 ms (enqueue 0.343976 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547635 ms - Host latency: 0.757867 ms (enqueue 0.242303 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547272 ms - Host latency: 0.757162 ms (enqueue 0.194019 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547955 ms - Host latency: 0.757901 ms (enqueue 0.199579 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547647 ms - Host latency: 0.757571 ms (enqueue 0.194583 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.57037 ms - Host latency: 0.783933 ms (enqueue 0.314404 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.54827 ms - Host latency: 0.758276 ms (enqueue 0.23847 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547711 ms - Host latency: 0.757532 ms (enqueue 0.226288 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.552771 ms - Host latency: 0.766254 ms (enqueue 0.392047 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547638 ms - Host latency: 0.757404 ms (enqueue 0.226453 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547223 ms - Host latency: 0.757196 ms (enqueue 0.224908 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547852 ms - Host latency: 0.757819 ms (enqueue 0.225287 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.589569 ms - Host latency: 0.804449 ms (enqueue 0.437915 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547571 ms - Host latency: 0.757434 ms (enqueue 0.202399 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547998 ms - Host latency: 0.757782 ms (enqueue 0.196265 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.548041 ms - Host latency: 0.757709 ms (enqueue 0.190485 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547729 ms - Host latency: 0.757947 ms (enqueue 0.201123 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547662 ms - Host latency: 0.757806 ms (enqueue 0.209058 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.548022 ms - Host latency: 0.75766 ms (enqueue 0.202356 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.548041 ms - Host latency: 0.757837 ms (enqueue 0.201343 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547864 ms - Host latency: 0.757733 ms (enqueue 0.203839 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547662 ms - Host latency: 0.757587 ms (enqueue 0.202478 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547534 ms - Host latency: 0.75733 ms (enqueue 0.203149 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.54765 ms - Host latency: 0.757617 ms (enqueue 0.205286 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547559 ms - Host latency: 0.757373 ms (enqueue 0.200842 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547754 ms - Host latency: 0.757495 ms (enqueue 0.203967 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547644 ms - Host latency: 0.757349 ms (enqueue 0.201031 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547559 ms - Host latency: 0.75741 ms (enqueue 0.191016 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547736 ms - Host latency: 0.757684 ms (enqueue 0.20318 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547986 ms - Host latency: 0.757861 ms (enqueue 0.198883 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547955 ms - Host latency: 0.757941 ms (enqueue 0.189685 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547571 ms - Host latency: 0.757434 ms (enqueue 0.191522 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547742 ms - Host latency: 0.75752 ms (enqueue 0.19801 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547668 ms - Host latency: 0.757526 ms (enqueue 0.193933 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547223 ms - Host latency: 0.75683 ms (enqueue 0.190552 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547418 ms - Host latency: 0.757465 ms (enqueue 0.200928 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547742 ms - Host latency: 0.757587 ms (enqueue 0.199487 ms)
[... additional trace averages omitted ...]
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547803 ms - Host latency: 0.757349 ms (enqueue 0.225757 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.548047 ms - Host latency: 0.758594 ms (enqueue 0.225464 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.548071 ms - Host latency: 0.75769 ms (enqueue 0.226978 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547852 ms - Host latency: 0.757812 ms (enqueue 0.226978 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547925 ms - Host latency: 0.758008 ms (enqueue 0.225684 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547827 ms - Host latency: 0.757837 ms (enqueue 0.22688 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547876 ms - Host latency: 0.758398 ms (enqueue 0.225586 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.584546 ms - Host latency: 0.799658 ms (enqueue 0.343042 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547461 ms - Host latency: 0.757153 ms (enqueue 0.226416 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547705 ms - Host latency: 0.758008 ms (enqueue 0.233301 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.546655 ms - Host latency: 0.756714 ms (enqueue 0.316675 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547778 ms - Host latency: 0.757642 ms (enqueue 0.225806 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.546997 ms - Host latency: 0.756665 ms (enqueue 0.227246 ms)
[01/29/2024-16:34:33] [I] Average on 10 runs - GPU latency: 0.547607 ms - Host latency: 0.757593 ms (enqueue 0.226562 ms)
[01/29/2024-16:34:33] [I] 
[01/29/2024-16:34:33] [I] === Performance summary ===
[01/29/2024-16:34:33] [I] Throughput: 1808.7 qps
[01/29/2024-16:34:33] [I] Latency: min = 0.754883 ms, max = 1.15161 ms, mean = 0.760292 ms, median = 0.757523 ms, percentile(90%) = 0.759277 ms, percentile(95%) = 0.759949 ms, percentile(99%) = 0.891968 ms
[01/29/2024-16:34:33] [I] Enqueue Time: min = 0.176758 ms, max = 1.13379 ms, mean = 0.210329 ms, median = 0.196289 ms, percentile(90%) = 0.228027 ms, percentile(95%) = 0.23645 ms, percentile(99%) = 0.698364 ms
[01/29/2024-16:34:33] [I] H2D Latency: min = 0.0349121 ms, max = 0.0947266 ms, mean = 0.0357133 ms, median = 0.0355225 ms, percentile(90%) = 0.0356445 ms, percentile(95%) = 0.0357666 ms, percentile(99%) = 0.0422363 ms
[01/29/2024-16:34:33] [I] GPU Compute Time: min = 0.545654 ms, max = 0.899414 ms, mean = 0.550035 ms, median = 0.547852 ms, percentile(90%) = 0.548859 ms, percentile(95%) = 0.549072 ms, percentile(99%) = 0.666626 ms
[01/29/2024-16:34:33] [I] D2H Latency: min = 0.173096 ms, max = 0.210083 ms, mean = 0.174537 ms, median = 0.174072 ms, percentile(90%) = 0.175537 ms, percentile(95%) = 0.175781 ms, percentile(99%) = 0.180664 ms
[01/29/2024-16:34:33] [I] Total Host Walltime: 3.00161 s
[01/29/2024-16:34:33] [I] Total GPU Compute Time: 2.98614 s
[01/29/2024-16:34:33] [W] * GPU compute time is unstable, with coefficient of variance = 3.64324%.
[01/29/2024-16:34:33] [W]   If not already in use, locking GPU clock frequency or adding --useSpinWait may improve the stability.
[01/29/2024-16:34:33] [I] Explanations of the performance metrics are printed in the verbose logs.
[01/29/2024-16:34:33] [I] 
&&&& PASSED TensorRT.trtexec [TensorRT v8601] # /usr/src/tensorrt/bin/trtexec --onnx=pallet_model_v1_all.onnx --minShapes=input:1x3x192x192 --maxShapes=input:1x3x1536x1536 --optShapes=input:1x3x256x256 --saveEngine=pallet_model_v1_all.engine --fp16

Then, when I use the engine, I get this error:

root@62abf45c3b89:/workspace# python test.py 
[01/29/2024-21:36:52] [TRT] [E] 6: The engine plan file is not compatible with this version of TensorRT, expecting library version 8.6.1.2 got 8.6.1.6, please rebuild.
[01/29/2024-21:36:52] [TRT] [E] 2: [engine.cpp::deserializeEngine::951] Error Code 2: Internal Error (Assertion engine->deserialize(start, size, allocator, runtime) failed. )
/usr/local/lib/python3.8/dist-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3526.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
'NoneType' object has no attribute 'get_binding_index'
'NoneType' object has no attribute 'get_binding_index'
'NoneType' object has no attribute 'get_binding_index'
'NoneType' object has no attribute 'get_binding_index'
'NoneType' object has no attribute 'get_binding_index'
'NoneType' object has no attribute 'get_binding_index'
'NoneType' object has no attribute 'get_binding_index'
'NoneType' object has no attribute 'get_binding_index'
'NoneType' object has no attribute 'get_binding_index'
'NoneType' object has no attribute 'get_binding_index'

Could you please help me figure out how to fix this? Thank you.
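
For reference, a minimal check along these lines (a sketch assuming the standard tensorrt Python API; the engine filename and the "input" tensor name are taken from the log above) should confirm whether deserialization is failing and returning None, which would explain the repeated get_binding_index error:

import tensorrt as trt

# Compare this version inside and outside the container.
print("TensorRT runtime version:", trt.__version__)

logger = trt.Logger(trt.Logger.INFO)
runtime = trt.Runtime(logger)

with open("pallet_model_v1_all.engine", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

# If the engine was serialized with a different TensorRT version,
# deserialize_cuda_engine returns None, and any later call such as
# engine.get_binding_index("input") raises the AttributeError shown above.
if engine is None:
    raise RuntimeError("Engine failed to deserialize; rebuild it with this TensorRT version.")

print("input binding index:", engine.get_binding_index("input"))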

jaybdub commented 8 months ago

Hi @monajalal ,

Judging from the error, it looks like the engine was built with a different version of TensorRT than the one it is being executed with.

If this is the case, could you try re-building the TensorRT engine in the environment that you're running the engine from?
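
The simplest option is to re-run ./build_trt_fp16.sh inside that same container. As a rough sketch (not the repo's own script; it assumes the standard TensorRT Python builder API and mirrors the dynamic shapes from the trtexec command in your log), the rebuild could also be done directly in Python:

import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

# Parse the ONNX model exported for this repo.
with open("pallet_model_v1_all.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError("ONNX parse failed: " + str(parser.get_error(0)))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)

# Same min/opt/max dynamic-shape profile as the trtexec command above.
profile = builder.create_optimization_profile()
profile.set_shape("input", (1, 3, 192, 192), (1, 3, 256, 256), (1, 3, 1536, 1536))
config.add_optimization_profile(profile)

# Serialize and save the engine with the TensorRT version of this environment.
engine_bytes = builder.build_serialized_network(network, config)
with open("pallet_model_v1_all.engine", "wb") as f:
    f.write(engine_bytes)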

Hope this helps. Let me know if it works for you, or if you run into issues or have questions.

Best, John

monajalal commented 8 months ago

My bad, I was mistakenly rebuilding the engine from outside the Docker container. The problem I emailed you about is a separate issue and still persists.