I've tested the imagenet example on the Jetson Orin Nano Developer Kit, but it doesn't work correctly.
The log is shown below. It looks like a TensorRT error to me. Do you have any idea what's going wrong?
$ imagenet data/images/orange_0.jpg data/images/test/output_0.jpg
[video] created imageLoader from file:///home/youtalk/src/jetson-inference/data/images/orange_0.jpg
------------------------------------------------
imageLoader video options:
------------------------------------------------
-- URI: file:///home/youtalk/src/jetson-inference/data/images/orange_0.jpg
- protocol: file
- location: data/images/orange_0.jpg
- extension: jpg
-- deviceType: file
-- ioType: input
-- codec: unknown
-- codecType: v4l2
-- frameRate: 0
-- numBuffers: 4
-- zeroCopy: true
-- flipMethod: none
-- loop: 0
------------------------------------------------
[video] created imageWriter from file:///home/youtalk/src/jetson-inference/data/images/test/output_0.jpg
------------------------------------------------
imageWriter video options:
------------------------------------------------
-- URI: file:///home/youtalk/src/jetson-inference/data/images/test/output_0.jpg
- protocol: file
- location: data/images/test/output_0.jpg
- extension: jpg
-- deviceType: file
-- ioType: output
-- codec: unknown
-- codecType: v4l2
-- frameRate: 0
-- bitRate: 0
-- numBuffers: 4
-- zeroCopy: true
------------------------------------------------
[OpenGL] glDisplay -- X screen 0 resolution: 1920x1080
[OpenGL] glDisplay -- X window resolution: 1920x1080
[OpenGL] glDisplay -- display device initialized (1920x1080)
[video] created glDisplay from display://0
------------------------------------------------
glDisplay video options:
------------------------------------------------
-- URI: display://0
- protocol: display
- location: 0
-- deviceType: display
-- ioType: output
-- width: 1920
-- height: 1080
-- frameRate: 0
-- numBuffers: 4
-- zeroCopy: true
------------------------------------------------
imageNet -- loading classification network model from:
-- prototxt networks/Googlenet/googlenet.prototxt
-- model networks/Googlenet/bvlc_googlenet.caffemodel
-- class_labels networks/ilsvrc12_synset_words.txt
-- input_blob 'data'
-- output_blob 'prob'
-- batch_size 1
[TRT] TensorRT version 8.5.2
[TRT] loading NVIDIA plugins...
[TRT] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
[TRT] Registered plugin creator - ::BatchedNMS_TRT version 1
[TRT] Registered plugin creator - ::BatchTilePlugin_TRT version 1
[TRT] Registered plugin creator - ::Clip_TRT version 1
[TRT] Registered plugin creator - ::CoordConvAC version 1
[TRT] Registered plugin creator - ::CropAndResizeDynamic version 1
[TRT] Registered plugin creator - ::CropAndResize version 1
[TRT] Registered plugin creator - ::DecodeBbox3DPlugin version 1
[TRT] Registered plugin creator - ::DetectionLayer_TRT version 1
[TRT] Registered plugin creator - ::EfficientNMS_Explicit_TF_TRT version 1
[TRT] Registered plugin creator - ::EfficientNMS_Implicit_TF_TRT version 1
[TRT] Registered plugin creator - ::EfficientNMS_ONNX_TRT version 1
[TRT] Registered plugin creator - ::EfficientNMS_TRT version 1
[TRT] Could not register plugin creator - ::FlattenConcat_TRT version 1
[TRT] Registered plugin creator - ::GenerateDetection_TRT version 1
[TRT] Registered plugin creator - ::GridAnchor_TRT version 1
[TRT] Registered plugin creator - ::GridAnchorRect_TRT version 1
[TRT] Registered plugin creator - ::GroupNorm version 1
[TRT] Registered plugin creator - ::InstanceNormalization_TRT version 1
[TRT] Registered plugin creator - ::InstanceNormalization_TRT version 2
[TRT] Registered plugin creator - ::LayerNorm version 1
[TRT] Registered plugin creator - ::LReLU_TRT version 1
[TRT] Registered plugin creator - ::MultilevelCropAndResize_TRT version 1
[TRT] Registered plugin creator - ::MultilevelProposeROI_TRT version 1
[TRT] Registered plugin creator - ::MultiscaleDeformableAttnPlugin_TRT version 1
[TRT] Registered plugin creator - ::NMSDynamic_TRT version 1
[TRT] Registered plugin creator - ::NMS_TRT version 1
[TRT] Registered plugin creator - ::Normalize_TRT version 1
[TRT] Registered plugin creator - ::PillarScatterPlugin version 1
[TRT] Registered plugin creator - ::PriorBox_TRT version 1
[TRT] Registered plugin creator - ::ProposalDynamic version 1
[TRT] Registered plugin creator - ::ProposalLayer_TRT version 1
[TRT] Registered plugin creator - ::Proposal version 1
[TRT] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[TRT] Registered plugin creator - ::Region_TRT version 1
[TRT] Registered plugin creator - ::Reorg_TRT version 1
[TRT] Registered plugin creator - ::ResizeNearest_TRT version 1
[TRT] Registered plugin creator - ::ROIAlign_TRT version 1
[TRT] Registered plugin creator - ::RPROI_TRT version 1
[TRT] Registered plugin creator - ::ScatterND version 1
[TRT] Registered plugin creator - ::SeqLen2Spatial version 1
[TRT] Registered plugin creator - ::SpecialSlice_TRT version 1
[TRT] Registered plugin creator - ::SplitGeLU version 1
[TRT] Registered plugin creator - ::Split version 1
[TRT] Registered plugin creator - ::VoxelGeneratorPlugin version 1
[TRT] detected model format - caffe (extension '.caffemodel')
[TRT] desired precision specified for GPU: FASTEST
[TRT] requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT] [MemUsageChange] Init CUDA: CPU +215, GPU +0, now: CPU 258, GPU 2773 (MiB)
[TRT] Trying to load shared library libnvinfer_builder_resource.so.8.5.2
[TRT] Loaded shared library libnvinfer_builder_resource.so.8.5.2
[TRT] [MemUsageChange] Init builder kernel library: CPU +302, GPU +430, now: CPU 582, GPU 3225 (MiB)
[TRT] native precisions detected for GPU: FP32, FP16, INT8
[TRT] selecting fastest native precision for GPU: FP16
[TRT] could not find engine cache /usr/local/bin/networks/Googlenet/bvlc_googlenet.caffemodel.1.1.8502.GPU.FP16.engine
[TRT] cache file invalid, profiling network model on device GPU
[TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 288, GPU 3226 (MiB)
[TRT] Trying to load shared library libnvinfer_builder_resource.so.8.5.2
[TRT] Loaded shared library libnvinfer_builder_resource.so.8.5.2
[TRT] [MemUsageChange] Init builder kernel library: CPU +295, GPU +27, now: CPU 583, GPU 3260 (MiB)
[TRT] The implicit batch dimension mode has been deprecated. Please create the network with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag whenever possible.
[TRT] device GPU, loading /usr/local/bin/networks/Googlenet/googlenet.prototxt /usr/local/bin/networks/Googlenet/bvlc_googlenet.caffemodel
[TRT] device GPU, configuring network builder
[TRT] device GPU, building FP16: ON
[TRT] device GPU, building INT8: OFF
[TRT] device GPU, workspace size: 33554432
[TRT] device GPU, building CUDA engine (this may take a few minutes the first time a network is loaded)
[TRT] Original: 141 layers
[TRT] After dead-layer removal: 141 layers
[TRT] Applying generic optimizations to the graph for inference.
[TRT] Running: FCToConvTransform on loss3/classifier
[TRT] Convert layer type of loss3/classifier from FULLY_CONNECTED to CONVOLUTION
[TRT] Running: ShuffleErasure on shuffle_between_pool5/7x7_s1_and_loss3/classifier
[TRT] Removing shuffle_between_pool5/7x7_s1_and_loss3/classifier
[TRT] Applying ScaleNodes fusions.
[TRT] After scale fusion: 141 layers
[TRT] Running: ConvReluFusion on conv1/7x7_s2
[TRT] ConvReluFusion: Fusing conv1/7x7_s2 with conv1/relu_7x7
[TRT] Running: ConvReluFusion on conv2/3x3_reduce
[TRT] ConvReluFusion: Fusing conv2/3x3_reduce with conv2/relu_3x3_reduce
[TRT] Running: ConvReluFusion on conv2/3x3
[TRT] ConvReluFusion: Fusing conv2/3x3 with conv2/relu_3x3
[TRT] Running: ConvReluFusion on inception_3a/1x1
[TRT] ConvReluFusion: Fusing inception_3a/1x1 with inception_3a/relu_1x1
[TRT] Running: ConvReluFusion on inception_3a/3x3_reduce
[TRT] ConvReluFusion: Fusing inception_3a/3x3_reduce with inception_3a/relu_3x3_reduce
[TRT] Running: ConvReluFusion on inception_3a/3x3
[TRT] ConvReluFusion: Fusing inception_3a/3x3 with inception_3a/relu_3x3
[TRT] Running: ConvReluFusion on inception_3a/5x5_reduce
[TRT] ConvReluFusion: Fusing inception_3a/5x5_reduce with inception_3a/relu_5x5_reduce
[TRT] Running: ConvReluFusion on inception_3a/5x5
[TRT] ConvReluFusion: Fusing inception_3a/5x5 with inception_3a/relu_5x5
[TRT] Running: ConvReluFusion on inception_3a/pool_proj
[TRT] ConvReluFusion: Fusing inception_3a/pool_proj with inception_3a/relu_pool_proj
[TRT] Running: ConvReluFusion on inception_3b/1x1
[TRT] ConvReluFusion: Fusing inception_3b/1x1 with inception_3b/relu_1x1
[TRT] Running: ConvReluFusion on inception_3b/3x3_reduce
[TRT] ConvReluFusion: Fusing inception_3b/3x3_reduce with inception_3b/relu_3x3_reduce
[TRT] Running: ConvReluFusion on inception_3b/3x3
[TRT] ConvReluFusion: Fusing inception_3b/3x3 with inception_3b/relu_3x3
[TRT] Running: ConvReluFusion on inception_3b/5x5_reduce
[TRT] ConvReluFusion: Fusing inception_3b/5x5_reduce with inception_3b/relu_5x5_reduce
[TRT] Running: ConvReluFusion on inception_3b/5x5
[TRT] ConvReluFusion: Fusing inception_3b/5x5 with inception_3b/relu_5x5
[TRT] Running: ConvReluFusion on inception_3b/pool_proj
[TRT] ConvReluFusion: Fusing inception_3b/pool_proj with inception_3b/relu_pool_proj
[TRT] Running: ConvReluFusion on inception_4a/1x1
[TRT] ConvReluFusion: Fusing inception_4a/1x1 with inception_4a/relu_1x1
[TRT] Running: ConvReluFusion on inception_4a/3x3_reduce
[TRT] ConvReluFusion: Fusing inception_4a/3x3_reduce with inception_4a/relu_3x3_reduce
[TRT] Running: ConvReluFusion on inception_4a/3x3
[TRT] ConvReluFusion: Fusing inception_4a/3x3 with inception_4a/relu_3x3
[TRT] Running: ConvReluFusion on inception_4a/5x5_reduce
[TRT] ConvReluFusion: Fusing inception_4a/5x5_reduce with inception_4a/relu_5x5_reduce
[TRT] Running: ConvReluFusion on inception_4a/5x5
[TRT] ConvReluFusion: Fusing inception_4a/5x5 with inception_4a/relu_5x5
[TRT] Running: ConvReluFusion on inception_4a/pool_proj
[TRT] ConvReluFusion: Fusing inception_4a/pool_proj with inception_4a/relu_pool_proj
[TRT] Running: ConvReluFusion on inception_4b/1x1
[TRT] ConvReluFusion: Fusing inception_4b/1x1 with inception_4b/relu_1x1
[TRT] Running: ConvReluFusion on inception_4b/3x3_reduce
[TRT] ConvReluFusion: Fusing inception_4b/3x3_reduce with inception_4b/relu_3x3_reduce
[TRT] Running: ConvReluFusion on inception_4b/3x3
[TRT] ConvReluFusion: Fusing inception_4b/3x3 with inception_4b/relu_3x3
[TRT] Running: ConvReluFusion on inception_4b/5x5_reduce
[TRT] ConvReluFusion: Fusing inception_4b/5x5_reduce with inception_4b/relu_5x5_reduce
[TRT] Running: ConvReluFusion on inception_4b/5x5
[TRT] ConvReluFusion: Fusing inception_4b/5x5 with inception_4b/relu_5x5
[TRT] Running: ConvReluFusion on inception_4b/pool_proj
[TRT] ConvReluFusion: Fusing inception_4b/pool_proj with inception_4b/relu_pool_proj
[TRT] Running: ConvReluFusion on inception_4c/1x1
[TRT] ConvReluFusion: Fusing inception_4c/1x1 with inception_4c/relu_1x1
[TRT] Running: ConvReluFusion on inception_4c/3x3_reduce
[TRT] ConvReluFusion: Fusing inception_4c/3x3_reduce with inception_4c/relu_3x3_reduce
[TRT] Running: ConvReluFusion on inception_4c/3x3
[TRT] ConvReluFusion: Fusing inception_4c/3x3 with inception_4c/relu_3x3
[TRT] Running: ConvReluFusion on inception_4c/5x5_reduce
[TRT] ConvReluFusion: Fusing inception_4c/5x5_reduce with inception_4c/relu_5x5_reduce
[TRT] Running: ConvReluFusion on inception_4c/5x5
[TRT] ConvReluFusion: Fusing inception_4c/5x5 with inception_4c/relu_5x5
[TRT] Running: ConvReluFusion on inception_4c/pool_proj
[TRT] ConvReluFusion: Fusing inception_4c/pool_proj with inception_4c/relu_pool_proj
[TRT] Running: ConvReluFusion on inception_4d/1x1
[TRT] ConvReluFusion: Fusing inception_4d/1x1 with inception_4d/relu_1x1
[TRT] Running: ConvReluFusion on inception_4d/3x3_reduce
[TRT] ConvReluFusion: Fusing inception_4d/3x3_reduce with inception_4d/relu_3x3_reduce
[TRT] Running: ConvReluFusion on inception_4d/3x3
[TRT] ConvReluFusion: Fusing inception_4d/3x3 with inception_4d/relu_3x3
[TRT] Running: ConvReluFusion on inception_4d/5x5_reduce
[TRT] ConvReluFusion: Fusing inception_4d/5x5_reduce with inception_4d/relu_5x5_reduce
[TRT] Running: ConvReluFusion on inception_4d/5x5
[TRT] ConvReluFusion: Fusing inception_4d/5x5 with inception_4d/relu_5x5
[TRT] Running: ConvReluFusion on inception_4d/pool_proj
[TRT] ConvReluFusion: Fusing inception_4d/pool_proj with inception_4d/relu_pool_proj
[TRT] Running: ConvReluFusion on inception_4e/1x1
[TRT] ConvReluFusion: Fusing inception_4e/1x1 with inception_4e/relu_1x1
[TRT] Running: ConvReluFusion on inception_4e/3x3_reduce
[TRT] ConvReluFusion: Fusing inception_4e/3x3_reduce with inception_4e/relu_3x3_reduce
[TRT] Running: ConvReluFusion on inception_4e/3x3
[TRT] ConvReluFusion: Fusing inception_4e/3x3 with inception_4e/relu_3x3
[TRT] Running: ConvReluFusion on inception_4e/5x5_reduce
[TRT] ConvReluFusion: Fusing inception_4e/5x5_reduce with inception_4e/relu_5x5_reduce
[TRT] Running: ConvReluFusion on inception_4e/5x5
[TRT] ConvReluFusion: Fusing inception_4e/5x5 with inception_4e/relu_5x5
[TRT] Running: ConvReluFusion on inception_4e/pool_proj
[TRT] ConvReluFusion: Fusing inception_4e/pool_proj with inception_4e/relu_pool_proj
[TRT] Running: ConvReluFusion on inception_5a/1x1
[TRT] ConvReluFusion: Fusing inception_5a/1x1 with inception_5a/relu_1x1
[TRT] Running: ConvReluFusion on inception_5a/3x3_reduce
[TRT] ConvReluFusion: Fusing inception_5a/3x3_reduce with inception_5a/relu_3x3_reduce
[TRT] Running: ConvReluFusion on inception_5a/3x3
[TRT] ConvReluFusion: Fusing inception_5a/3x3 with inception_5a/relu_3x3
[TRT] Running: ConvReluFusion on inception_5a/5x5_reduce
[TRT] ConvReluFusion: Fusing inception_5a/5x5_reduce with inception_5a/relu_5x5_reduce
[TRT] Running: ConvReluFusion on inception_5a/5x5
[TRT] ConvReluFusion: Fusing inception_5a/5x5 with inception_5a/relu_5x5
[TRT] Running: ConvReluFusion on inception_5a/pool_proj
[TRT] ConvReluFusion: Fusing inception_5a/pool_proj with inception_5a/relu_pool_proj
[TRT] Running: ConvReluFusion on inception_5b/1x1
[TRT] ConvReluFusion: Fusing inception_5b/1x1 with inception_5b/relu_1x1
[TRT] Running: ConvReluFusion on inception_5b/3x3_reduce
[TRT] ConvReluFusion: Fusing inception_5b/3x3_reduce with inception_5b/relu_3x3_reduce
[TRT] Running: ConvReluFusion on inception_5b/3x3
[TRT] ConvReluFusion: Fusing inception_5b/3x3 with inception_5b/relu_3x3
[TRT] Running: ConvReluFusion on inception_5b/5x5_reduce
[TRT] ConvReluFusion: Fusing inception_5b/5x5_reduce with inception_5b/relu_5x5_reduce
[TRT] Running: ConvReluFusion on inception_5b/5x5
[TRT] ConvReluFusion: Fusing inception_5b/5x5 with inception_5b/relu_5x5
[TRT] Running: ConvReluFusion on inception_5b/pool_proj
[TRT] ConvReluFusion: Fusing inception_5b/pool_proj with inception_5b/relu_pool_proj
[TRT] After dupe layer removal: 84 layers
[TRT] After final dead-layer removal: 84 layers
[TRT] After tensor merging: 84 layers
[TRT] After vertical fusions: 84 layers
[TRT] After dupe layer removal: 84 layers
[TRT] After final dead-layer removal: 84 layers
[TRT] Merging layers: inception_3a/1x1 + inception_3a/relu_1x1 || inception_3a/3x3_reduce + inception_3a/relu_3x3_reduce || inception_3a/5x5_reduce + inception_3a/relu_5x5_reduce
[TRT] Merging layers: inception_3b/1x1 + inception_3b/relu_1x1 || inception_3b/3x3_reduce + inception_3b/relu_3x3_reduce || inception_3b/5x5_reduce + inception_3b/relu_5x5_reduce
[TRT] Merging layers: inception_4a/1x1 + inception_4a/relu_1x1 || inception_4a/3x3_reduce + inception_4a/relu_3x3_reduce || inception_4a/5x5_reduce + inception_4a/relu_5x5_reduce
[TRT] Merging layers: inception_4b/1x1 + inception_4b/relu_1x1 || inception_4b/3x3_reduce + inception_4b/relu_3x3_reduce || inception_4b/5x5_reduce + inception_4b/relu_5x5_reduce
[TRT] Merging layers: inception_4c/1x1 + inception_4c/relu_1x1 || inception_4c/3x3_reduce + inception_4c/relu_3x3_reduce || inception_4c/5x5_reduce + inception_4c/relu_5x5_reduce
[TRT] Merging layers: inception_4d/5x5_reduce + inception_4d/relu_5x5_reduce || inception_4d/1x1 + inception_4d/relu_1x1 || inception_4d/3x3_reduce + inception_4d/relu_3x3_reduce
[TRT] Merging layers: inception_4e/1x1 + inception_4e/relu_1x1 || inception_4e/3x3_reduce + inception_4e/relu_3x3_reduce || inception_4e/5x5_reduce + inception_4e/relu_5x5_reduce
[TRT] Merging layers: inception_5a/1x1 + inception_5a/relu_1x1 || inception_5a/3x3_reduce + inception_5a/relu_3x3_reduce || inception_5a/5x5_reduce + inception_5a/relu_5x5_reduce
[TRT] Merging layers: inception_5b/1x1 + inception_5b/relu_1x1 || inception_5b/3x3_reduce + inception_5b/relu_3x3_reduce || inception_5b/5x5_reduce + inception_5b/relu_5x5_reduce
[TRT] After tensor merging: 66 layers
[TRT] After slice removal: 66 layers
[TRT] Eliminating concatenation inception_5b/output
[TRT] Generating copy for inception_5b/1x1 + inception_5b/relu_1x1 || inception_5b/3x3_reduce + inception_5b/relu_3x3_reduce || inception_5b/5x5_reduce + inception_5b/relu_5x5_reduce to inception_5b/output because input is not movable.
[TRT] Retargeting inception_5b/3x3 to inception_5b/output
[TRT] Retargeting inception_5b/5x5 to inception_5b/output
[TRT] Retargeting inception_5b/pool_proj to inception_5b/output
[TRT] Eliminating concatenation inception_5a/output
[TRT] Generating copy for inception_5a/1x1 + inception_5a/relu_1x1 || inception_5a/3x3_reduce + inception_5a/relu_3x3_reduce || inception_5a/5x5_reduce + inception_5a/relu_5x5_reduce to inception_5a/output because input is not movable.
[TRT] Retargeting inception_5a/3x3 to inception_5a/output
[TRT] Retargeting inception_5a/5x5 to inception_5a/output
[TRT] Retargeting inception_5a/pool_proj to inception_5a/output
[TRT] Eliminating concatenation inception_4e/output
[TRT] Generating copy for inception_4e/1x1 + inception_4e/relu_1x1 || inception_4e/3x3_reduce + inception_4e/relu_3x3_reduce || inception_4e/5x5_reduce + inception_4e/relu_5x5_reduce to inception_4e/output because input is not movable.
[TRT] Retargeting inception_4e/3x3 to inception_4e/output
[TRT] Retargeting inception_4e/5x5 to inception_4e/output
[TRT] Retargeting inception_4e/pool_proj to inception_4e/output
[TRT] Eliminating concatenation inception_4d/output
[TRT] Generating copy for inception_4d/5x5_reduce + inception_4d/relu_5x5_reduce || inception_4d/1x1 + inception_4d/relu_1x1 || inception_4d/3x3_reduce + inception_4d/relu_3x3_reduce to inception_4d/output because input is not movable.
[TRT] Retargeting inception_4d/3x3 to inception_4d/output
[TRT] Retargeting inception_4d/5x5 to inception_4d/output
[TRT] Retargeting inception_4d/pool_proj to inception_4d/output
[TRT] Eliminating concatenation inception_4c/output
[TRT] Generating copy for inception_4c/1x1 + inception_4c/relu_1x1 || inception_4c/3x3_reduce + inception_4c/relu_3x3_reduce || inception_4c/5x5_reduce + inception_4c/relu_5x5_reduce to inception_4c/output because input is not movable.
[TRT] Retargeting inception_4c/3x3 to inception_4c/output
[TRT] Retargeting inception_4c/5x5 to inception_4c/output
[TRT] Retargeting inception_4c/pool_proj to inception_4c/output
[TRT] Eliminating concatenation inception_4b/output
[TRT] Generating copy for inception_4b/1x1 + inception_4b/relu_1x1 || inception_4b/3x3_reduce + inception_4b/relu_3x3_reduce || inception_4b/5x5_reduce + inception_4b/relu_5x5_reduce to inception_4b/output because input is not movable.
[TRT] Retargeting inception_4b/3x3 to inception_4b/output
[TRT] Retargeting inception_4b/5x5 to inception_4b/output
[TRT] Retargeting inception_4b/pool_proj to inception_4b/output
[TRT] Eliminating concatenation inception_4a/output
[TRT] Generating copy for inception_4a/1x1 + inception_4a/relu_1x1 || inception_4a/3x3_reduce + inception_4a/relu_3x3_reduce || inception_4a/5x5_reduce + inception_4a/relu_5x5_reduce to inception_4a/output because input is not movable.
[TRT] Retargeting inception_4a/3x3 to inception_4a/output
[TRT] Retargeting inception_4a/5x5 to inception_4a/output
[TRT] Retargeting inception_4a/pool_proj to inception_4a/output
[TRT] Eliminating concatenation inception_3b/output
[TRT] Generating copy for inception_3b/1x1 + inception_3b/relu_1x1 || inception_3b/3x3_reduce + inception_3b/relu_3x3_reduce || inception_3b/5x5_reduce + inception_3b/relu_5x5_reduce to inception_3b/output because input is not movable.
[TRT] Retargeting inception_3b/3x3 to inception_3b/output
[TRT] Retargeting inception_3b/5x5 to inception_3b/output
[TRT] Retargeting inception_3b/pool_proj to inception_3b/output
[TRT] Eliminating concatenation inception_3a/output
[TRT] Generating copy for inception_3a/1x1 + inception_3a/relu_1x1 || inception_3a/3x3_reduce + inception_3a/relu_3x3_reduce || inception_3a/5x5_reduce + inception_3a/relu_5x5_reduce to inception_3a/output because input is not movable.
[TRT] Retargeting inception_3a/3x3 to inception_3a/output
[TRT] Retargeting inception_3a/5x5 to inception_3a/output
[TRT] Retargeting inception_3a/pool_proj to inception_3a/output
[TRT] After concat removal: 66 layers
[TRT] Trying to split Reshape and strided tensor
[TRT] Graph construction and optimization completed in 0.0584452 seconds.
[TRT] ---------- Layers Running on DLA ----------
[TRT] ---------- Layers Running on GPU ----------
[TRT] [GpuLayer] CONVOLUTION: conv1/7x7_s2 + conv1/relu_7x7
[TRT] [GpuLayer] POOLING: pool1/3x3_s2
[TRT] [GpuLayer] LRN: pool1/norm1
[TRT] [GpuLayer] CONVOLUTION: conv2/3x3_reduce + conv2/relu_3x3_reduce
[TRT] [GpuLayer] CONVOLUTION: conv2/3x3 + conv2/relu_3x3
[TRT] [GpuLayer] LRN: conv2/norm2
[TRT] [GpuLayer] POOLING: pool2/3x3_s2
[TRT] [GpuLayer] CONVOLUTION: inception_3a/1x1 + inception_3a/relu_1x1 || inception_3a/3x3_reduce + inception_3a/relu_3x3_reduce || inception_3a/5x5_reduce + inception_3a/relu_5x5_reduce
[TRT] [GpuLayer] CONVOLUTION: inception_3a/3x3 + inception_3a/relu_3x3
[TRT] [GpuLayer] CONVOLUTION: inception_3a/5x5 + inception_3a/relu_5x5
[TRT] [GpuLayer] POOLING: inception_3a/pool
[TRT] [GpuLayer] CONVOLUTION: inception_3a/pool_proj + inception_3a/relu_pool_proj
[TRT] [GpuLayer] COPY: inception_3a/1x1 copy
[TRT] [GpuLayer] CONVOLUTION: inception_3b/1x1 + inception_3b/relu_1x1 || inception_3b/3x3_reduce + inception_3b/relu_3x3_reduce || inception_3b/5x5_reduce + inception_3b/relu_5x5_reduce
[TRT] [GpuLayer] CONVOLUTION: inception_3b/3x3 + inception_3b/relu_3x3
[TRT] [GpuLayer] CONVOLUTION: inception_3b/5x5 + inception_3b/relu_5x5
[TRT] [GpuLayer] POOLING: inception_3b/pool
[TRT] [GpuLayer] CONVOLUTION: inception_3b/pool_proj + inception_3b/relu_pool_proj
[TRT] [GpuLayer] COPY: inception_3b/1x1 copy
[TRT] [GpuLayer] POOLING: pool3/3x3_s2
[TRT] [GpuLayer] CONVOLUTION: inception_4a/1x1 + inception_4a/relu_1x1 || inception_4a/3x3_reduce + inception_4a/relu_3x3_reduce || inception_4a/5x5_reduce + inception_4a/relu_5x5_reduce
[TRT] [GpuLayer] CONVOLUTION: inception_4a/3x3 + inception_4a/relu_3x3
[TRT] [GpuLayer] CONVOLUTION: inception_4a/5x5 + inception_4a/relu_5x5
[TRT] [GpuLayer] POOLING: inception_4a/pool
[TRT] [GpuLayer] CONVOLUTION: inception_4a/pool_proj + inception_4a/relu_pool_proj
[TRT] [GpuLayer] COPY: inception_4a/1x1 copy
[TRT] [GpuLayer] CONVOLUTION: inception_4b/1x1 + inception_4b/relu_1x1 || inception_4b/3x3_reduce + inception_4b/relu_3x3_reduce || inception_4b/5x5_reduce + inception_4b/relu_5x5_reduce
[TRT] [GpuLayer] CONVOLUTION: inception_4b/3x3 + inception_4b/relu_3x3
[TRT] [GpuLayer] CONVOLUTION: inception_4b/5x5 + inception_4b/relu_5x5
[TRT] [GpuLayer] POOLING: inception_4b/pool
[TRT] [GpuLayer] CONVOLUTION: inception_4b/pool_proj + inception_4b/relu_pool_proj
[TRT] [GpuLayer] COPY: inception_4b/1x1 copy
[TRT] [GpuLayer] CONVOLUTION: inception_4c/1x1 + inception_4c/relu_1x1 || inception_4c/3x3_reduce + inception_4c/relu_3x3_reduce || inception_4c/5x5_reduce + inception_4c/relu_5x5_reduce
[TRT] [GpuLayer] CONVOLUTION: inception_4c/3x3 + inception_4c/relu_3x3
[TRT] [GpuLayer] CONVOLUTION: inception_4c/5x5 + inception_4c/relu_5x5
[TRT] [GpuLayer] POOLING: inception_4c/pool
[TRT] [GpuLayer] CONVOLUTION: inception_4c/pool_proj + inception_4c/relu_pool_proj
[TRT] [GpuLayer] COPY: inception_4c/1x1 copy
[TRT] [GpuLayer] CONVOLUTION: inception_4d/5x5_reduce + inception_4d/relu_5x5_reduce || inception_4d/1x1 + inception_4d/relu_1x1 || inception_4d/3x3_reduce + inception_4d/relu_3x3_reduce
[TRT] [GpuLayer] CONVOLUTION: inception_4d/3x3 + inception_4d/relu_3x3
[TRT] [GpuLayer] CONVOLUTION: inception_4d/5x5 + inception_4d/relu_5x5
[TRT] [GpuLayer] POOLING: inception_4d/pool
[TRT] [GpuLayer] CONVOLUTION: inception_4d/pool_proj + inception_4d/relu_pool_proj
[TRT] [GpuLayer] COPY: inception_4d/1x1 copy
[TRT] [GpuLayer] CONVOLUTION: inception_4e/1x1 + inception_4e/relu_1x1 || inception_4e/3x3_reduce + inception_4e/relu_3x3_reduce || inception_4e/5x5_reduce + inception_4e/relu_5x5_reduce
[TRT] [GpuLayer] CONVOLUTION: inception_4e/3x3 + inception_4e/relu_3x3
[TRT] [GpuLayer] CONVOLUTION: inception_4e/5x5 + inception_4e/relu_5x5
[TRT] [GpuLayer] POOLING: inception_4e/pool
[TRT] [GpuLayer] CONVOLUTION: inception_4e/pool_proj + inception_4e/relu_pool_proj
[TRT] [GpuLayer] COPY: inception_4e/1x1 copy
[TRT] [GpuLayer] POOLING: pool4/3x3_s2
[TRT] [GpuLayer] CONVOLUTION: inception_5a/1x1 + inception_5a/relu_1x1 || inception_5a/3x3_reduce + inception_5a/relu_3x3_reduce || inception_5a/5x5_reduce + inception_5a/relu_5x5_reduce
[TRT] [GpuLayer] CONVOLUTION: inception_5a/3x3 + inception_5a/relu_3x3
[TRT] [GpuLayer] CONVOLUTION: inception_5a/5x5 + inception_5a/relu_5x5
[TRT] [GpuLayer] POOLING: inception_5a/pool
[TRT] [GpuLayer] CONVOLUTION: inception_5a/pool_proj + inception_5a/relu_pool_proj
[TRT] [GpuLayer] COPY: inception_5a/1x1 copy
[TRT] [GpuLayer] CONVOLUTION: inception_5b/1x1 + inception_5b/relu_1x1 || inception_5b/3x3_reduce + inception_5b/relu_3x3_reduce || inception_5b/5x5_reduce + inception_5b/relu_5x5_reduce
[TRT] [GpuLayer] CONVOLUTION: inception_5b/3x3 + inception_5b/relu_3x3
[TRT] [GpuLayer] CONVOLUTION: inception_5b/5x5 + inception_5b/relu_5x5
[TRT] [GpuLayer] POOLING: inception_5b/pool
[TRT] [GpuLayer] CONVOLUTION: inception_5b/pool_proj + inception_5b/relu_pool_proj
[TRT] [GpuLayer] COPY: inception_5b/1x1 copy
[TRT] [GpuLayer] POOLING: pool5/7x7_s1
[TRT] [GpuLayer] CONVOLUTION: loss3/classifier
[TRT] [GpuLayer] SOFTMAX: prob
[TRT] Trying to load shared library libcublas.so.11
[TRT] Loaded shared library libcublas.so.11
[TRT] Using cublas as plugin tactic source
[TRT] Trying to load shared library libcublasLt.so.11
[TRT] Loaded shared library libcublasLt.so.11
[TRT] Using cublasLt as core library tactic source
[TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +534, GPU +625, now: CPU 1187, GPU 3952 (MiB)
[TRT] Trying to load shared library libcudnn.so.8
[TRT] Loaded shared library libcudnn.so.8
[TRT] Using cuDNN as plugin tactic source
[TRT] Using cuDNN as core library tactic source
[TRT] [MemUsageChange] Init cuDNN: CPU +82, GPU +126, now: CPU 1269, GPU 4078 (MiB)
[TRT] Global timing cache in use. Profiling results in this builder pass will be stored.
[TRT] Constructing optimization profile number 0 [1/1].
[TRT] Reserving memory for host IO tensors. Host: 0 bytes
[TRT] =============== Computing reformatting costs:
[TRT] *************** Autotuning Reformat: Float(150528,50176,224,1) -> Float(150528,1,672,3) ***************
[TRT] --------------- Timing Runner: Optimizer Reformat(data -> <out>) (Reformat)
[TRT] Tactic: 0x00000000000003e8 Time: 0.0388984
[TRT] Tactic: 0x00000000000003ea Time: 0.0660305
[TRT] Tactic: 0x0000000000000000 Time: 0.0387326
[TRT] Fastest Tactic: 0x0000000000000000 Time: 0.0387326
[TRT] *************** Autotuning Reformat: Float(150528,50176,224,1) -> Float(50176,1:4,224,1) ***************
[TRT] --------------- Timing Runner: Optimizer Reformat(data -> <out>) (Reformat)
[TRT] Tactic: 0x00000000000003e8 Time: 0.0749105
[TRT] Tactic: 0x00000000000003ea Time: 0.0624887
[TRT] Tactic: 0x0000000000000000 Time: 0.0751331
[TRT] Fastest Tactic: 0x00000000000003ea Time: 0.0624887
[TRT] *************** Autotuning Reformat: Float(150528,50176,224,1) -> Half(150528,50176,224,1) ***************
[TRT] --------------- Timing Runner: Optimizer Reformat(data -> <out>) (Reformat)
[TRT] Tactic: 0x00000000000003e8 Time: 0.0132488
[TRT] Tactic: 0x00000000000003ea Time: 0.0373537
[TRT] Tactic: 0x0000000000000000 Time: 0.0486836
[TRT] Fastest Tactic: 0x00000000000003e8 Time: 0.0132488
[TRT] *************** Autotuning Reformat: Float(150528,50176,224,1) -> Half(100352,50176:2,224,1) ***************
[TRT] --------------- Timing Runner: Optimizer Reformat(data -> <out>) (Reformat)
[TRT] Tactic: 0x00000000000003e8 Time: 0.0521847
[TRT] Tactic: 0x00000000000003ea Time: 0.070464
[TRT] Tactic: 0x0000000000000000 Time: 0.0304039
[TRT] Fastest Tactic: 0x0000000000000000 Time: 0.0304039
[TRT] *************** Autotuning Reformat: Float(150528,50176,224,1) -> Half(50176,1:4,224,1) ***************
[TRT] --------------- Timing Runner: Optimizer Reformat(data -> <out>) (Reformat)
[TRT] Tactic: 0x00000000000003e8 Time: 0.0520742
[TRT] Tactic: 0x00000000000003ea Time: 0.0339927
[TRT] Tactic: 0x0000000000000000 Time: 0.0518298
[TRT] Fastest Tactic: 0x00000000000003ea Time: 0.0339927
[TRT] *************** Autotuning Reformat: Float(150528,50176,224,1) -> Half(50176,1:8,224,1) ***************
[TRT] --------------- Timing Runner: Optimizer Reformat(data -> <out>) (Reformat)
[TRT] Tactic: 0x00000000000003e8 Time: 0.0887927
[TRT] Tactic: 0x00000000000003ea Time: 0.0614298
[TRT] Tactic: 0x0000000000000000 Time: 0.0405828
[TRT] Fastest Tactic: 0x0000000000000000 Time: 0.0405828
[TRT] =============== Computing reformatting costs:
[TRT] *************** Autotuning Reformat: Float(802816,12544,112,1) -> Float(200704,1:4,1792,16) ***************
[TRT] --------------- Timing Runner: Optimizer Reformat(conv1/7x7_s2 -> <out>) (Reformat)
[TRT] Tactic: 0x00000000000003e8 Time: 0.719087
[TRT] Tactic: 0x00000000000003ea Time: 0.133603
[TRT] Tactic: 0x0000000000000000 Time: 0.592122
[TRT] Fastest Tactic: 0x00000000000003ea Time: 0.133603
[TRT] *************** Autotuning Reformat: Float(802816,12544,112,1) -> Half(802816,12544,112,1) ***************
[TRT] --------------- Timing Runner: Optimizer Reformat(conv1/7x7_s2 -> <out>) (Reformat)
[TRT] Tactic: 0x00000000000003e8 Time: 0.0912291
[TRT] Tactic: 0x00000000000003ea Time: 0.164381
[TRT] Tactic: 0x0000000000000000 Time: 0.259113
[TRT] Fastest Tactic: 0x00000000000003e8 Time: 0.0912291
[TRT] *************** Autotuning Reformat: Float(802816,12544,112,1) -> Half(401408,12544:2,112,1) ***************
[TRT] --------------- Timing Runner: Optimizer Reformat(conv1/7x7_s2 -> <out>) (Reformat)
[TRT] Tactic: 0x00000000000003e8 Time: 0.317257
[TRT] Tactic: 0x00000000000003ea Time: 0.192198
[TRT] Tactic: 0x0000000000000000 Time: 0.139887
[TRT] Fastest Tactic: 0x0000000000000000 Time: 0.139887
[TRT] *************** Autotuning Reformat: Float(802816,12544,112,1) -> Half(100352,1:8,896,8) ***************
[TRT] --------------- Timing Runner: Optimizer Reformat(conv1/7x7_s2 -> <out>) (Reformat)
[TRT] Tactic: 0x00000000000003e8 Time: 0.438359
[TRT] Tactic: 0x00000000000003ea Time: 0.122243
[TRT] Tactic: 0x0000000000000000 Time: 0.14494
[TRT] Fastest Tactic: 0x00000000000003ea Time: 0.122243
[TRT] *************** Autotuning Reformat: Float(802816,1,7168,64) -> Float(802816,12544,112,1) ***************
[TRT] --------------- Timing Runner: Optimizer Reformat(conv1/7x7_s2 -> <out>) (Reformat)
[TRT] Tactic: 0x00000000000003e8 Time: 0.504669
[TRT] Tactic: 0x00000000000003ea Time: 0.153452
[TRT] Tactic: 0x0000000000000000 Time: 0.380396
[TRT] Fastest Tactic: 0x00000000000003ea Time: 0.153452
[TRT] *************** Autotuning Reformat: Float(802816,1,7168,64) -> Float(200704,1:4,1792,16) ***************
[TRT] --------------- Timing Runner: Optimizer Reformat(conv1/7x7_s2 -> <out>) (Reformat)
[TRT] Tactic: 0x00000000000003e8 Time: 0.212239
[TRT] Tactic: 0x00000000000003ea Time: 0.135258
[TRT] Tactic: 0x0000000000000000 Time: 0.375668
[TRT] Fastest Tactic: 0x00000000000003ea Time: 0.135258
[TRT] *************** Autotuning Reformat: Float(802816,1,7168,64) -> Half(802816,12544,112,1) ***************
[TRT] --------------- Timing Runner: Optimizer Reformat(conv1/7x7_s2 -> <out>) (Reformat)
[TRT] Tactic: 0x00000000000003e8 Time: 0.519762
[TRT] Tactic: 0x00000000000003ea Time: 0.140538
[TRT] Tactic: 0x0000000000000000 Time: 0.37769
[TRT] Fastest Tactic: 0x00000000000003ea Time: 0.140538
...
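
For reference, here is a minimal sketch of the equivalent jetson-inference Python API call, which should build and run the same GoogleNet TensorRT engine. It is based on the Hello AI World examples and uses the same image path as the command above; I have not verified it separately on this setup:

import jetson_inference
import jetson_utils

# load the GoogleNet classifier (same caffemodel the C++ imagenet tool builds its engine from)
net = jetson_inference.imageNet("googlenet")

# load the test image into GPU memory
img = jetson_utils.loadImage("data/images/orange_0.jpg")

# run inference and print the top class
class_idx, confidence = net.Classify(img)
print("class #{:d} ('{:s}') with {:.2f}% confidence".format(
    class_idx, net.GetClassDesc(class_idx), confidence * 100))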