Hi, edit the nvinfer1::ICudaEngine *Yolo::createEngine (nvinfer1::IBuilder* builder)
function (lines 61-90) in the yolo.cpp file to:
nvinfer1::ICudaEngine *Yolo::createEngine (nvinfer1::IBuilder* builder)
{
    assert (builder);

    if (m_DeviceType == "kDLA") {
        builder->setDefaultDeviceType(nvinfer1::DeviceType::kDLA);
    }

    std::vector<float> weights = loadWeights(m_WtsFilePath, m_NetworkType);
    std::vector<nvinfer1::Weights> trtWeights;

    nvinfer1::INetworkDefinition *network = builder->createNetwork();
    if (parseModel(*network) != NVDSINFER_SUCCESS) {
        network->destroy();
        return nullptr;
    }

    // Build the engine
    std::cout << "Building the TensorRT Engine" << std::endl;
    nvinfer1::ICudaEngine *engine = builder->buildCudaEngine(*network);
    if (engine) {
        std::cout << "Building complete\n" << std::endl;
    } else {
        std::cerr << "Building engine failed\n" << std::endl;
    }

    // destroy
    network->destroy();
    return engine;
}
and add these lines to the config_infer_primary.txt file (in the [property] section):
enable-dla=1
use-dla-core=0
Note: adjust these lines as needed:
enable-dla: Indicates whether to use the DLA engine for inferencing. Boolean: 0 or 1
use-dla-core: DLA core to be used. Integer: ≥0
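For reference, on TensorRT releases that still use this IBuilder-based API, the DLA core selection, GPU fallback and precision can also be set directly on the builder. A minimal sketch only, assuming the pre-IBuilderConfig API; the setDLACore(0) value and the FP16 flag are assumptions, check your TensorRT version for the exact calls:
if (m_DeviceType == "kDLA") {
    builder->setDefaultDeviceType(nvinfer1::DeviceType::kDLA);
    builder->setDLACore(0);           // assumption: use DLA core 0 (should match use-dla-core)
    builder->allowGPUFallback(true);  // let layers the DLA can't run fall back to the GPU
    builder->setFp16Mode(true);       // DLA requires FP16 or INT8 precision
}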
I don't have a Xavier board to test on. Please tell me if it works.
It can run on Xavier, but it seems some layers fail to convert to DLA.
I used tegrastats and jtop to observe the effect. The result is the same as the original build.
Building YOLO network complete
Building the TensorRT Engine
ERROR: [TRT]: leaky_1: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_1 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_2: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_2 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_3: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_3 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_3: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_3 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer (Unnamed Layer* 10) [Slice] is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_5: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_5 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_6: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_6 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_8: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_8 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_11: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_11 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_11: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_11 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer (Unnamed Layer* 27) [Slice] is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_13: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_13 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_14: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_14 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_16: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_16 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_19: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_19 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_19: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_19 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer (Unnamed Layer* 44) [Slice] is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_21: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_21 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_22: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_22 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_24: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_24 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_27: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_27 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_28: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_28 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_29: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_29 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer yolo_31 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_31: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_31 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_33: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_33 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer preMul_33 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer postMul_33 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer mm1_33 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer mm2_33 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_36: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_36 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer yolo_38 is not supported on DLA, falling back to GPU.
INFO: [TRT]: mm1_33: broadcasting input0 to make tensors conform, dims(input0)=[1,26,13][NONE] dims(input1)=[128,13,13][NONE].
INFO: [TRT]: mm2_33: broadcasting input1 to make tensors conform, dims(input0)=[128,26,13][NONE] dims(input1)=[1,13,26][NONE].
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_6
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_8
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_13
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_14
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_16
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_21
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_22
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_24
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_28
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_36
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_30
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_37
INFO: [TRT]:
INFO: [TRT]: --------------- Layers running on DLA:
INFO: [TRT]: {conv_1,batch_norm_1}, {conv_2,batch_norm_2}, {conv_3,batch_norm_3}, {conv_5,batch_norm_5}, {maxpool_10,conv_11,batch_norm_11}, {maxpool_18,conv_19,batch_norm_19}, {maxpool_26,conv_27,batch_norm_27}, {conv_29,batch_norm_29,conv_33,batch_norm_33},
INFO: [TRT]: --------------- Layers running on GPU:
INFO: [TRT]: preMul_33, postMul_33, leaky_1, leaky_2, leaky_3, (Unnamed Layer* 10) [Slice], leaky_5, conv_6, leaky_6, conv_8, leaky_8, (Unnamed Layer* 8) [Activation]_output copy, leaky_11, (Unnamed Layer* 27) [Slice], conv_13, leaky_13, conv_14, leaky_14, conv_16, leaky_16, (Unnamed Layer* 25) [Activation]_output copy, leaky_19, (Unnamed Layer* 44) [Slice], conv_21, leaky_21, conv_22, leaky_22, conv_24, leaky_24, (Unnamed Layer* 42) [Activation]_output copy, leaky_27, conv_28, leaky_28, leaky_29, leaky_33, mm1_33, mm2_33, conv_30, (Unnamed Layer* 75) [Matrix Multiply]_output copy, (Unnamed Layer* 54) [Activation]_output copy, conv_36, yolo_31, leaky_36, conv_37, yolo_38,
INFO: [TRT]: Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
INFO: [TRT]: Detected 1 inputs and 2 output network tensors.
Building complete
0:02:48.599896399 13265 0x7f20002300 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:
1 OUTPUT kFLOAT yolo_31 24x13x13
2 OUTPUT kFLOAT yolo_38 24x26x26
0:02:48.925054321 13265 0x7f20002300 INFO nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:
Some layers can't run on the DLA core. It's a TensorRT limitation; I think it will be supported in future releases of the TensorRT/DeepStream SDK.
What could be the reason for the following error?
I'm trying to run on a Xavier NX.
ERROR: [TRT]: ../rtExt/dla/native/dlaExecuteRunner.cpp (135) - Assertion Error in updateContextResources: 0 (execParams.dlaCore >= 0 && execParams.dlaCore < core->numEngines())
Building engine failed
Failed to build CUDA engine on yolov4-tiny-3l-1024.cfg
ERROR: Failed to create network using custom network creation function
ERROR: Failed to get cuda engine from custom library API
0:00:06.471895338 18672 0x55c26e8f30 ERROR nvinfer gstnvinfer.cpp:614:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1735> [UID = 1]: build engine file failed
0:00:06.472071755 18672 0x55c26e8f30 ERROR nvinfer gstnvinfer.cpp:614:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1821> [UID = 1]: build backend context failed
0:00:06.472165676 18672 0x55c26e8f30 ERROR nvinfer gstnvinfer.cpp:614:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1148> [UID = 1]: generate backend failed, check config file settings
0:00:06.472758673 18672 0x55c26e8f30 WARN nvinfer gstnvinfer.cpp:810:gst_nvinfer_start:<primary_gie> error: Failed to create NvDsInferContext instance
0:00:06.472804529 18672 0x55c26e8f30 WARN nvinfer gstnvinfer.cpp:810:gst_nvinfer_start:<primary_gie> error: Config file path: /opt/nvidia/deepstream/deepstream-5.0/sources/yolo/config_infer_primary.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
** ERROR: <main:655>: Failed to set pipeline to PAUSED
Quitting
ERROR from primary_gie: Failed to create NvDsInferContext instance
Debug info: gstnvinfer.cpp(810): gst_nvinfer_start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInfer:primary_gie:
Config file path: /opt/nvidia/deepstream/deepstream-5.0/sources/yolo/config_infer_primary.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
App run failed
Did you follow these steps? You need to recompile nvdsinfer_custom_impl_Yolo too.
@marcoslucianops yes! It seems that with batch-size=1 I am able to use the DLA, but with batch-size >= 2 it doesn't work.
INFO: [TRT]: Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
INFO: [TRT]: Detected 1 inputs and 3 output network tensors.
ERROR: [TRT]: ../builder/cudnnBuilder2.cpp (1757) - Assertion Error in operator(): 0 (et.region->getType() == RegionType::kNVM)
Building engine failed
Failed to build CUDA engine on drone_tiny_3l_1024_test.cfg
ERROR: Failed to create network using custom network creation function
ERROR: Failed to get cuda engine from custom library API
0:03:12.310730296 10901 0x31115e30 ERROR nvinfer gstnvinfer.cpp:614:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1735> [UID = 1]: build engine file failed
0:03:12.310807833 10901 0x31115e30 ERROR nvinfer gstnvinfer.cpp:614:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1821> [UID = 1]: build backend context failed
0:03:12.310911066 10901 0x31115e30 ERROR nvinfer gstnvinfer.cpp:614:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1148> [UID = 1]: generate backend failed, check config file settings
0:03:12.311629282 10901 0x31115e30 WARN nvinfer gstnvinfer.cpp:810:gst_nvinfer_start:<primary_gie> error: Failed to create NvDsInferContext instance
0:03:12.311680930 10901 0x31115e30 WARN nvinfer gstnvinfer.cpp:810:gst_nvinfer_start:<primary_gie> error: Config file path: /opt/nvidia/deepstream/deepstream-5.0/sources/yolo_60fps_awesome_tracking/config_infer_primary.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
Is this fix updated? Because the function prototype
nvinfer1::ICudaEngine *Yolo::createEngine (nvinfer1::IBuilder* builder)
doesn't match
nvinfer1::ICudaEngine *Yolo::createEngine (nvinfer1::IBuilder* builder, nvinfer1::IBuilderConfig* config)
I don't know if it will work on the DLA because I don't have a board to test on. Can you check, please?
I'm checking, but it won't compile because the method in the new version has a different prototype, so maybe the code has been updated and the fix you provided a while ago won't work anymore?
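For what it's worth, here is a minimal sketch of how the same DLA edit might look against the newer createEngine(builder, config) prototype, assuming the IBuilderConfig-based TensorRT API. The DLA core index, the FP16 flag and the implicit-batch createNetworkV2(0) call are assumptions, not the repo's actual code:
nvinfer1::ICudaEngine *Yolo::createEngine (nvinfer1::IBuilder* builder, nvinfer1::IBuilderConfig* config)
{
    assert(builder && config);

    if (m_DeviceType == "kDLA") {
        // DLA settings moved from IBuilder to IBuilderConfig in newer TensorRT releases
        config->setDefaultDeviceType(nvinfer1::DeviceType::kDLA);
        config->setDLACore(0);                                  // assumption: DLA core 0
        config->setFlag(nvinfer1::BuilderFlag::kGPU_FALLBACK);  // unsupported layers fall back to the GPU
        config->setFlag(nvinfer1::BuilderFlag::kFP16);          // DLA requires FP16 or INT8
    }

    std::vector<float> weights = loadWeights(m_WtsFilePath, m_NetworkType);
    std::vector<nvinfer1::Weights> trtWeights;

    nvinfer1::INetworkDefinition *network = builder->createNetworkV2(0);
    if (parseModel(*network) != NVDSINFER_SUCCESS) {
        network->destroy();
        return nullptr;
    }

    // Build the engine from the network definition plus the builder config
    std::cout << "Building the TensorRT Engine" << std::endl;
    nvinfer1::ICudaEngine *engine = builder->buildEngineWithConfig(*network, *config);
    if (engine) {
        std::cout << "Building complete\n" << std::endl;
    } else {
        std::cerr << "Building engine failed\n" << std::endl;
    }

    network->destroy();
    return engine;
}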
Can you test adding only these lines to the config_infer_primary.txt file (in the [property] section):
enable-dla=1
use-dla-core=0
It seems to ignore them; in fact, it uses model_b2_gpu0_fp16.engine.
Deserialize yoloLayer plugin: yolo_93
Deserialize yoloLayer plugin: yolo_96
Deserialize yoloLayer plugin: yolo_99
Running Healthcheck
0:00:07.012733708 22422 0x7f8c82a550 INFO nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:
1 OUTPUT kFLOAT yolo_93 255x80x80
2 OUTPUT kFLOAT yolo_96 255x40x40
3 OUTPUT kFLOAT yolo_99 255x20x20
Can you send the output when the model is building?
In fact, the engine fails to build. I deleted the .engine file and, with batch-size=2, got the following output:
ERROR: Deserialize engine failed because file path: /home/aaeon/sdcard/fogsphere-edge/fogsphere-engine-py/model_b2_gpu0_fp16.engine open error
0:00:02.280438983 7734 0x7f7882a350 WARN nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger:
Loading pre-trained weights
Running Healthcheck
Loading weights of /home/aaeon/sdcard/fogsphere-edge/fogsphere-engine-py/fogsphere/deepstream/models/yolov5s6/yolov5s complete
Total weights read: 7254397
Building YOLO network
layer input output weightPtr
(0) conv_silu 3 x 640 x 640 32 x 320 x 320 3584
(1) conv_silu 32 x 320 x 320 64 x 160 x 160 22272
(2) conv_silu 64 x 160 x 160 32 x 160 x 160 24448
(3) route - 64 x 160 x 160 24448
(4) conv_silu 64 x 160 x 160 32 x 160 x 160 26624
(5) conv_silu 32 x 160 x 160 32 x 160 x 160 27776
(6) conv_silu 32 x 160 x 160 32 x 160 x 160 37120
(7) shortcut_linear: 4 - 32 x 160 x 160 -
(8) route - 64 x 160 x 160 37120
(9) conv_silu 64 x 160 x 160 64 x 160 x 160 41472
(10) conv_silu 64 x 160 x 160 128 x 80 x 80 115712
(11) conv_silu 128 x 80 x 80 64 x 80 x 80 124160
(12) route - 128 x 80 x 80 124160
(13) conv_silu 128 x 80 x 80 64 x 80 x 80 132608
(14) conv_silu 64 x 80 x 80 64 x 80 x 80 136960
(15) conv_silu 64 x 80 x 80 64 x 80 x 80 174080
(16) shortcut_linear: 13 - 64 x 80 x 80 -
(17) conv_silu 64 x 80 x 80 64 x 80 x 80 178432
(18) conv_silu 64 x 80 x 80 64 x 80 x 80 215552
(19) shortcut_linear: 16 - 64 x 80 x 80 -
(20) route - 128 x 80 x 80 215552
(21) conv_silu 128 x 80 x 80 128 x 80 x 80 232448
(22) conv_silu 128 x 80 x 80 256 x 40 x 40 528384
(23) conv_silu 256 x 40 x 40 128 x 40 x 40 561664
(24) route - 256 x 40 x 40 561664
(25) conv_silu 256 x 40 x 40 128 x 40 x 40 594944
(26) conv_silu 128 x 40 x 40 128 x 40 x 40 611840
(27) conv_silu 128 x 40 x 40 128 x 40 x 40 759808
(28) shortcut_linear: 25 - 128 x 40 x 40 -
(29) conv_silu 128 x 40 x 40 128 x 40 x 40 776704
(30) conv_silu 128 x 40 x 40 128 x 40 x 40 924672
(31) shortcut_linear: 28 - 128 x 40 x 40 -
(32) conv_silu 128 x 40 x 40 128 x 40 x 40 941568
(33) conv_silu 128 x 40 x 40 128 x 40 x 40 1089536
(34) shortcut_linear: 31 - 128 x 40 x 40 -
(35) route - 256 x 40 x 40 1089536
(36) conv_silu 256 x 40 x 40 256 x 40 x 40 1156096
(37) conv_silu 256 x 40 x 40 512 x 20 x 20 2337792
(38) conv_silu 512 x 20 x 20 256 x 20 x 20 2469888
(39) route - 512 x 20 x 20 2469888
(40) conv_silu 512 x 20 x 20 256 x 20 x 20 2601984
(41) conv_silu 256 x 20 x 20 256 x 20 x 20 2668544
(42) conv_silu 256 x 20 x 20 256 x 20 x 20 3259392
(43) shortcut_linear: 40 - 256 x 20 x 20 -
(44) route - 512 x 20 x 20 3259392
(45) conv_silu 512 x 20 x 20 512 x 20 x 20 3523584
(46) conv_silu 512 x 20 x 20 256 x 20 x 20 3655680
(47) maxpool 256 x 20 x 20 256 x 20 x 20 3655680
(48) maxpool 256 x 20 x 20 256 x 20 x 20 3655680
(49) maxpool 256 x 20 x 20 256 x 20 x 20 3655680
(50) route - 1024 x 20 x 20 3655680
(51) conv_silu 1024 x 20 x 20 512 x 20 x 20 4182016
(52) conv_silu 512 x 20 x 20 256 x 20 x 20 4314112
(53) upsample 256 x 20 x 20 256 x 40 x 40 -
(54) route - 512 x 40 x 40 4314112
(55) conv_silu 512 x 40 x 40 128 x 40 x 40 4380160
(56) route - 512 x 40 x 40 4380160
(57) conv_silu 512 x 40 x 40 128 x 40 x 40 4446208
(58) conv_silu 128 x 40 x 40 128 x 40 x 40 4463104
(59) conv_silu 128 x 40 x 40 128 x 40 x 40 4611072
(60) route - 256 x 40 x 40 4611072
(61) conv_silu 256 x 40 x 40 256 x 40 x 40 4677632
(62) conv_silu 256 x 40 x 40 128 x 40 x 40 4710912
(63) upsample 128 x 40 x 40 128 x 80 x 80 -
(64) route - 256 x 80 x 80 4710912
(65) conv_silu 256 x 80 x 80 64 x 80 x 80 4727552
(66) route - 256 x 80 x 80 4727552
(67) conv_silu 256 x 80 x 80 64 x 80 x 80 4744192
(68) conv_silu 64 x 80 x 80 64 x 80 x 80 4748544
(69) conv_silu 64 x 80 x 80 64 x 80 x 80 4785664
(70) route - 128 x 80 x 80 4785664
(71) conv_silu 128 x 80 x 80 128 x 80 x 80 4802560
(72) conv_silu 128 x 80 x 80 128 x 40 x 40 4950528
(73) route - 256 x 40 x 40 4950528
(74) conv_silu 256 x 40 x 40 128 x 40 x 40 4983808
(75) route - 256 x 40 x 40 4983808
(76) conv_silu 256 x 40 x 40 128 x 40 x 40 5017088
(77) conv_silu 128 x 40 x 40 128 x 40 x 40 5033984
(78) conv_silu 128 x 40 x 40 128 x 40 x 40 5181952
(79) route - 256 x 40 x 40 5181952
(80) conv_silu 256 x 40 x 40 256 x 40 x 40 5248512
(81) conv_silu 256 x 40 x 40 256 x 20 x 20 5839360
(82) route - 512 x 20 x 20 5839360
(83) conv_silu 512 x 20 x 20 256 x 20 x 20 5971456
(84) route - 512 x 20 x 20 5971456
(85) conv_silu 512 x 20 x 20 256 x 20 x 20 6103552
(86) conv_silu 256 x 20 x 20 256 x 20 x 20 6170112
(87) conv_silu 256 x 20 x 20 256 x 20 x 20 6760960
(88) route - 512 x 20 x 20 6760960
(89) conv_silu 512 x 20 x 20 512 x 20 x 20 7025152
(90) route - 128 x 80 x 80 7025152
(91) conv_logistic 128 x 80 x 80 255 x 80 x 80 7058047
(92) yolo 255 x 80 x 80 255 x 80 x 80 7058047
(93) route - 256 x 40 x 40 7058047
(94) conv_logistic 256 x 40 x 40 255 x 40 x 40 7123582
(95) yolo 255 x 40 x 40 255 x 40 x 40 7123582
(96) route - 512 x 20 x 20 7123582
(97) conv_logistic 512 x 20 x 20 255 x 20 x 20 7254397
(98) yolo 255 x 20 x 20 255 x 20 x 20 7254397
Output YOLO blob names:
yolo_93
yolo_96
yolo_99
Total number of YOLO layers: 273
Building YOLO network complete
Building the TensorRT Engine
WARNING: [TRT]: route_3: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_3 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_12: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_12 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_24: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_24 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_39: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_39 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer upsample_53 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_56: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_56 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer upsample_63 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_66: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_66 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_75: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_75 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_84: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_84 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_90: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_90 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer yolo_93 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_93: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_93 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer yolo_96 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_96: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_96 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer yolo_99 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_1
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer route_54
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer route_64
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_98
WARNING: [TRT]: Detected invalid timing cache, setup a local cache instead
ERROR: [TRT]: 2: [nvdlaUtils.cpp::getInputDesc::176] Error Code 2: Internal Error (Assertion idx < num failed.Index is out of range of valid number of input tensors.)
Building engine failed
@willosonico, it's an issue in TensorRT: https://forums.developer.nvidia.com/t/error-while-building-engine-on-tensorrt8-0-2/202978/3.
https://forums.developer.nvidia.com/t/how-to-use-dla-in-deepstream-yolov5/161550/25
Hi @marcoslucianops, I used deepstream-yolov4 and checked the engine; it was built on the GPU. I saw the article before. How do I modify the following code to build a DLA engine?
https://github.com/marcoslucianops/DeepStream-Yolo/blob/470ed82658a5546b55185b3223f8057ecf54cf88/native/nvdsinfer_custom_impl_Yolo/yolo.cpp#L74-L81
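A hedged guess, assuming the linked block is where the engine gets built from an nvinfer1::IBuilderConfig: adding the DLA settings on the config before the build call should be enough. The core index and precision flag are assumptions; see the fuller sketch earlier in the thread.
// Hypothetical addition before the engine-build call in the linked block
config->setDefaultDeviceType(nvinfer1::DeviceType::kDLA);
config->setDLACore(0);                                  // assumption: DLA core 0
config->setFlag(nvinfer1::BuilderFlag::kGPU_FALLBACK);  // fall back to GPU for unsupported layers
config->setFlag(nvinfer1::BuilderFlag::kFP16);          // DLA requires FP16 or INT8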