NVIDIA-AI-IOT / cuDLA-samples

YOLOv5 on Orin DLA

IOFormat #36

Open Railcalibur opened 4 months ago

Railcalibur commented 4 months ago

I used --inputIOFormats=fp16:chw16 --outputIOFormats=fp16:chw16 --buildDLAStandalone to build a DLA loadable. The input is an image with shape NCHW, but the engine build failed:

[05/13/2024-15:40:54] [W] [TRT] I/O reformatting for region img, formats [in] Half(225280,225280:16,640,1), [out] Int8(225280,1:4,640,1)
[05/13/2024-15:40:54] [W] [TRT] No implementation conforms with I/O format restrictions; at least 1 reformatting nodes are needed.
[05/13/2024-15:40:54] [E] Error[4]: [optimizer.cpp::checkIfDirectIOIsPossible::4316] Error Code 4: Internal Error (BuilderFlag::kDIRECT_IO specified but no conformant implementation exists)
[05/13/2024-15:40:54] [E] Error[2]: [builder.cpp::buildSerializedNetwork::751] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
[05/13/2024-15:40:54] [E] Engine could not be created from network
[05/13/2024-15:40:54] [E] Building engine failed
[05/13/2024-15:40:54] [E] Failed to create engine from model or file.
[05/13/2024-15:40:54] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=translated/model_noqdq.onnx --calib=translated/model_precision_config_calib.cache --useDLACore=0 --int8 --fp16 --saveEngine=model.dla --precisionConstraints=prefer --layerPrecisions=xxx, --inputIOFormats=fp16:chw16 --outputIOFormats=fp16:chw16 --buildDLAStandalone
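As a side note on the numbers in the warning: the 225280 that appears in both format strings (Half(225280,225280:16,640,1) and Int8(225280,1:4,640,1)) is just H*W of the input plane. One shape consistent with these strides would be 1x3x352x640 -- this is only a guess, since the log does not show the actual network input shape:

```shell
# Hypothetical check: if the input were 352x640 (not confirmed by the log),
# H*W would match the 225280 stride printed in the I/O format warning.
echo $((352 * 640))   # prints 225280
```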

The first layer of the model is a conv. I built the engine successfully with --inputIOFormats=int8:dla_hwc4 --outputIOFormats=fp16:chw16 --buildDLAStandalone.

If I fold the image mean and std into the input by adding a normalization computation at the beginning of the model, the engine builds successfully with --inputIOFormats=fp16:chw16 --outputIOFormats=fp16:chw16 --buildDLAStandalone.

I wonder why this error happens and how to deal with it if I want to use fp16 input?

lynettez commented 1 month ago

Hi @Railcalibur, the error occurred because the first layer was set to use int8 precision, so with kDIRECT_IO no implementation could consume the fp16:chw16 input directly. If you want to use fp16 input, please set the precision of the first layer to fp16.
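For reference, a sketch of what that might look like on the trtexec command line, reusing the flags from the failing invocation above. Conv_0 is a placeholder: substitute the real name of your first layer (e.g. from trtexec --verbose output). --precisionConstraints=obey and the name:precision syntax of --layerPrecisions are standard trtexec options, but the layer name here is an assumption:

```shell
# Hedged sketch: force the first conv to run in fp16 so it can consume the
# fp16:chw16 network input directly. "Conv_0" is a hypothetical layer name;
# replace it with the actual first-layer name from your network.
/usr/src/tensorrt/bin/trtexec \
    --onnx=translated/model_noqdq.onnx \
    --calib=translated/model_precision_config_calib.cache \
    --useDLACore=0 --int8 --fp16 \
    --precisionConstraints=obey \
    --layerPrecisions=Conv_0:fp16 \
    --inputIOFormats=fp16:chw16 \
    --outputIOFormats=fp16:chw16 \
    --buildDLAStandalone \
    --saveEngine=model.dla
```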