NVIDIA-AI-IOT / cuDLA-samples

YOLOv5 on Orin DLA
167 stars · 17 forks

build dla standalone loadable error #2

Closed mrfsc closed 11 months ago

mrfsc commented 11 months ago

When I run `sudo bash data/model/build_dla_standalone_loadable.sh` on Orin, it reports the following errors:

```
[08/17/2023-14:54:41] [I] [TRT] [MemUsageChange] Init CUDA: CPU +218, GPU +0, now: CPU 242, GPU 9671 (MiB)
[08/17/2023-14:54:45] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +351, GPU +332, now: CPU 612, GPU 10022 (MiB)
[08/17/2023-14:54:45] [I] Start parsing network model
[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:9: Message type "onnx2trt_onnx.ModelProto" has no field named "version".
[08/17/2023-14:54:45] [E] [TRT] ModelImporter.cpp:735: Failed to parse ONNX model from file: data/model/yolov5s_trimmed_reshape_tranpose.onnx
[08/17/2023-14:54:45] [E] Failed to parse onnx file
[08/17/2023-14:54:45] [I] Finish parsing network model
[08/17/2023-14:54:45] [E] Parsing model failed
[08/17/2023-14:54:45] [E] Failed to create engine from model or file.
[08/17/2023-14:54:45] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8401] # /usr/src/tensorrt/bin/trtexec --onnx=data/model/yolov5s_trimmed_reshape_tranpose.onnx --verbose --fp16 --saveEngine=data/loadable/yolov5.fp16.fp16chw16in.fp16chw16out.standalone.bin --inputIOFormats=fp16:chw16 --outputIOFormats=fp16:chw16 --buildDLAStandalone --useDLACore=0
```

JetPack: 5.0.2, CUDA: 11.4, cuDNN: 8.4.1.50, TensorRT: 8.4.1.5

Is there any problem running the code with this older JetPack version?

zerollzeng commented 11 months ago

Hi, did you install git-lfs? What's the output of `ls -lh data/model`?

zerollzeng commented 11 months ago

My guess is you didn't pull the model file with git-lfs.
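For context, a Git LFS pointer stub is a tiny text file whose first line starts with `version https://git-lfs...` — which is exactly the `"version"` field the ONNX text-format parser trips over in the error above. A quick self-contained check (the "pointer" file below is fabricated on the fly for illustration, not the real model):

```shell
# Fabricate an LFS-pointer-shaped file in a temp dir for the demo.
tmp=$(mktemp -d)
printf 'version https://git-lfs.github.com/spec/v1\noid sha256:abc\nsize 123\n' > "$tmp/model.onnx"

# Real LFS pointers begin with a "version https://git-lfs..." header line.
is_lfs_pointer() {
  head -c 64 "$1" | grep -q '^version https://git-lfs'
}

if is_lfs_pointer "$tmp/model.onnx"; then
  verdict="LFS pointer: run 'git lfs install && git lfs pull' to fetch the real model"
else
  verdict="real model file"
fi
echo "$verdict"
```

Running the same `head` check against the real `data/model/*.onnx` files tells you immediately whether git-lfs actually fetched them.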

zerollzeng commented 11 months ago

Also, since you are using TRT 8.4, you need to apply the trtexec patch under data/; please refer to the README for how to apply it.
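As a reference for the patching step, `git apply --check` dry-runs a patch and reports whether it would apply cleanly before you modify anything. A minimal self-contained sketch in a throwaway repo (the file and patch names here are made up; the real patch in this thread is applied inside the TensorRT samples tree):

```shell
# Throwaway repo with a two-line file to patch.
tmp=$(mktemp -d) && cd "$tmp"
git init -q .
printf 'line A\nline B\n' > file.txt

# A patch whose context matches the file exactly.
cat > good.patch <<'EOF'
--- a/file.txt
+++ b/file.txt
@@ -1,2 +1,2 @@
 line A
-line B
+line B patched
EOF

# Dry-run first; only apply for real if the check passes.
git apply --check good.patch && echo "patch applies cleanly"
git apply good.patch
grep 'patched' file.txt
```

If `--check` fails, the patch was built against a different source version, which is the situation described below.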

mrfsc commented 11 months ago

Thanks for the reply. It seems I did not install git-lfs before. But when I apply the trtexec patch, it returns errors:

```
warning: samples/common/sampleEngines.cpp has type 100644, expected 100755
error: patch failed: samples/common/sampleEngines.cpp:968
error: samples/common/sampleEngines.cpp: patch does not apply
warning: samples/common/sampleOptions.cpp has type 100644, expected 100755
error: patch failed: samples/common/sampleOptions.cpp:844
error: samples/common/sampleOptions.cpp: patch does not apply
warning: samples/common/sampleOptions.h has type 100644, expected 100755
error: patch failed: samples/common/sampleOptions.h:147
error: samples/common/sampleOptions.h: patch does not apply
```

I tried running `sudo git apply --reject trtexec-dla-standalone.patch` to locate the conflicts; it returns:

```
Checking patch samples/common/sampleEngines.cpp...
warning: samples/common/sampleEngines.cpp has type 100644, expected 100755
error: while searching for:
        setLayerOutputTypes(network, build.layerOutputTypes);
    }

    if (build.safe)
    {
        config.setEngineCapability(sys.DLACore != -1 ? EngineCapability::kDLA_STANDALONE : EngineCapability::kSAFETY);
    }

    if (build.restricted)

error: patch failed: samples/common/sampleEngines.cpp:968
error: while searching for:
        config.setDLACore(sys.DLACore);
        config.setFlag(BuilderFlag::kPREFER_PRECISION_CONSTRAINTS);

        if (sys.fallback)
        {
            config.setFlag(BuilderFlag::kGPU_FALLBACK);

error: patch failed: samples/common/sampleEngines.cpp:986
Checking patch samples/common/sampleOptions.cpp...
warning: samples/common/sampleOptions.cpp has type 100644, expected 100755
error: while searching for:
    getAndDelOption(arguments, "--fp16", fp16);
    getAndDelOption(arguments, "--int8", int8);
    getAndDelOption(arguments, "--safe", safe);
    getAndDelOption(arguments, "--consistency", consistency);
    getAndDelOption(arguments, "--restricted", restricted);
    getAndDelOption(arguments, "--buildOnly", buildOnly);

error: patch failed: samples/common/sampleOptions.cpp:844
error: while searching for:
    }
    if (build.safe && system.DLACore >= 0)
    {
        auto checkSafeDLAFormats = [](std::vector<IOFormat> const& fmt) {
            return fmt.empty() ? false : std::all_of(fmt.begin(), fmt.end(), [](IOFormat const& pair) {
                bool supported{false};
                bool const isLINEAR{pair.second == 1U << static_cast<int>(nvinfer1::TensorFormat::kLINEAR)};
                bool const isCHW4{pair.second == 1U << static_cast<int>(nvinfer1::TensorFormat::kCHW4)};
                bool const isCHW32{pair.second == 1U << static_cast<int>(nvinfer1::TensorFormat::kCHW32)};
                bool const isCHW16{pair.second == 1U << static_cast<int>(nvinfer1::TensorFormat::kCHW16)};
                supported |= pair.first == nvinfer1::DataType::kINT8 && (isLINEAR || isCHW4 || isCHW32);
                supported |= pair.first == nvinfer1::DataType::kHALF && (isLINEAR || isCHW4 || isCHW16);
                return supported;
            });
        };
        if (!checkSafeDLAFormats(build.inputFormats) || !checkSafeDLAFormats(build.outputFormats))
        {
            throw std::invalid_argument(
                "I/O formats for safe DLA capability are restricted to fp16/int8:linear, fp16:chw16 or int8:chw32");
        }
        if (system.fallback)
        {

error: patch failed: samples/common/sampleOptions.cpp:1229
error: while searching for:
    "  type    ::= \"fp32\"|\"fp16\"|\"int32\"|\"int8\"[\"+\"type]"                          "\n"
    "  --calib=<file>              Read INT8 calibration cache file"                         "\n"
    "  --safe                      Enable build safety certified engine"                     "\n"
    "  --consistency               Perform consistency checking on safety certified engine" "\n"
    "  --restricted                Enable safety scope checking with kSAFETY_SCOPE build flag" "\n"
    "  --saveEngine=<file>         Save the serialized engine"                               "\n"

error: patch failed: samples/common/sampleOptions.cpp:1821
Checking patch samples/common/sampleOptions.h...
warning: samples/common/sampleOptions.h has type 100644, expected 100755
error: while searching for:
    LayerPrecisions layerPrecisions;
    LayerOutputTypes layerOutputTypes;
    bool safe{false};
    bool consistency{false};
    bool restricted{false};
    bool buildOnly{false};

error: patch failed: samples/common/sampleOptions.h:147
Applying patch samples/common/sampleEngines.cpp with 2 rejects...
Rejected hunk #1.
Rejected hunk #2.
Applying patch samples/common/sampleOptions.cpp with 3 rejects...
Rejected hunk #1.
Rejected hunk #2.
Rejected hunk #3.
Applying patch samples/common/sampleOptions.h with 1 reject...
Rejected hunk #1.
```

It looks like the new patch conflicts with TRT 8.4?
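For anyone hitting the same output: `git apply --reject` applies whatever hunks still match and writes each failed hunk to a `.rej` file next to its target, which you can then merge by hand. A minimal self-contained demonstration (throwaway repo; the mismatching context line is deliberate):

```shell
# Throwaway repo with a two-line file.
tmp=$(mktemp -d) && cd "$tmp"
git init -q .
printf 'line A\nline B\n' > file.txt

# A patch built against different context ("line X") will not apply.
cat > bad.patch <<'EOF'
--- a/file.txt
+++ b/file.txt
@@ -1,2 +1,2 @@
 line X
-line B
+line B patched
EOF

# Exit status is non-zero because the hunk is rejected; file.txt is left
# untouched and the failed hunk lands in file.txt.rej.
git apply --reject bad.patch 2>/dev/null || true
ls file.txt.rej && echo "edit file.txt by hand using the hunk in file.txt.rej"
```

That is the manual-merge workflow zerollzeng refers to below: read each `.rej`, find the corresponding spot in your TRT 8.4 sources, and make the edits yourself.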

zerollzeng commented 11 months ago

Sorry, I didn't make it clear; the README is already updated: the patch is on top of TRT 8.5, so for other versions you need to apply the patch manually and rebuild. We will see if we can add a TRT 8.4 patch; you can also contribute one, and we would appreciate it :-)

mrfsc commented 11 months ago

Well, it's a hard job to compare TRT 8.4 and TRT 8.6 to make the patch apply, so I tried to update TensorRT only, without updating JetPack:

```
sudo vi /etc/apt/sources.list.d/nvidia-l4t-apt-source.list
```

Alter the file to:

```
deb https://repo.download.nvidia.com/jetson/common r35.4 main
deb https://repo.download.nvidia.com/jetson/t234 r35.4 main
```

then:

```
sudo apt update
sudo apt-get install tensorrt
```

Finally, I updated my TensorRT to 8.5 and ran the DLA standalone build successfully. Thanks for your nice work, and I hope this helps :)

hygxy commented 8 months ago

> sudo apt update && sudo apt-get install tensorrt

Hey, do you know how to switch back to TRT 8.4? I ran your commands but still get errors, so now I want to switch back to TRT 8.4. I restored the file /etc/apt/sources.list.d/nvidia-l4t-apt-source.list to

deb https://repo.download.nvidia.com/jetson/common r35.1 main
deb https://repo.download.nvidia.com/jetson/t234 r35.1 main

and then ran `sudo apt update && sudo apt-get install tensorrt`, but the following error shows up:

```
Reading package lists... Done
Building dependency tree
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 tensorrt : Depends: libnvinfer8 (= 8.4.1-1+cuda11.4) but 8.5.2-1+cuda11.4 is to be installed
            Depends: libnvinfer-plugin8 (= 8.4.1-1+cuda11.4) but 8.5.2-1+cuda11.4 is to be installed
            Depends: libnvparsers8 (= 8.4.1-1+cuda11.4) but 8.5.2-1+cuda11.4 is to be installed
            Depends: libnvonnxparsers8 (= 8.4.1-1+cuda11.4) but 8.5.2-1+cuda11.4 is to be installed
            Depends: libnvinfer-bin (= 8.4.1-1+cuda11.4) but it is not going to be installed
            Depends: libnvinfer-dev (= 8.4.1-1+cuda11.4) but 8.5.2-1+cuda11.4 is to be installed
            Depends: libnvinfer-plugin-dev (= 8.4.1-1+cuda11.4) but 8.5.2-1+cuda11.4 is to be installed
            Depends: libnvparsers-dev (= 8.4.1-1+cuda11.4) but 8.5.2-1+cuda11.4 is to be installed
            Depends: libnvonnxparsers-dev (= 8.4.1-1+cuda11.4) but 8.5.2-1+cuda11.4 is to be installed
            Depends: libnvinfer-samples (= 8.4.1-1+cuda11.4) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
```

zerollzeng commented 8 months ago

In Jetpack, you can't. You have to flash the corresponding Jetpack release.

zerollzeng commented 8 months ago

I would still suggest using the latest JetPack release.

hygxy commented 8 months ago

> I would still suggest using the latest JetPack release.

OK, what's the difference between flashing the corresponding JetPack release and updating all three components (CUDA, cuDNN, TensorRT) via apt-get?

zerollzeng commented 7 months ago

> updating all of the three components (CUDA, cuDNN, TensorRT) via apt-get

We don't officially support this. I know it can be hacked, but those SDKs are only tested with the corresponding JetPack. If you are lucky, it works; if not, we can't help you.