This is due to a missing target for the TRT build, and it will be fixed in JetPack 5.0.
I suspected the carrier board's software patch to be partly responsible.
Therefore I tried another carrier board I had (salvaged from a Xavier 32GB devkit, reference 945-82972-0045-0000) and reflashed everything with JetPack 4.6.1 without any issue.
However, when executing trtexec --fp16 --onnx=/home/maxime/model.onnx --saveEngine=out.engine
I have the following error (different from the previous one):
[05/24/2022-03:19:03] [E] Error[2]: [utils.cpp::checkMemLimit::380] Error Code 2: Internal Error (Assertion upperBound != 0 failed. Unknown embedded device detected. Please update the table with the entry: {{1794, 8, 64}, 51309},)
[05/24/2022-03:19:03] [E] Error[2]: [builder.cpp::buildSerializedNetwork::609] Error Code 2: Internal Error (Assertion enginePtr != nullptr failed. )
[05/24/2022-03:19:03] [E] Engine could not be created from network
[05/24/2022-03:19:03] [E] Building engine failed
[05/24/2022-03:19:04] [E] Failed to create engine from model.
[05/24/2022-03:19:04] [E] Engine set up failed
It indeed looks like a missing target for the TRT build.
a) If I understand correctly, the whole TensorRT suite is incompatible with the Xavier AGX 64GB, and there is no plan to support it on JetPack 4.6.x? That is very surprising, since this NVIDIA article claims the opposite.
b) Is there any workaround for JetPack 4.6.x?
The OS does support the AGX 64GB, but TRT was not tested on it, so we didn't catch this failure. As Zero said, this will be fixed in JP5.0, and for JP4.6.x, I think the only workaround is to somehow limit the memory to 32GB so that the target checking logic in TRT doesn't fail.
In the latest TRT, we have changed that part of the logic so that it still works if the device is not in the pre-defined list. So if there are new devices coming out in the future, TRT will not fail at this part again.
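For JP4.6.x, one possible (untested) way to do that memory limiting is to cap the memory the kernel uses with the standard mem= boot parameter; on JetPack 4.x the kernel command line is set in /boot/extlinux/extlinux.conf. This is only a sketch: the entries below are the typical JetPack defaults for that file, and it is an assumption that TRT's embedded-device lookup goes by the reduced total memory and would then match the existing 32GB entry.

```
# /boot/extlinux/extlinux.conf on JetPack 4.x -- back up the file before editing.
# Appending "mem=32G" caps the memory the kernel uses at 32GB (untested assumption
# that TRT's checkMemLimit table lookup will then match the 32GB Xavier entry).
LABEL primary
      MENU LABEL primary kernel
      LINUX /boot/Image
      INITRD /boot/initrd
      APPEND ${cbootargs} quiet root=/dev/mmcblk0p1 rw rootwait mem=32G
```

After a reboot, free -g (or /proc/meminfo) should report roughly 32GB; if it does, rerun the trtexec command and check whether the checkMemLimit assertion still fires.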
Closing due to >14 days without activity. Please feel free to reopen if the issue still exists. Thanks
I also faced this issue on the Jetson AGX Xavier 64GB model with JetPack 4.6.2. Currently there is no JetPack 5.x-compatible BSP for the carrier board I'm using.
@nvpohanh
I think the only workaround is to somehow limit the memory to 32GB so that the target checking logic in TRT doesn't fail.
Do you have any idea about how to do this?
I have tried to limit kernel memory when running docker run by adding --kernel-memory=32gb, but it did not work.
I'm converting an ONNX model to a .engine file with:
trtexec --fp16 --onnx=/home/maxime/model.onnx --saveEngine=out.engine
This command used to work well on the NVIDIA Xavier AGX 32GB with JetPack 4.6. However, I recently upgraded to the NVIDIA Xavier AGX 64GB and I get the following segfault (tested on both JetPack 4.6 and 4.6.1).
Environment
TensorRT Version: v8201
CUDA Version: 10.2
Operating System: JetPack 4.6.1
Hardware
NVIDIA GPU: Xavier AGX 64GB with a CTI carrier board (ref AGX111)