NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Description
When attempting to load ONNX checkpoints that have separate weight and bias files (common for ONNX files larger than 2GB) in TensorRT, the framework searches for these element files (weights and biases) in the current working directory instead of the directory that contains the ONNX checkpoint. This behavior is inconsistent with the ONNX export process, which places all element files into a single directory.
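Stated as plain path arithmetic (the file names are taken from this report; TensorRT's current-working-directory lookup is inferred from the error below):

```python
import os

# Layout produced by the ONNX exporter: the checkpoint and its external-data
# file sit side by side in one directory.
model_path = "save/stdit3_simplified.onnx"
data_location = "stdit3_simplified.onnx.data"  # bare relative name recorded in the model

# Where TensorRT appears to look: relative to the current working directory.
trt_lookup = os.path.join(os.getcwd(), data_location)

# Where the exporter actually put the file: relative to the checkpoint's directory.
exported_path = os.path.join(os.path.dirname(model_path), data_location)

# Unless the process happens to run from inside `save/`, the two paths
# disagree and opening the data file fails.
print(exported_path)
```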
Error Message
When trying to load the exported ONNX checkpoints, the following error message is encountered:
```
[11/20/2024-18:55:41] [TRT] [E] WeightsContext.cpp:178: Failed to open file: stdit3_simplified.onnx.data
[11/20/2024-18:55:41] [TRT] [E] In node -1 with name: and operator: (parseGraph): INVALID_GRAPH: Failed to import initializer
In node -1 with name: and operator: (parseGraph): INVALID_GRAPH: Failed to import initializer
```
Proposed Solution
To resolve this issue, one can move all element files into the current working directory. However, this approach becomes cumbersome and disorganized, especially for large models with hundreds of element files. A more systematic solution might be needed to handle the organization of these files effectively.
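If the files must live in the working directory for now, the shuffling can at least be scripted. A minimal sketch of that workaround (the `stage_external_data` name and the choice to copy rather than move are mine, not anything TensorRT provides):

```python
import os
import shutil

def stage_external_data(model_dir, cwd="."):
    """Copy ONNX external-data sidecar files (*.onnx.data) from the
    checkpoint's directory into the directory TensorRT searches."""
    staged = []
    for name in sorted(os.listdir(model_dir)):
        if name.endswith(".onnx.data"):
            shutil.copy2(os.path.join(model_dir, name), os.path.join(cwd, name))
            staged.append(name)
    return staged

# e.g. stage_external_data("save") before creating the parser
```

Copying keeps the exported layout intact at the cost of disk space; for multi-gigabyte checkpoints an `os.symlink` would avoid the duplication.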
Steps To Reproduce
1. Download `stdit3_simplified.onnx` and `stdit3_simplified.onnx.data` (see Relevant Files) into a `save` directory.
2. Try to parse with the ONNX parser (**failure**):

```python
import tensorrt as trt

path = "save/stdit3_simplified.onnx"
trt_logger = trt.Logger()
explicit_batch_flag = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

with trt.Builder(trt_logger) as builder, \
        builder.create_network(explicit_batch_flag) as network, \
        builder.create_builder_config() as config:
    parser = trt.OnnxParser(network, trt_logger)
    with open(path, 'rb') as model:
        if not parser.parse(model.read()):
            for error in range(parser.num_errors):
                print(parser.get_error(error))
        else:
            print('Completed parsing ONNX model')
```
3. Re-structure the working directory like this:

```
current-working-directory
├── save
│   └── stdit3_simplified.onnx
└── stdit3_simplified.onnx.data
```
4. Try to parse again with the ONNX parser (**success**).
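An alternative to re-structuring is to make the two directories coincide: parse from inside the checkpoint's directory, so relative sidecar names resolve correctly. A sketch; the `in_directory` wrapper is my own, not a TensorRT API:

```python
import os
from contextlib import contextmanager

@contextmanager
def in_directory(path):
    """Temporarily chdir into `path` so relative sidecar files resolve there."""
    prev = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(prev)

# Intended use (TensorRT objects omitted here):
#   with in_directory("save"):
#       parser.parse(open("stdit3_simplified.onnx", "rb").read())
```

It may also be worth trying `parser.parse_from_file(path)`, which takes the model path rather than a byte buffer, though whether it resolves the data file against the model's directory would need to be verified.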
Environment
Relevant Files
https://drive.google.com/drive/folders/1nLdYn8nDPs79ZKNx8TssdQSj4x-p4qfl?usp=sharing
stdit3_simplified.onnx: the ONNX checkpoint
stdit3_simplified.onnx.data: its data