NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

[ONNXParser] TensorRT Fails to Load ONNX Checkpoints with Separated Weight and Bias Files #4257

Open theanh-ktmt opened 3 days ago

theanh-ktmt commented 3 days ago

Description

When attempting to load ONNX checkpoints that have separated weight and bias files (common for ONNX files larger than 2GB) in TensorRT, the framework searches for these element files (weights and biases) in the current working directory instead of the directory that contains the ONNX checkpoint. This behavior is inconsistent with the ONNX export process, which places all element files into a single directory.
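As a minimal illustration of the path-resolution mismatch (pure Python, no TensorRT required; the file names are taken from this issue), ONNX records each external tensor's `location` as a path relative to the model file, so a loader should join it with the model's directory rather than with the current working directory:

```python
import os

# Hypothetical sketch of the two resolution strategies.
# ONNX stores the external tensor "location" relative to the model file.
model_path = "save/stdit3_simplified.onnx"
location = "stdit3_simplified.onnx.data"   # as recorded in the initializer

# What TensorRT appears to do: resolve against the current working directory.
wrong = os.path.normpath(os.path.join(os.getcwd(), location))

# What onnx.load does: resolve against the model's own directory.
right = os.path.normpath(os.path.join(os.path.dirname(model_path), location))

print(wrong)   # <cwd>/stdit3_simplified.onnx.data -> file not found
print(right)   # save/stdit3_simplified.onnx.data  -> where export put it
```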

Error Message

When trying to load the exported ONNX checkpoints, the following error message is encountered:

[11/20/2024-18:55:41] [TRT] [E] WeightsContext.cpp:178: Failed to open file: stdit3_simplified.onnx.data
[11/20/2024-18:55:41] [TRT] [E] In node -1 with name:  and operator:  (parseGraph): INVALID_GRAPH: Failed to import initializer
In node -1 with name:  and operator:  (parseGraph): INVALID_GRAPH: Failed to import initializer

Proposed Solution

The current workaround is to move all element files into the working directory from which TensorRT is invoked. However, this approach quickly becomes cumbersome and disorganized, especially for large models with hundreds of element files.

# error
current-working-directory
└── save
    ├── stdit3.onnx
    ├── linear.1.bias
    ├── linear.1.weight
    └── ...

# proposed solution
current-working-directory
├── save
│   └── stdit3.onnx
├── linear.1.bias
├── linear.1.weight
└── ...

A more systematic solution might be needed to handle the organization of these files effectively.
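Until then, one stopgap that avoids scattering weight files across the working directory is to temporarily switch the working directory to the checkpoint's folder while parsing. A sketch using only the standard library (the `save/` layout is assumed from this issue):

```python
import contextlib
import os

@contextlib.contextmanager
def working_directory(path):
    """Temporarily chdir to `path`, restoring the previous CWD afterwards."""
    old = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(old)

# Usage sketch: parse from inside "save/" so TensorRT finds the element
# files (e.g. linear.1.weight) sitting next to stdit3.onnx.
# with working_directory("save"):
#     parser.parse(open("stdit3.onnx", "rb").read())
```

This keeps the export layout intact and restores the original working directory even if parsing raises an exception.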

Environment

Relevant Files

https://drive.google.com/drive/folders/1nLdYn8nDPs79ZKNx8TssdQSj4x-p4qfl?usp=sharing

Steps To Reproduce

  1. Download the ONNX checkpoint and its data from the Google Drive link above, then structure the working directory like this:
    current-working-directory
    └── save
        ├── stdit3_simplified.onnx
        └── stdit3_simplified.onnx.data
  2. Try to load with ONNX (success)
    import onnx
    path = "save/stdit3_simplified.onnx"
    model = onnx.load(path)
  3. Try to parse with TensorRT (fails with the error message above):

    import tensorrt as trt

    path = "save/stdit3_simplified.onnx"
    trt_logger = trt.Logger()
    explicit_batch_flag = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

    with trt.Builder(trt_logger) as builder, \
         builder.create_network(explicit_batch_flag) as network, \
         builder.create_builder_config() as config:
        parser = trt.OnnxParser(network, trt_logger)
        with open(path, 'rb') as model:
            if not parser.parse(model.read()):
                for error in range(parser.num_errors):
                    print(parser.get_error(error))
        print('Completed parsing ONNX model')
  4. Re-structure the working directory like this:
    current-working-directory
    ├── save
    │   └── stdit3_simplified.onnx
    └── stdit3_simplified.onnx.data
  5. Try to parse again with the TensorRT ONNX parser (**success**)