NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

'tensorrt_bindings.tensorrt.IBuilderConfig' object has no attribute 'max_workspace_size' #3816

Open StephenMaturrin opened 6 months ago

StephenMaturrin commented 6 months ago

Description

In several sections of the guide, particularly the code examples and some of the accompanying text, it is stated that the maximum workspace size should be set via the config.max_workspace_size attribute during engine creation. Elsewhere, however, the guide suggests using config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 20) for the same purpose.

This inconsistency makes it unclear which approach is recommended. Clarification on the correct way to configure the workspace memory size during engine creation would be greatly appreciated.

Environment

TensorRT Version: 10.0.06b

NVIDIA GPU: RTX3070

NVIDIA Driver Version: 12.1

CUDA Version: 12

CUDNN Version: 8.9.2

StephenMaturrin commented 6 months ago

The same inconsistency exists between builder.build_serialized_network and builder.build_cuda_engine [deprecated].

lix19937 commented 6 months ago

set_max_workspace_size has been deprecated since v8.4. From the release notes:

Deprecated And Removed Features

The following features are deprecated in TensorRT 8.4.0 EA:

  • The following C++ API functions and classes were deprecated:
    • IFullyConnectedLayer
    • getMaxWorkspaceSize
    • setMaxWorkspaceSize
  • The following Python API functions and classes were deprecated:
    • IFullyConnectedLayer
    • get_max_workspace_size
    • set_max_workspace_size
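Putting the deprecations together, a minimal sketch of a TensorRT 10-style build path might look like the following. This is an illustration, not official guidance: it assumes TensorRT 10 with the `tensorrt` Python package installed and an ONNX model as input, and `build_engine`, `onnx_path`, and `workspace_bytes` are names invented here for the example.

```python
def build_engine(onnx_path, workspace_bytes=1 << 30):
    """Sketch: build a serialized TensorRT engine from an ONNX file,
    using the non-deprecated workspace API. `onnx_path` is hypothetical.
    """
    import tensorrt as trt  # imported lazily so the sketch stays self-contained

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    # TensorRT 10 networks are always explicit-batch; no creation flags needed.
    network = builder.create_network(0)

    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError("ONNX parse failed")

    config = builder.create_builder_config()
    # Replaces the removed config.max_workspace_size attribute (deprecated in 8.4).
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, workspace_bytes)

    # Replaces the removed builder.build_cuda_engine; returns serialized bytes.
    return builder.build_serialized_network(network, config)
```

The function keeps the `tensorrt` import inside its body, so the module can be loaded on machines without TensorRT and the import cost is only paid when an engine is actually built.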
Ashutosh1995 commented 6 months ago

How to resolve this issue then ?

lix19937 commented 6 months ago

@Ashutosh1995

config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 20) # 1 MiB

serialized_engine = builder.build_serialized_network(network, config)

ref https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/infer/Core/BuilderConfig.html#tensorrt.IBuilderConfig.set_memory_pool_limit
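One detail worth noting about the snippet above: the second argument to set_memory_pool_limit is a plain byte count, so shift expressions like `1 << 20` are just a compact way to write powers-of-two sizes. A quick sanity check of the arithmetic (pure Python, no TensorRT required):

```python
# Powers-of-two byte sizes commonly passed as memory pool limits.
MiB = 1 << 20   # 1,048,576 bytes
GiB = 1 << 30   # 1,073,741,824 bytes

print(MiB)      # the 1 MiB limit used in the example above
print(4 * GiB)  # e.g. a 4 GiB workspace for larger models
```

Note that 1 MiB is a very small workspace; real models typically need much more, so a value on the order of `1 << 30` (1 GiB) or higher is a more common starting point.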