weihaosky opened this issue 3 years ago
Hi, I'm facing the same issue. I checked all the API references and the code is written correctly. I will check back with you once I have resolved it.
Got the same error when running with TensorRT Python 8.0.0.3.
With nvidia-tensorrt-7.2.3.4 it works fine.
!pip install nvidia-tensorrt==7.2.* --index-url https://pypi.ngc.nvidia.com
Yup, changed back to TensorRT 7 and it works fine.
If you are facing any issues, I suggest just removing and re-cloning the whole repo. You'll need to change the is not
to !=
in dummy_converters.py.
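For context on why that change matters (an illustrative snippet, not the actual dummy_converters.py line): is / is not compare object identity, while == / != compare values, so identity checks on strings behave inconsistently and newer Pythons warn about them:

a = "forward"
b = "".join(["for", "ward"])  # equal in value, but usually a distinct object
print(a != b)       # False: the values are equal
print(a is not b)   # typically True on CPython: different objects, so 'is not' misfires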
I have come across the same problem. Has anyone solved it?
Cloning and installing this fork worked for me: https://github.com/gcunhase/torch2trt
Sorry, I tried this pip install (nvidia-tensorrt==7.2.* from the NGC index) on the Jetson Nano, and it turns out:
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.ngc.nvidia.com
ERROR: Could not find a version that satisfies the requirement nvidia-tensorrt==7.2.* (from versions: none)
ERROR: No matching distribution found for nvidia-tensorrt==7.2.*
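A likely explanation (my assumption, not stated in the error itself): the nvidia-tensorrt wheels on the NGC index are built for x86_64 Linux only, while Jetson boards get TensorRT preinstalled through JetPack, so pip finds no matching distribution there. You can check what JetPack already provides:

# Check the JetPack-provided TensorRT bindings instead of installing from PyPI
import tensorrt
print(tensorrt.__version__)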
I have no idea. Help me.
I am (trying to) use TensorRT 8.2.0.6 and got the same issue. I tried the modified repo mentioned above by @liuanhua110, but it did not work: same error.
So I'm going to roll back to TensorRT 7.1.0.16 (which I know works, since the code I am running works on another machine with TensorRT 7 installed).
Mentioning this here because I spent several hours installing TensorRT 8.2 only to find out there is this compatibility issue going on.
Come on NVIDIA!
The TensorRT API was updated in 8.0.1, so you need to use different commands now. As stated in the release notes, "ICudaEngine.max_workspace_size" and "Builder.build_cuda_engine()", among other deprecated functions, were removed. (See https://docs.nvidia.com/deeplearning/tensorrt/release-notes/tensorrt-8.html#rel_8-0-1)
The current usage that worked for me:
config = builder.create_builder_config()
config.max_workspace_size = 1 << 28
and to build the engine:
plan = builder.build_serialized_network(network, config)
engine = runtime.deserialize_cuda_engine(plan)
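For completeness, a minimal end-to-end sketch of that TensorRT 8 flow, with the surrounding setup filled in (the ONNX path "model.onnx" is a placeholder, not from this thread):

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:  # placeholder path
    parser.parse(f.read())

config = builder.create_builder_config()
config.max_workspace_size = 1 << 28  # valid on TRT 8.x; removed in TRT 10

plan = builder.build_serialized_network(network, config)
with trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(plan)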
--- a/python/app_ScatterND_plugin.py
+++ b/python/app_ScatterND_plugin.py
@@ -36,7 +36,8 @@ def build_engine(shape_data, shape_indices, shape_updates):
exit()
builder = trt.Builder(logger)
- builder.max_workspace_size = 1 << 20
+ config = builder.create_builder_config()
+ config.max_workspace_size = 1 << 20
network = builder.create_network(flags=1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
tensor_data = network.add_input('data', trt.DataType.FLOAT, shape_data)
@@ -49,8 +50,11 @@ def build_engine(shape_data, shape_indices, shape_updates):
]))
)
network.mark_output(layer.get_output(0))
- return builder.build_cuda_engine(network)
+ plan = builder.build_serialized_network(network, config)
+ with trt.Runtime(logger) as runtime:
+ engine = runtime.deserialize_cuda_engine(plan)
+ return engine
Have you solved it? I have come across the same issue.
Same issue on 8.4.1.5, but for 'build_cuda_engine'.
For this issue, this solution worked fine for me:
builder = trt.Builder(TRT_LOGGER)
builder_config = builder.create_builder_config()
builder_config.max_workspace_size = 1 << 30
builder.max_batch_size = 1
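One caveat: config.max_workspace_size was itself deprecated in later 8.x releases and is gone in TRT 10; the replacement, which also appears further down this thread, is:

config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)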
Have you solved this issue eventually? I've been struggling with this one now.
Have you eventually solved the 'build_cuda_engine' issue on 8.4.1.5? If so, could you share your solution?
I have solved the problem. Could you please send me the exact error? You need to refer to the TensorRT documentation and the official Git page.
Hey, I also had this problem with TensorRT 10 and CUDA 12.1. I managed to fix it by uninstalling everything from CUDA and TensorRT and reinstalling CUDA 11.8 and TensorRT 8.
The diff above doesn't work on TensorRT 8 or 10.
Same problem: "max_workspace_size" does not work in TRT 10.
What's more, "max_batch_size" does not work in TRT 10 either. From the Best Practices for TensorRT Performance guide: "The maximum batch size should also be set for the builder when building the optimized network with IBuilder::setMaxBatchSize (Builder.max_batch_size in Python)." But I can't find a way to set Builder.max_batch_size.
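If it helps: with the explicit-batch API there is no Builder.max_batch_size any more; the batch dimension is part of the network input shape, and for dynamic batch sizes the maximum comes from an optimization profile. A sketch (the input name "input" and the shapes are assumptions):

profile = builder.create_optimization_profile()
# min / opt / max shapes; the max batch dimension (32 here) plays the role
# that the old Builder.max_batch_size used to play
profile.set_shape("input", (1, 3, 640, 640), (8, 3, 640, 640), (32, 3, 640, 640))
config.add_optimization_profile(profile)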
@yaobaishen @ivanyordanovgt I have the same problem, "AttributeError: 'tensorrt_bindings.tensorrt.IBuilderConfig' object has no attribute 'max_workspace_size'", when trying to convert yolov8s.pt to ONNX (Ultralytics YOLOv8.1.13 🚀 Python-3.11.5 torch-2.2.0+cu118 CUDA:0 (NVIDIA GeForce RTX 3070 Laptop GPU, 8192MiB) with TensorRT 10.1.0). Did you manage to solve it? Hint: I tried to downgrade TensorRT and install TensorRT 8, but I can't.
The solution, as per https://github.com/NVIDIA/TensorRT/issues/3816:
<...>
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 20) # 1 MiB
serialized_engine = builder.build_serialized_network(network, config)
Note that the resulting engine is already serialized.
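To fill in the elided "<...>" part, a sketch of the usual surrounding setup (the ONNX path is a placeholder; only set_memory_pool_limit and build_serialized_network come from the comment above):

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:  # placeholder path
    parser.parse(f.read())
config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 20)  # 1 MiB
serialized_engine = builder.build_serialized_network(network, config)
# Deserialize when you need an ICudaEngine:
with trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(serialized_engine)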
Thanks @Shushpancheak, looks like it had been deprecated for a long time. Do you know about the max_batch_size option? I cannot find any documentation about it.
Fixed that this way:
profile = builder.create_optimization_profile()
profile.set_shape("images", (1, 3, 640, 640), (1, 3, 640, 640), (1, 3, 640, 640))
config.add_optimization_profile(profile)
config.set_flag(trt.BuilderFlag.INT8)
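One follow-up: enabling trt.BuilderFlag.INT8 on its own usually isn't enough; in the standard calibration workflow you also attach a calibrator and a calibration profile (sketch; my_calibrator is a hypothetical calibrator instance, not something defined above):

config.set_calibration_profile(profile)
config.int8_calibrator = my_calibrator  # hypothetical: an IInt8EntropyCalibrator2 subclass instance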
I was hunting for a quick fix for an issue; even ChatGPT, Gemini, and Copilot couldn't find my solution, but maybe there is something here that can help someone resolve an issue they might have.

import torch
import torch.nn as nn
import torchvision.models as models
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
from PIL import Image
import glob
import os
import tensorrt as trt
import numpy as np
import ctypes
import nvidia_pyindex
import pycuda.driver as cuda  # torch.cuda has no memcpy_htod/mem_alloc; pycuda is presumably what was intended
import pycuda.autoinit
class SUNXDS(nn.Module):
    def __init__(self, num_classes=9):
        super(SUNXDS, self).__init__()
        self.num_classes = num_classes
        # Feature Extraction Backbone
        self.backbone = models.resnet50(pretrained=True)
        self.backbone = nn.Sequential(*list(self.backbone.children())[:-2])
        # Additional Layers
        self.conv_block1 = nn.Sequential(
            nn.Conv2d(2048, 1024, kernel_size=1),
            nn.BatchNorm2d(1024),
            nn.LeakyReLU(0.1),
        )
        self.conv_block2 = nn.Sequential(
            nn.Conv2d(1024, 1024, kernel_size=3, padding=1),
            nn.BatchNorm2d(1024),
            nn.LeakyReLU(0.1),
        )
        # Output layer (predictions)
        self.output_layer = nn.Conv2d(1024, num_classes * 5, kernel_size=1)

    def forward(self, x):
        x = self.backbone(x)
        x = self.conv_block1(x)
        x = self.conv_block2(x)
        out = self.output_layer(x)
        return out
model_path = r"C:\Users\donar\modelexportint8\models\sunxds_0.5.6.pt"  # raw string so the backslashes are not treated as escapes
loaded_data = torch.load(model_path, map_location=torch.device('cpu'))

if isinstance(loaded_data, torch.nn.Module):
    model = loaded_data
else:
    model = SUNXDS()
    model_dict = model.state_dict()
    filtered_dict = {k: v for k, v in loaded_data.items() if k in model_dict}
    model.load_state_dict(filtered_dict, strict=False)
model.eval()
class CalibrationDataset(Dataset):
    def __init__(self, image_dir, transform=None):
        self.image_dir = image_dir
        self.transform = transform
        self.image_paths = glob.glob(os.path.join(image_dir, "*.jpg"))  # Or "*.png", etc.

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        image_path = self.image_paths[idx]
        image = Image.open(image_path).convert('RGB')
        if self.transform:
            image = self.transform(image)
        return image, 0
calibration_data_dir = r'C:\Users\donar\modelexportint8\images\class'
transform = transforms.Compose([
    transforms.Resize((640, 640)),  # Update with your input size
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),  # Common normalization
])
calibration_dataset = CalibrationDataset(calibration_data_dir, transform=transform)
calibration_loader = DataLoader(calibration_dataset, batch_size=1, shuffle=False, num_workers=0)
class Int8Calibrator(trt.IInt8EntropyCalibrator2):  # IInt8Calibrator would also require get_algorithm(); the entropy-2 subclass is the usual concrete base
    def __init__(self, dataloader, cache_file="calibration.cache"):
        trt.IInt8EntropyCalibrator2.__init__(self)
        self.cache_file = cache_file
        self.dataset = dataloader.dataset
        self.batch_size = dataloader.batch_size
        self.current_index = 0
        self.device_input = None  # Initialize as None

        # Get the first batch to determine the size
        loader = DataLoader(calibration_dataset, batch_size=1, shuffle=False, num_workers=0)
        batch, _ = next(iter(loader))
        # Allocate device memory (torch.cuda.memory_allocated only *reports* usage;
        # an actual allocation such as pycuda's mem_alloc is needed here)
        self.device_input = cuda.mem_alloc(batch[0].nbytes * self.batch_size)
        self.generator = iter(loader)
        self.read_calibration_cache()
    def get_batch_size(self):
        return self.batch_size

    def get_batch(self, names):
        if self.current_index < len(self.dataset):
            try:
                images, _ = next(self.generator)  # already a batched tensor from the DataLoader
                batch = np.ascontiguousarray(images.numpy())
                cuda.memcpy_htod(self.device_input, batch)  # host-to-device copy of the calibration batch
                self.current_index += 1
                return [int(self.device_input)]
            except StopIteration:
                return None
        else:
            return None
    def read_calibration_cache(self):
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)

calibrator = Int8Calibrator(calibration_loader, "calibration.cache")
dummy_input = torch.randn(1, 3, 640, 640)  # Adjust input size based on your model
onnx_model_path = "models/export/yolov10.onnx"

torch.onnx.export(
    model,
    dummy_input,
    onnx_model_path,
    verbose=False,
    opset_version=12,
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch_size"}, "output": {0: "batch_size"}},
)
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 20)  # 1 MiB

profile = builder.create_optimization_profile()
profile.set_shape("input", (1, 3, 640, 640), (1, 3, 640, 640), (1, 3, 640, 640))  # must match the input name used in torch.onnx.export above
config.add_optimization_profile(profile)
config.set_flag(trt.BuilderFlag.INT8)  # Enable INT8 mode
with open(onnx_model_path, "rb") as model_file:  # renamed so it does not shadow the 'model' variable above
    if not parser.parse(model_file.read()):
        for error in range(parser.num_errors):
            print(parser.get_error(error))
        exit(-1)

for i in range(network.num_layers):
    layer = network.get_layer(i)
    if layer.type not in [trt.LayerType.SHAPE, trt.LayerType.IDENTITY]:
        layer.precision = trt.DataType.FLOAT
        layer.set_output_type(0, trt.DataType.FLOAT)

config.set_calibration_profile(profile)
config.int8_calibrator = calibrator
with builder.build_engine(network, config) as engine:
    optimized_model_path = 'models/export/yolov10_int8.engine'
    with open(optimized_model_path, "wb") as f:
        f.write(engine.serialize())
Did any of you find the final solution for YOLOv8?
When I used version 10.2.0.19 of TensorRT,

with builder.build_engine(network, config) as engine:

caused an AttributeError: 'tensorrt.tensorrt.Builder' object has no attribute 'build_engine'. Replacing the entire section with:

with builder.build_serialized_network(network, config) as engine:
    engine_path = 'xxx'
    with open(engine_path, 'wb') as f:
        f.write(engine)

worked well.
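And to load the saved plan back for inference later, a minimal sketch (assuming the same TensorRT version at load time):

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open(engine_path, 'rb') as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())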
pip uninstall tensorrt
pip install tensorrt==8.5.1.7
When I run the Usage demo, an error occurs. What is the problem? Many thanks!