core dump in bart trt engine

NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

https://developer.nvidia.com/tensorrt

Apache License 2.0


core dump in bart trt engine #1793

Open lonelydancer opened 2 years ago

lonelydancer commented 2 years ago

Description

When I use trtexec to build a TensorRT engine for BART, it crashes with a core dump.

Environment

TensorRT Version: 8.2.1
NVIDIA GPU: T4
NVIDIA Driver Version: 470.82.01
CUDA Version: 11.5.50
CUDNN Version:
Operating System: Ubuntu 20.04.3 LTS
Python Version (if applicable): 3.8.10
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version):
onnx 1.7.0
onnx-graphsurgeon 0.3.14
onnxruntime 1.8.0
onnxruntime-gpu 1.8.0

Relevant Files

Steps To Reproduce

https://github.com/huggingface/transformers/tree/v4.15.0/examples/onnx/pytorch/summarization
1. python run_onnx_exporter.py --model_name_or_path facebook/bart-base
2. trtexec --onnx=BART.onnx --workspace=64 --minShapes=input_ids:1x1 --optShapes=input_ids:1x32 --maxShapes=input_ids:1x64 --buildOnly --saveEngine=test.engine
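For anyone trying to reproduce this with other shape ranges, here is a minimal Python sketch that assembles the same trtexec command line from a min/opt/max profile. It only uses the flags that appear in the command above; the helper names (`shape_spec`, `build_trtexec_cmd`) are hypothetical and not part of TensorRT. The ordering check reflects the usual requirement that each dimension satisfies min <= opt <= max.

```python
import shlex


def shape_spec(name, dims):
    """Format a shape the way the command above does, e.g. input_ids:1x32."""
    return f"{name}:" + "x".join(str(d) for d in dims)


def build_trtexec_cmd(onnx_path, engine_path, input_name,
                      min_dims, opt_dims, max_dims, workspace_mb=64):
    """Build the trtexec argument list for one dynamic-shape input.

    Hypothetical helper: it only mirrors the flags used in the report.
    """
    if not (len(min_dims) == len(opt_dims) == len(max_dims)):
        raise ValueError("min, opt and max shapes must have the same rank")
    for lo, mid, hi in zip(min_dims, opt_dims, max_dims):
        if not (1 <= lo <= mid <= hi):
            raise ValueError(f"expected 1 <= min <= opt <= max, got {lo}, {mid}, {hi}")
    return [
        "trtexec",
        f"--onnx={onnx_path}",
        f"--workspace={workspace_mb}",
        f"--minShapes={shape_spec(input_name, min_dims)}",
        f"--optShapes={shape_spec(input_name, opt_dims)}",
        f"--maxShapes={shape_spec(input_name, max_dims)}",
        "--buildOnly",
        f"--saveEngine={engine_path}",
    ]


# Values taken from the reproduction steps above.
cmd = build_trtexec_cmd("BART.onnx", "test.engine", "input_ids",
                        (1, 1), (1, 32), (1, 64))
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` on a machine where trtexec is on the PATH. `shlex.join` needs Python 3.8 or later, which matches the environment listed above.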

zerollzeng commented 2 years ago

can you share the onnx model here?

lonelydancer commented 2 years ago

zerollzeng

Link: https://pan.baidu.com/s/1obbZhy-tfjch-0B3INyvUg Password: 9d6n

zerollzeng commented 2 years ago

Sorry, can you share the model via GitHub directly? Baidu Netdisk has poor download speed.

lonelydancer commented 2 years ago


The model is too large to upload to GitHub. How about Google Drive? https://drive.google.com/file/d/1ICVcY6PdeozIICzJj-eCwbuyJm_CoYAQ/view?usp=sharing

lonelydancer commented 2 years ago

@zerollzeng do you have any idea?

zerollzeng commented 2 years ago

This looks like a TensorRT bug; we are debugging it internally. Thanks for reporting it.

LightSun commented 2 years ago

I hit a similar error when converting ONNX to TensorRT with a dynamic batch size. A static batch size works fine.

TensorRT Version: 8.2.2.1
NVIDIA GPU: rtx3070
NVIDIA Driver Version: 470.103.01
CUDA Driver Version: 11.4
CUDA runtime: 11.1
CUDNN Version: 
Operating System: Ubuntu 18.04 LTS, Linux kernel 5.4.0-100-generic

I fixed it by changing the NVIDIA driver version.

lonelydancer commented 2 years ago


Which driver version?

LightSun commented 2 years ago

@lonelydancer my version is 470.74.
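Since the driver version is the variable being compared here, a small Python sketch for reading and comparing it may help others report their setup consistently. It assumes `nvidia-smi` is on the PATH and uses its `--query-gpu=driver_version` query; the function names are hypothetical, and the sketch makes no claim about which driver versions are affected.

```python
import shutil
import subprocess


def parse_driver_version(text):
    """Turn a driver version string such as '470.103.01' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def query_driver_version():
    """Return the installed driver version string, or None if it cannot be read."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0 or not result.stdout.strip():
        return None
    # nvidia-smi prints one line per GPU; take the first one.
    return result.stdout.strip().splitlines()[0].strip()


if __name__ == "__main__":
    installed = query_driver_version()
    if installed is None:
        print("Could not read the driver version (is nvidia-smi installed?)")
    else:
        print("Installed driver:", installed, parse_driver_version(installed))
```

Comparing the parsed tuples orders the versions mentioned in this thread numerically, so 470.74 sorts before 470.82.01, which sorts before 470.103.01.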

nvpohanh commented 2 years ago

@zerollzeng Was an internal tracker (bug) filed for this?