Error Code 1: Cuda Runtime (an illegal memory access was encountered)

lzd-1230 commented 1 year ago

Environment

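I'm using a laptop 3060 GPU with CUDA 11.2 and cuDNN 8.1.1. The project's .trt file comes from a .pt file trained from the yolov8n-seg model. The .pt file was first converted to ONNX with export-seg.py from the YOLOv8-TensorRT project:

python .\export-seg.py --weights best.pt --opset 11 --sim --input-shape 1 3 1280 1280 --device cuda:0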

The engine was then built from the ONNX file with trtexec:

trtexec.exe --onnx=best.onnx --saveEngine=yolov8n-seg.trt

The model's input img_size is 1280x1280.

Error log

./yolov8.exe  --model="D:\file_sum\python\highway_defect_yolov8\model_data\yolov8n-seg.trt" --size=1280 --batch_size=1  --img="D:\file_sum\dataset\all\test.jpg" --show
[03/09/2023-19:07:12] [I] model_path = D:\file_sum\python\highway_defect_yolov8\model_data\yolov8n-seg.trt
[03/09/2023-19:07:12] [I] size = 1280
[03/09/2023-19:07:12] [I] batch_size = 1
[03/09/2023-19:07:12] [I] image_path = D:\file_sum\dataset\all\test.jpg
[03/09/2023-19:07:12] [I] is_show = 1
[03/09/2023-19:07:13] [I] [TRT] [MemUsageChange] Init CUDA: CPU +355, GPU +0, now: CPU 9712, GPU 1247 (MiB)
[03/09/2023-19:07:13] [I] [TRT] Loaded engine size: 377 MiB
[03/09/2023-19:07:14] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +747, GPU +264, now: CPU 10529, GPU 1889 (MiB)
[03/09/2023-19:07:14] [W] [TRT] TensorRT was linked against cuDNN 8.4.1 but loaded cuDNN 8.1.1
[03/09/2023-19:07:14] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +379, now: CPU 0, GPU 379 (MiB)
[03/09/2023-19:07:14] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 10528, GPU 1894 (MiB)
[03/09/2023-19:07:14] [W] [TRT] TensorRT was linked against cuDNN 8.4.1 but loaded cuDNN 8.1.1
[03/09/2023-19:07:14] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +481, now: CPU 0, GPU 860 (MiB)
[03/09/2023-19:07:14] [I] the engine's info:
[03/09/2023-19:07:14] [I] idx = 0, images: 1, 3, 1280, 1280,
[03/09/2023-19:07:14] [I] idx = 1, outputs: 1, 33600, 38,
[03/09/2023-19:07:14] [I] the context's info:
[03/09/2023-19:07:14] [I] idx = 0, images: 1, 3, 1280, 1280,
[03/09/2023-19:07:14] [I] idx = 1, outputs: 1, 33600, 38,
[03/09/2023-19:07:14] [I] 1
[03/09/2023-19:07:16] [E] [TRT] 1: [executionContext.cpp::nvinfer1::rt::ExecutionContext::executeInternal::667] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
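
As a sanity check on the shapes printed in the log above, the 33600 in the `outputs` binding matches the number of grid cells that YOLOv8's three detection strides (8, 16, 32) produce for a 1280x1280 input. The helper below is a minimal sketch for reproducing that number; the function name is hypothetical and not part of any project mentioned in this thread.

```python
def yolov8_anchor_count(img_size: int, strides=(8, 16, 32)) -> int:
    """Count prediction cells over all detection strides for a square input.

    Each stride s contributes an (img_size // s) x (img_size // s) grid.
    """
    return sum((img_size // s) ** 2 for s in strides)


# 1280x1280 input: 160*160 + 80*80 + 40*40
print(yolov8_anchor_count(1280))  # 33600
# 640x640 input: 80*80 + 40*40 + 20*20
print(yolov8_anchor_count(640))   # 8400
```

So the engine itself was built for 1280x1280, and passing a different `--size` on the command line does not change the shapes the engine expects.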

Attempted fixes

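What I have tried so far:
① Added 86 to the architectures in CMake, set_property(TARGET ${PROJECT_NAME} PROPERTY CUDA_ARCHITECTURES 60 61 62 70 72 75 86), and rebuilt the project, but it still did not help...
② Reduced the inference image size with --size=640... On reflection that cannot be the cause either, since inference runs correctly in Python...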
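
Since the same model runs correctly in Python, one thing worth checking on the C++ side is whether the host and device buffers are at least as large as the engine's bindings require. An illegal memory access commonly appears when a kernel indexes past an allocation, for example when buffer sizes are derived from an application's own assumptions about the output layout rather than from the engine. The sketch below only computes the expected byte sizes from the shapes in the log; the helper name is hypothetical, and FP32 (4-byte) bindings are an assumption, not something the log states.

```python
from math import prod


def binding_nbytes(shape, itemsize=4):
    """Bytes needed for one binding; itemsize=4 assumes FP32 (an assumption)."""
    return prod(shape) * itemsize


# Shapes copied from the "engine's info" lines in the log above.
bindings = {
    "images": (1, 3, 1280, 1280),
    "outputs": (1, 33600, 38),
}

for name, shape in bindings.items():
    print(f"{name}: {binding_nbytes(shape)} bytes")
# images: 19660800 bytes
# outputs: 5107200 bytes
```

If the application allocates less than these amounts for either binding, or post-processes `outputs` with a stride other than 38 values per row, an out-of-bounds access would be one plausible explanation for the error.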

Aagamshah9 commented 1 year ago

I am also facing the same issue.

terminate called after throwing an instance of 'thrust::system::system_error'
  what():  transform: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered
Aborted (core dumped)

lzd-1230 commented 1 year ago

I recently got a reply from the author. It seems the author does not yet support every YOLOv8 model, including yolov8-seg as in my scenario, so this is not a problem with my configuration; it is a limitation of the project.
