Closed: guanfuchen closed this issue 1 month ago
The speedup comes only from converting the Paddle model to an ONNX model while keeping the preprocessing, front-end processing, and post-processing aligned with the official inference pipeline. Running inference directly through the Paddle framework also has a memory leak.
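The conversion step described above is typically done with the paddle2onnx CLI. A hedged sketch follows; the model directory and file names are typical PaddleOCR conventions, not necessarily this repo's exact ones, and the snippet falls back to a message when paddle2onnx is not installed:

```shell
# Sketch of exporting a PaddleOCR detection model to ONNX.
# Paths and opset are assumptions; adjust for your model.
command -v paddle2onnx >/dev/null && paddle2onnx \
    --model_dir ./ch_PP-OCRv3_det_infer \
    --model_filename inference.pdmodel \
    --params_filename inference.pdiparams \
    --save_file det.onnx \
    --opset_version 11 \
  || echo "paddle2onnx not installed; try: pip install paddle2onnx"
```

The exported `det.onnx` can then be loaded with onnxruntime, with no dependency on the Paddle runtime at inference time.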
test_ocr.py runs slower on GPU than on CPU: the CPU takes 0.361 s, but the GPU takes 9.466 s. Has anyone else tested this? What could be the cause?
This is related to your CUDA version. Check the compatibility mapping between onnxruntime-gpu releases and CUDA versions.
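One quick diagnostic (a sketch using the stock onnxruntime API) is to check whether the CUDA execution provider actually loaded. With a CUDA/cuDNN mismatch, onnxruntime-gpu silently falls back to CPU, so "GPU" timings are really CPU timings plus provider-initialization overhead:

```python
import importlib.util

def cuda_provider_ready() -> bool:
    """Return True only if onnxruntime is importable AND its CUDA
    execution provider loaded; otherwise inference silently runs on
    the CPU, which can make GPU timings look worse than CPU ones."""
    if importlib.util.find_spec("onnxruntime") is None:
        return False
    import onnxruntime as ort
    return "CUDAExecutionProvider" in ort.get_available_providers()

print(cuda_provider_ready())
```

If this prints `False` on a machine with a working GPU, the CUDA/cuDNN/onnxruntime-gpu versions don't match.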
I tested cudatoolkit 11.3 + cuDNN 8.4.0 + onnxruntime-gpu 1.15.0; cudatoolkit 11.3 + cuDNN 8.2.1 + onnxruntime-gpu 1.14.1; and cudatoolkit 11.6.2 + cuDNN 8.8.0.121 + onnxruntime-gpu 1.17.0. None of them performed well. Which versions is the author using?
Check your local CUDA driver version; it has to match the other components one-to-one: https://blog.csdn.net/qq_38308388/article/details/137679214
Also, when you use onnxruntime-gpu you must first run `pip uninstall onnxruntime`; the two packages cannot be installed at the same time.
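The reason is that both wheels install the same `onnxruntime` import package, so with both present it is unpredictable which build Python resolves. A sketch of the cleanup (the version pin is an example, not a recommendation; match it to your CUDA):

```shell
# Both wheels provide the same "onnxruntime" module, so remove the
# CPU-only wheel before installing the GPU build, e.g.:
#
#   pip uninstall -y onnxruntime
#   pip install onnxruntime-gpu==1.17.0   # pick the build matching your CUDA
#
# Sanity check: exactly one onnxruntime entry should remain.
pip list 2>/dev/null | grep -i onnxruntime || echo "no onnxruntime wheel found"
```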
Hello, I have now wrapped the logic of test_ocr.py into a FastAPI endpoint. After each request I run a cleanup, but GPU memory still keeps growing. Is there anything else I can do? `if torch.cuda.is_available(): torch.cuda.empty_cache()`
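Note that `torch.cuda.empty_cache()` only releases PyTorch's own caching allocator; it cannot free GPU memory owned by ONNX Runtime. A common cause of ever-growing memory in a web service is constructing a new `InferenceSession` per request, since each session allocates its own CUDA memory arena. The usual fix is to build the session once at startup and reuse it (optionally capping the arena with the CUDA provider option `gpu_mem_limit`). A minimal pure-Python sketch of the load-once pattern, where `get_session` is a hypothetical helper:

```python
from functools import lru_cache

# Load-once pattern: the expensive, GPU-memory-owning object is
# created a single time and shared by every request, instead of
# being rebuilt (and leaking an arena) on each call.
@lru_cache(maxsize=1)
def get_session():
    # Stand-in for the real thing, e.g.:
    #   ort.InferenceSession("model.onnx",
    #       providers=[("CUDAExecutionProvider",
    #                   {"gpu_mem_limit": 2 * 1024**3})])
    return object()

# Every request gets the same cached session, not a fresh one.
assert get_session() is get_session()
```

In a FastAPI app, the same effect is achieved by creating the session at module import time or in a startup event handler, then calling `session.run(...)` inside the endpoint.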
Is the inference speedup mainly due to converting the PaddlePaddle model to the ONNX format, or were other modifications made to the model itself?