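When load-testing ONNX Runtime from Python with GPU inference, CPU usage also rises; after the load test is stopped with both GPU and CPU saturated, GPU resources are released quickly but the CPU stays fully loaded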

microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

https://onnxruntime.ai

MIT License


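When load-testing ONNX Runtime from Python with GPU inference, CPU usage also rises; after the load test is stopped with both GPU and CPU saturated, GPU resources are released quickly but the CPU stays fully loaded #20657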

Open banbomo opened 3 months ago

banbomo commented 3 months ago

Describe the feature request

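When I load-test ONNX Runtime from Python, inference runs on the GPU, but CPU usage rises as well. Once both the GPU and the CPU are saturated and I stop the load test, GPU resources are released quickly, but the CPU remains at full load.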

Describe scenario use case

As described above. The ONNX Runtime version in use is 1.10.0.

tianleiwu commented 3 months ago

By default, ONNX Runtime uses an arena memory allocator, so memory it has allocated is kept by the arena rather than released. For more information, search for "Share allocator(s) between sessions" and "Memory arena shrinkage" in the documentation: https://onnxruntime.ai/docs/get-started/with-c.html

If you still need help, please share a script and test data to reproduce the issue.
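
For reference, here is a minimal Python sketch of the arena-related settings mentioned above. It is not a confirmed fix for this issue. It assumes the installed ONNX Runtime build supports the `memory.enable_memory_arena_shrinkage` run config entry and the CUDA provider options `arena_extend_strategy` and `gpu_mem_limit`. The helper names, the 2 GiB limit, device id 0, and the model path are hypothetical placeholders. Whether these settings affect the CPU load described in the report has not been established in this thread.

```python
# Sketch only: helper names, the memory limit, and device ids are illustrative.

SHRINK_KEY = "memory.enable_memory_arena_shrinkage"


def cuda_providers(device_id=0, gpu_mem_limit_bytes=2 * 1024**3):
    """Build a providers list that caps the CUDA arena and grows it only as requested."""
    cuda_options = {
        "device_id": device_id,
        # Extend the arena by the requested size instead of the default growth policy.
        "arena_extend_strategy": "kSameAsRequested",
        # Upper bound on the CUDA arena size, in bytes (assumed value).
        "gpu_mem_limit": gpu_mem_limit_bytes,
    }
    return [("CUDAExecutionProvider", cuda_options), "CPUExecutionProvider"]


def shrinkage_value(cpu_ids=(0,), gpu_ids=(0,)):
    """Format the device list for the shrinkage entry, e.g. 'cpu:0;gpu:0'."""
    parts = [f"cpu:{i}" for i in cpu_ids] + [f"gpu:{i}" for i in gpu_ids]
    return ";".join(parts)


def make_session(model_path):
    """Create a session with the capped CUDA arena (requires onnxruntime-gpu)."""
    import onnxruntime as ort  # imported lazily so the helpers above work without it

    return ort.InferenceSession(model_path, providers=cuda_providers())


def run_with_shrinkage(session, feeds):
    """Run once and ask ONNX Runtime to shrink the listed arenas after the run."""
    import onnxruntime as ort

    run_options = ort.RunOptions()
    run_options.add_run_config_entry(SHRINK_KEY, shrinkage_value())
    return session.run(None, feeds, run_options=run_options)
```

A caller would create the session once with `make_session("model.onnx")` (placeholder path) and then call `run_with_shrinkage(session, feeds)` for each request, or only periodically, since shrinking after every run trades some latency for a lower retained footprint.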