openvinotoolkit / openvino

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
https://docs.openvino.ai
Apache License 2.0

[Performance]: How to assign model inference to specific CPUs? #27083

Open LinGeLin opened 6 hours ago

LinGeLin commented 6 hours ago

OpenVINO Version

2024.4.0

Operating System

Ubuntu 20.04 (LTS)

Device used for inference

CPU

OpenVINO installation

Build from source

Programming Language

C++

Hardware Architecture

x86 (64 bits)

Model used

ps model

Model quantization

No

Target Platform

No response

Performance issue description

I am developing a C++ gRPC service that integrates OpenVINO (ov). The project uses multiple thread pools for preprocessing, and I have observed that inference performance is significantly lower than the numbers reported by benchmark_app. I suspect this is caused by thread contention between OpenVINO and the project's preprocessing threads, so I want to test the following setup:

Since the project loads two models simultaneously, I would like to dedicate CPUs 0-11 to Model A, CPUs 12-19 to Model B, and CPUs 20-23 to the rest of the project. However, I have not found an interface in OpenVINO for binding a model to specific CPUs when loading it. Are there any other suggestions? Thank you.

Step-by-step reproduction

No response


wangleis commented 5 hours ago

Hi @LinGeLin, do you run two models in one application process?

LinGeLin commented 5 hours ago

Hi @LinGeLin, do you run two models in one application process?

Yes.