@fj-y-saito OpenVINO as a library provides ov::inference_num_threads, ov::hint::scheduling_core_type, and ov::hint::enable_cpu_pinning to limit the number of inference threads and bind them to a specific type of CPU core.
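For example, here is a minimal sketch of these properties used through the Python API. The string config keys and the "PCORE_ONLY" value are assumptions based on the 2023.x property names and may differ between releases; the model path is a placeholder.

```python
import openvino as ov

core = ov.Core()
model = core.read_model("model.xml")  # placeholder model path

# Limit the number of inference threads, keep them on performance cores only,
# and pin them. Key/value spellings assume the 2023.x CPU plugin config names.
config = {
    "INFERENCE_NUM_THREADS": 4,
    "SCHEDULING_CORE_TYPE": "PCORE_ONLY",
    "ENABLE_CPU_PINNING": True,
}
compiled = core.compile_model(model, "CPU", config)
```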
However, OpenVINO does not support binding threads to specific processors, since each processor may be a different core type.
Users can restrict which processors are used at the application level, for example with numactl. However, numactl works at the process level and cannot help in the case where two models run in one process.
Suppose that I have 8 CPU cores and two resnet50 models, A and B, with different classification tasks. What I'd like to see is that model A only uses cores 0-3, while model B only uses cores 4-7. I tried setting inference_num_threads to 4 and enable_cpu_pinning to false, but the performance turns out worse than the default config (both A and B use cores 0-7).
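For reference, a minimal sketch of the configuration described above. The model file names are placeholders, and the string config keys assume the 2023.x Python API spellings.

```python
import openvino as ov

core = ov.Core()

# Attempted configuration: cap each model at 4 threads and disable pinning,
# hoping the two models end up on disjoint cores. In practice both still
# float across cores 0-7 and performance is worse than the default config.
config = {
    "INFERENCE_NUM_THREADS": 4,
    "ENABLE_CPU_PINNING": False,
}
model_a = core.compile_model("resnet50_a.xml", "CPU", config)
model_b = core.compile_model("resnet50_b.xml", "CPU", config)
```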
Core binding would benefit this common use case.
@wangleis I want to control the core binding of OpenVINO and related libraries like ACL or oneDNN separately. As in KindRoach's case, numactl can't satisfy my demand either. So I think a new feature needs to be added to OpenVINO.
@KindRoach @fj-y-saito For your example, there are two potential options:
In option 1, OV provides a new low-level property to set a CPU mask for each model. The advantage of this option is that the user has explicit control over which model runs on any combination of CPUs. However, the limitation of this option is scaling: the CPU mask for platform A may map to a different type of CPU on platform B, so an application using this option may not get good performance across different platforms.
In option 2, OV provides a new high-level property to set a reservation hint for each model. The advantage of this option is scaling: when the user enables this hint for one model, OV will allocate dedicated CPU cores for that model which will not be used by other models, so different CPU mappings on different platforms will not impact performance. However, the limitation of this option is that the user cannot assign specific CPU ids to a specific model.
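A rough sketch of how the two options might look from the Python API. Both property names below (CPU_MASK and ENABLE_CPU_RESERVATION) are hypothetical illustrations of the proposal, not existing OpenVINO properties, and the model paths are placeholders.

```python
import openvino as ov

core = ov.Core()

# Option 1 (hypothetical): an explicit low-level CPU mask per compiled model.
# "CPU_MASK" is an invented property name used only to illustrate the idea.
model_a = core.compile_model("model_a.xml", "CPU", {"CPU_MASK": "0-3"})
model_b = core.compile_model("model_b.xml", "CPU", {"CPU_MASK": "4-7"})

# Option 2 (hypothetical): a high-level reservation hint; OV would pick and
# reserve dedicated cores for each model on whatever platform it runs on.
reserve = {"INFERENCE_NUM_THREADS": 4, "ENABLE_CPU_RESERVATION": True}
model_a = core.compile_model("model_a.xml", "CPU", reserve)
model_b = core.compile_model("model_b.xml", "CPU", reserve)
```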
May I know which option is better in your real use case? Could you share more details of your use case and explain why?
@wangleis Not related to my problem, but to answer your question, I think the former is better. This is because people who care about core bindings would probably want to design their own binding pattern. In this way, if performance is not as expected, we can heuristically try and tune binding patterns until we are satisfied. If I may add one thing, I think it would be easier to use if it could be controlled with environment variables.
I have come to think that my suggestion doesn't make sense, because it is not natural for OpenVINO to have an API for controlling lower-level libraries. So I would like to withdraw this issue. Should I close this issue, or continue the discussion with @KindRoach?
@fj-y-saito Thanks for sharing. Then please close this issue.
@KindRoach If you would like to continue discussing this issue, please create a new issue for your question.
@wangleis I will close this issue. Thank you very much for your support.
Request Description
Binding threads to specific cores, like the sched_setaffinity system call: https://man7.org/linux/man-pages/man2/sched_setaffinity.2.html
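For reference, a minimal Python illustration of this kind of pinning using os.sched_setaffinity, which wraps the system call linked above (Linux only):

```python
import os

# Pin the current process (pid 0 means "this process") to cores 0-3.
os.sched_setaffinity(0, {0, 1, 2, 3})
print(os.sched_getaffinity(0))  # expected: {0, 1, 2, 3}
```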
Feature Use Case
I'm trying to bind OpenVINO and ACL threads to specific CPU cores such as:
There are four affinity patterns for setting CPU affinity through the OpenVINO Python API core.compile_model. Available affinity patterns: https://docs.openvino.ai/2023.2/enumov_1_1Affinity.html#details-group-ov-runtime-cpp-prop-api-1ga72b7c6cde7f94f07bc1519464c55e9c5
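A minimal sketch of applying one of these patterns, assuming the 2023.2 Python API accepts the "AFFINITY" config key with the enum values from the link above (NONE, CORE, NUMA, HYBRID_AWARE); the model path is a placeholder.

```python
import openvino as ov

core = ov.Core()
model = core.read_model("model.xml")  # placeholder model path

# Pick one of the four affinity patterns; none of them lets me name the
# exact cores the threads should be bound to.
compiled = core.compile_model(model, "CPU", {"AFFINITY": "CORE"})
```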
But these patterns do not support specifying cores as binding targets.