intel / onnxruntime

ONNX Runtime: cross-platform, high performance scoring engine for ML models
MIT License

Add model_priority as a provider option #348

Closed sspintel closed 5 months ago

sspintel commented 6 months ago

Adds `model_priority`, a high-level OpenVINO model priority hint, as a provider option. The hint indicates which model should be allocated the more performant bounded resources first.

This optional parameter hints to the scheduler whether a workload has higher or lower quality-of-service (QoS) needs.

Valid values are: LOW, MEDIUM, HIGH, DEFAULT
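For illustration, here is a minimal sketch of how a caller might validate the value and build the provider options before creating a session. The helper name `build_openvino_provider_options` is hypothetical and not part of this PR. The option key `model_priority` is taken from the PR title, and the model path in the commented usage is a placeholder.

```python
# Valid values as listed in this PR description.
VALID_MODEL_PRIORITIES = ("LOW", "MEDIUM", "HIGH", "DEFAULT")


def build_openvino_provider_options(model_priority="DEFAULT"):
    """Hypothetical helper: return a provider-options dict for the OpenVINO EP.

    Raises ValueError if model_priority is not one of the listed values.
    """
    if model_priority not in VALID_MODEL_PRIORITIES:
        raise ValueError(
            f"model_priority must be one of {VALID_MODEL_PRIORITIES}, "
            f"got {model_priority!r}"
        )
    # Assumption: the option key matches the name in the PR title.
    return {"model_priority": model_priority}


options = build_openvino_provider_options("HIGH")

# Assumed usage with an OpenVINO-enabled build of ONNX Runtime
# ("model.onnx" is a placeholder path):
#
# import onnxruntime as ort
# session = ort.InferenceSession(
#     "model.onnx",
#     providers=["OpenVINOExecutionProvider"],
#     provider_options=[options],
# )
```

Validating on the caller side is optional; it only surfaces a typo in the priority string before the session is created.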

sspintel commented 5 months ago

Closing and cherry-picking changes to another branch as this one has a lot of conflicts.