mlcommons / inference_policies

Issues related to MLPerf™ Inference policies, including rules and suggested changes
https://mlcommons.org/en/groups/inference/
Apache License 2.0

Inference Rules Updates for multiple Nodes #265

Closed liorkhe closed 1 year ago

liorkhe commented 1 year ago

Clarification of the Network inference rules for supporting a SUT containing multiple nodes.

github-actions[bot] commented 1 year ago

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

DilipSequeira commented 1 year ago

My recollection is that some people did indeed want to see the latency from the individual SUT nodes as well as their names (I don't have a strong position on this). If we want that, then we should have the QDL report that information to loadgen, and we should require it in the rules.

Even if we are reporting the latency of the individual SUT nodes, I don't see any advantage in querying them serially.
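The point above can be sketched in a small example. This is not the loadgen or QDL API; `query_node`, the node names, and the dispatch helper are all hypothetical stand-ins, shown only to illustrate that querying nodes concurrently still lets you record each node's name and latency, while keeping total wall time near the slowest node rather than the sum of all nodes.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def query_node(node_name, query):
    # Hypothetical per-node query: a real QDL would send the query over
    # the network to the named SUT node and wait for its response.
    time.sleep(0.01)  # stand-in for network round trip + inference
    return f"{node_name}:{query}"


def dispatch_parallel(nodes, query):
    """Send the same query to every SUT node concurrently, recording each
    node's name alongside its observed latency in seconds."""

    def timed(node):
        start = time.monotonic()
        query_node(node, query)
        return node, time.monotonic() - start

    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        return dict(pool.map(timed, nodes))


latencies = dispatch_parallel(["node0", "node1", "node2"], "sample-0")
# Per-node latencies are still available for reporting, but the total
# wall time is roughly the max latency, not the sum as with serial queries.
```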