mlcommons / inference_policies

Issues related to MLPerf™ Inference policies, including rules and suggested changes
https://mlcommons.org/en/groups/inference/
Apache License 2.0

Encouraging results density #270

Open psyhtest opened 1 year ago

psyhtest commented 1 year ago

According to the current rules, a submitter may intentionally or unintentionally introduce sparsity in the results table.

For example, if they choose a different system name for each workload they submit on essentially the same system:

awesome_system_for_resnet50
awesome_system_for_retinanet
...

the results table will contain one row per workload. One issue is that such a submitter may gain an unfair advantage when their system is picked for audit, because only the single workload in that row is subject to audit.
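To make "sparsity" concrete, here is a minimal sketch that measures how full the results table is. It assumes a table with one row per system name and one column per workload; the `results_density` function and the merged name `awesome_system` are hypothetical and used only for illustration, not part of any MLPerf tooling.

```python
def results_density(results):
    """Return the fraction of (system, workload) cells that hold a result.

    `results` is an iterable of (system_name, workload) pairs.
    """
    filled = set(results)
    systems = {system for system, _ in filled}
    workloads = {workload for _, workload in filled}
    if not systems or not workloads:
        return 0.0
    return len(filled) / (len(systems) * len(workloads))


# One system name per workload: two rows, each with a single filled cell.
sparse = [
    ("awesome_system_for_resnet50", "resnet50"),
    ("awesome_system_for_retinanet", "retinanet"),
]

# Hypothetical merged name: one row with both cells filled.
dense = [
    ("awesome_system", "resnet50"),
    ("awesome_system", "retinanet"),
]

print(results_density(sparse))  # 0.5
print(results_density(dense))   # 1.0
```

With per-workload names, an audit of any one row covers only one workload; with the merged name, the same audit covers both.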

We should mould the rules to encourage results density by carefully defining what constitutes "essentially the same system" and what does not.

psyhtest commented 1 year ago

Notable examples in v3.0 and v3.1 include CPU-only submissions with names 1-node-2S-SPR-PyTorch-INT8 and 1-node-2S-SPR-PyTorch-MIX, which could be folded into the same row.

On the other hand, 1-node-2S-SPR-PyTorch-INT4+INT8 and 1-node-2S-SPRHBM-PyTorch-BF16 could be viewed as sufficiently different.
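As a rough sketch of how such folding might be expressed, the snippet below groups system names into rows by dropping a trailing precision tag when that tag is on a "foldable" list. The list `FOLDABLE_SUFFIXES` and the helpers `row_key` and `fold_rows` are assumptions made for illustration; which differences actually count as "essentially the same system" is exactly what the rules would need to define.

```python
from collections import defaultdict

# Assumption: these trailing tags do not make a system "sufficiently different".
FOLDABLE_SUFFIXES = {"INT8", "MIX"}


def row_key(system_name):
    """Return the row a system name would fall into under the assumed rule."""
    base, _, suffix = system_name.rpartition("-")
    return base if suffix in FOLDABLE_SUFFIXES else system_name


def fold_rows(system_names):
    """Group system names by row key, preserving input order within each row."""
    rows = defaultdict(list)
    for name in system_names:
        rows[row_key(name)].append(name)
    return dict(rows)


names = [
    "1-node-2S-SPR-PyTorch-INT8",
    "1-node-2S-SPR-PyTorch-MIX",
    "1-node-2S-SPR-PyTorch-INT4+INT8",
    "1-node-2S-SPRHBM-PyTorch-BF16",
]

for key, members in fold_rows(names).items():
    print(key, "<-", members)
```

Under this assumed rule the first two names share one row, while the INT4+INT8 and SPRHBM submissions each keep a row of their own.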

mrmhodak commented 1 year ago

A recommendation can be added to the Rules. Intel will make sure this does not happen again.