mlcommons / inference_policies

Issues related to MLPerf™ Inference policies, including rules and suggested changes
https://mlcommons.org/en/groups/inference/
Apache License 2.0

Update inference_rules.adoc #255

Closed by arjunsuresh 2 years ago

arjunsuresh commented 2 years ago

Giving an example of systems that may be exempt from audit.

github-actions[bot] commented 2 years ago

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

tjablin commented 2 years ago

This doesn't address changes in CPU between rounds.

arjunsuresh commented 2 years ago

Correct, @tjablin. That is intentional, which is why I used "may be". In my experience, the CPU becomes a factor when part of the model runs on the CPU, or when the inference throughput is so high that the CPU's data transfer time becomes significant. I believe these cases are hard to put in the rules; if a submitter asks for an exemption for a system with a different CPU and provides justification, the review committee can discuss it.

psyhtest commented 2 years ago

This does not address the quite natural software improvements between rounds that result in, say, 5-10% performance increases. Also, I don't see a problem with allowing higher accelerator counts if the scaling is shown to be nearly linear.

arjunsuresh commented 2 years ago

@psyhtest I'm not sure all submitters will agree to either of those. Maybe we can discuss them in the review meeting and, if everyone is fine with them, include them. My aim with this PR is to include only systems that won't be contentious.

arjunsuresh commented 2 years ago

@DilipSequeira Can you please share the views of Nvidia here?

DilipSequeira commented 2 years ago

@arjunsuresh I think this PR may be premature. It's a potentially difficult and contentious issue, and unless there's broad consensus (which I think there is on submission deadlines, but let's see what Krai and QC have to say), it's best not to try to drive such changes during the review period.

arjunsuresh commented 2 years ago

Thanks @DilipSequeira. This PR is meant for the next round of submissions. Krai and Qualcomm are asking for a relaxed set of "similarity conditions" for the systems (just the accelerator similarity). If Nvidia is also okay with that, then we can go with it.

DilipSequeira commented 2 years ago

Understood. This round, the argument needs to be made in the review committee, and from that example we might learn things that will improve this PR.

arjunsuresh commented 2 years ago

Sure. That'll be useful!