Request to remove the 5% time duration delta check between the ranging and testing modes

arjunsuresh commented 1 year ago

I assume the point of the time duration check is to ensure similar performance between the ranging and testing mode runs. But inference currently has 4 scenarios, and 3 of them have early stopping enabled, where a run stops automatically after 10 minutes if proper inputs are provided. In those scenarios the checker passes even if the performance varies by more than 5%. The result below is an example from the 3.0 inference results where this happened:

Ranging mode	Testing mode	Latency delta	Code
125.062243	116.239871	7.59%	result
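
For reference, the delta in the table can be reproduced with a short sketch. It assumes the delta is normalized by the testing mode value, which is the choice that matches the 7.59% shown; the function name is illustrative and is not taken from the checker's code.

```python
def relative_delta(ranging: float, testing: float) -> float:
    """Relative difference between the two runs.

    Assumption: the difference is divided by the testing mode value.
    This reproduces the table's figure but is not confirmed by the
    checker source.
    """
    return abs(ranging - testing) / testing


# Values copied from the table above.
delta = relative_delta(125.062243, 116.239871)
print(f"{delta:.2%}")  # 7.59%
```

Dividing by the ranging mode value instead would give roughly 7.05%, so the normalization choice matters when a result sits close to a threshold.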

Since the checker is effective for only one of the 4 scenarios (offline), and considering that it fails on systems with high run-to-run (r2r) variation, as happened here, I request that this check be removed from the coming rounds.

This is another result, where inference stopped for 15s during testing mode. That brought the power usage down by ~1%, yet the performance actually went up by 2.87%. If we assume this 15s stoppage did not happen, the time duration delta between the ranging and testing modes would be >5%.

arjunsuresh commented 1 year ago

This was discussed in the inference WG meeting and there were no objections. Waiting for the power WG.

arjunsuresh commented 1 year ago

If the above check is indeed required, then instead of the time duration delta, the performance per watt delta should be compared and checked to be within X%.
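
A minimal sketch of what such a check could look like, assuming each run reports a throughput figure and an average power in watts. All names here are hypothetical, the normalization by the testing mode value is an assumption, and the threshold is a required argument because X is not fixed in this proposal.

```python
def perf_per_watt(performance: float, avg_power_watts: float) -> float:
    """Performance per watt for a single run (hypothetical helper)."""
    if avg_power_watts <= 0:
        raise ValueError("average power must be positive")
    return performance / avg_power_watts


def perf_per_watt_delta(ranging_perf: float, ranging_power: float,
                        testing_perf: float, testing_power: float) -> float:
    """Relative perf/W difference between ranging and testing mode runs.

    Assumption: the difference is normalized by the testing mode value.
    """
    ranging = perf_per_watt(ranging_perf, ranging_power)
    testing = perf_per_watt(testing_perf, testing_power)
    return abs(ranging - testing) / testing


def check_perf_per_watt(ranging_perf: float, ranging_power: float,
                        testing_perf: float, testing_power: float,
                        max_delta: float) -> bool:
    """True if the perf/W delta is within max_delta (the X% above, as a fraction)."""
    delta = perf_per_watt_delta(ranging_perf, ranging_power,
                                testing_perf, testing_power)
    return delta <= max_delta


# Made-up numbers for illustration only, not taken from any submission.
print(check_perf_per_watt(100.0, 200.0, 102.0, 200.0, max_delta=0.05))  # True
```

Unlike the duration comparison, this ratio still moves when early stopping fixes both runs at about the same length, since it depends on the measured performance and power rather than on how long the run lasted.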

mlcommons / power-dev

Request to remove the 5% time duration delta check between the ranging and testing modes #301