Closed arjunsuresh closed 1 year ago
This was discussed in the inference WG meeting and there were no objections. Waiting for the power WG.
If the above check is indeed required, the performance-per-watt delta, rather than the time-duration delta, should be compared and checked to be within X%.
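A minimal sketch of what such a comparison could look like, assuming each run is summarized by a throughput and an average power figure. The function names and the `max_delta_fraction` parameter are hypothetical, not part of the existing checker, and the tolerance is deliberately left as a required argument because X has not been decided.

```python
def perf_per_watt(samples_per_second: float, avg_power_watts: float) -> float:
    """Performance per watt for one run (hypothetical summary metric)."""
    if avg_power_watts <= 0:
        raise ValueError("average power must be positive")
    return samples_per_second / avg_power_watts


def perf_per_watt_delta(ranging_ppw: float, testing_ppw: float) -> float:
    """Relative difference between the two runs, as a fraction of the ranging value."""
    if ranging_ppw <= 0:
        raise ValueError("ranging performance per watt must be positive")
    return abs(testing_ppw - ranging_ppw) / ranging_ppw


def within_tolerance(ranging_ppw: float, testing_ppw: float,
                     max_delta_fraction: float) -> bool:
    """True if the two runs agree to within the given fraction (the X% above)."""
    return perf_per_watt_delta(ranging_ppw, testing_ppw) <= max_delta_fraction
```

For example, `within_tolerance(perf_per_watt(1000.0, 200.0), perf_per_watt(1030.0, 198.0), max_delta_fraction=0.05)` compares a ranging run against a testing run with made-up numbers; the inputs here are purely illustrative.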
I guess the point of the time-duration check is to ensure similar performance between the ranging and testing mode runs. But as of now we have 4 scenarios in inference, and 3 of them have early stopping enabled, where the runs automatically stop after 10 minutes if proper inputs are provided. In this case, even if the performance varies by more than 5%, the checker passes, and the result below is an example from the 3.0 inference results where this has happened
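The effect can be illustrated with a simplified model, assuming a run either processes a fixed number of queries or is cut off at a 600 s limit. This is a sketch with hypothetical names and made-up throughputs, not the actual checker or LoadGen logic.

```python
from typing import Optional


def relative_delta(reference: float, value: float) -> float:
    """Relative difference of value with respect to reference."""
    return abs(value - reference) / reference


def run_duration_s(total_queries: int, queries_per_second: float,
                   stop_after_s: Optional[float] = None) -> float:
    """Duration of a run; capped at stop_after_s when a time limit applies."""
    natural = total_queries / queries_per_second
    return natural if stop_after_s is None else min(natural, stop_after_s)


# Made-up throughputs: the testing run is 8% slower than the ranging run.
ranging_qps, testing_qps = 100.0, 92.0
queries = 66_000

# Fixed query count: the slowdown shows up as a longer run.
fixed_delta = relative_delta(run_duration_s(queries, ranging_qps),
                             run_duration_s(queries, testing_qps))

# Time-limited runs: both stop at 600 s, so the duration delta is zero.
capped_delta = relative_delta(run_duration_s(queries, ranging_qps, 600.0),
                              run_duration_s(queries, testing_qps, 600.0))

print(f"fixed-count duration delta:  {fixed_delta:.2%}")
print(f"time-limited duration delta: {capped_delta:.2%}")
```

Under this model the 8% performance gap exceeds a 5% duration threshold only when the query count is fixed; once both runs are cut off at the same time limit, the duration comparison carries no information about performance.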
Since the checker is effective for only one of the 4 scenarios (offline), and considering the failures on systems with high run-to-run (r2r) variation, as happened here, I request that this test be removed from the coming rounds.
This is another result, where inference during testing mode stopped for 15 s. The stoppage effectively brought the power usage down by ~1%, yet the performance actually went up by 2.87%. Had this 15 s stoppage not happened, the time-duration delta between the ranging and testing modes would have been >5%.