phoronix-test-suite / test-profiles

A read-only Git copy of the OpenBenchmarking.org test profiles.

Modify the score calculation method of OpenCV #285

Open casualwind opened 1 year ago

casualwind commented 1 year ago

Problem: The current score in PTS/OpenCV-1.3.0 is the total wall time of all cases, but the workload is not fixed, so this score calculation is wrong.

OpenCV documents its steps and methods for performance testing here: https://github.com/opencv/opencv/wiki/HowToUsePerfTests#running-performance-tests-and-analyzing-the-results

• Description of the problem: The score of PTS/OpenCV-1.3.0 is a wall time, which is only meaningful for a fixed workload. However, the number of rounds each case runs is not fixed. After each round, if the coefficient of variation (standard deviation / mean) of the rounds run so far is greater than 3.0%, the case runs another round; otherwise it stops. The number of rounds is therefore determined by how stable the measurements are, which is not deterministic.

• Example of the problem: Here is the operation log of a case:

[ RUN ] stitchDatasets_affine.affine/5, where GetParam() = ("newspaper", "akaze")
[ PERFSTAT ] (samples=11 mean=923.62 median=916.97 min=886.05 stddev=27.25 (3.0%))
[ OK ] stitchDatasets_affine.affine/5 (10193 ms)

10193 ms is the total running time of the above case, which ran 11 rounds: it is the computing time for 11 rounds (923.62 ms per round on average) plus the initialization time. In the same experimental setting, the round number could be 12, or 15, or some other value, and the total time could then be much greater than 10193 ms. Because the round number depends on random run-to-run variation, the total code path differs each time, i.e., the workload differs.
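To make the effect concrete, here is a minimal sketch of the stopping rule as it is described in this issue. It is not OpenCV's actual implementation: the function name `rounds_until_stable`, the `min_rounds` parameter and the sample timings are all hypothetical and chosen only for illustration.

```python
import statistics


def rounds_until_stable(samples, cv_threshold=0.03, min_rounds=2):
    """Return how many rounds run before the coefficient of variation
    (stdev / mean) of the rounds so far drops to cv_threshold or below.

    Hypothetical model of the rule described in this issue; `samples`
    holds per-round times in ms, in the order they were measured.
    """
    for n in range(min_rounds, len(samples) + 1):
        window = samples[:n]
        cv = statistics.stdev(window) / statistics.mean(window)
        if cv <= cv_threshold:
            return n
    # Never stabilised within the samples we were given.
    return len(samples)


# Two made-up runs of the same case: same typical round time,
# but the second has one slow round early on.
quiet_run = [100.0, 101.0, 99.0, 100.0]
noisy_run = [100.0, 120.0, 100.0, 101.0, 100.0, 99.0, 100.0, 100.0, 100.0, 100.0]

print(rounds_until_stable(quiet_run), rounds_until_stable(noisy_run))
```

A single slow round makes the noisy run execute more rounds, so its total wall time grows even though the per-round time is essentially unchanged.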

Solution: The proposed score calculation still uses the tool functions provided by OpenCV; the score is the geometric mean of the mean per-round times of all cases that were run.
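The proposed score could be computed roughly as follows. This is only a sketch that parses `[ PERFSTAT ]` lines in the format shown in the log above; the regular expression and the helper names `mean_times` and `geomean_score` are assumptions for illustration, not part of OpenCV or the Phoronix Test Suite.

```python
import re
import statistics

# Matches lines such as:
# [ PERFSTAT ] (samples=11 mean=923.62 median=916.97 min=886.05 stddev=27.25 (3.0%))
PERFSTAT_RE = re.compile(r"\[\s*PERFSTAT\s*\]\s*\(samples=(\d+)\s+mean=([\d.]+)")


def mean_times(log_text):
    """Extract the mean per-round time (ms) of every case in the log."""
    return [float(m.group(2)) for m in PERFSTAT_RE.finditer(log_text)]


def geomean_score(log_text):
    """Geometric mean of the per-case mean times; lower is better."""
    # Raises statistics.StatisticsError if the log contains no PERFSTAT lines.
    return statistics.geometric_mean(mean_times(log_text))


sample_log = """\
[ RUN ] stitchDatasets_affine.affine/5, where GetParam() = ("newspaper", "akaze")
[ PERFSTAT ] (samples=11 mean=923.62 median=916.97 min=886.05 stddev=27.25 (3.0%))
[ OK ] stitchDatasets_affine.affine/5 (10193 ms)
"""

print(mean_times(sample_log))
```

Because each case contributes only its mean time, the score no longer depends on how many rounds the case happened to run.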