aai-institute / pyDVL

pyDVL is a library of stable implementations of algorithms for data valuation and influence function computation
https://pydvl.org
GNU Lesser General Public License v3.0

Create performance tests / benchmarking #20

Open AnesBenmerzoug opened 2 years ago

AnesBenmerzoug commented 2 years ago

In order to keep track of the performance of the library, we should implement performance tests.

It would also allow us to properly assess whether a change actually improves the performance or not.

A first approach would be to write tests with the pytest-benchmark extension, using proper fixtures to keep the test runs independent. This would make running the tests a bit slower, but it would let us easily benchmark certain parts of the code without writing much extra code. Of course, we should use pytest markers to prevent such tests from running by default.

A second approach would be to use airspeed-velocity, a tool for benchmarking Python packages over their lifetime. Runtime, memory consumption and even custom-computed values may be tracked. This would require us to write a bit of code and ideally to keep it in a separate repository, but it would allow us to track changes over time and to track multiple metrics more easily.
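As a rough sketch of the second approach: airspeed-velocity discovers benchmarks by method name prefix (`time_` for runtime, `peakmem_` for peak memory, `track_` for custom values), so a suite is just a plain class. The class name, parameters and workload below are hypothetical placeholders, not pyDVL code.

```python
import random


class ToyValuationSuite:
    """Hypothetical asv benchmark suite; all names are illustrative."""

    # asv runs every benchmark once per parameter value.
    params = [100, 1000]
    param_names = ["n_points"]

    def setup(self, n_points):
        # Called before each benchmark, so runs do not share state.
        rng = random.Random(0)
        self.values = [rng.random() for _ in range(n_points)]

    def _run(self):
        # Toy workload standing in for a valuation algorithm.
        # The leading underscore keeps asv from treating it as a benchmark.
        order = sorted(range(len(self.values)), key=lambda i: self.values[i])
        total = 0.0
        for rank, idx in enumerate(order, start=1):
            total += self.values[idx] / rank
        return total

    def time_run(self, n_points):
        # `time_` prefix: asv measures the runtime of this method.
        self._run()

    def peakmem_run(self, n_points):
        # `peakmem_` prefix: asv records peak memory while this runs.
        self._run()

    def track_result(self, n_points):
        # `track_` prefix: asv stores the returned number as a custom metric.
        return self._run()
```

Such a file would live in the benchmark directory configured in `asv.conf.json`, and `asv run` would then execute it against the selected commits.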

For example, Dask is using it and storing the code and results in this repository.

mdbenito commented 1 year ago

I think this is becoming more important: different stopping criteria, a joblib backend #276, an increasing number of algorithms...