datalad / git-annex

A non-official clone of git-annex established for DataLad purposes. No PRs will be merged, but it can be used to test prospective git-annex patches. Official git-annex repository: https://git.kitenet.net/index.cgi/git-annex.git/

somehow organize performance benchmarking of git-annex #3

Open yarikoptic opened 4 years ago

yarikoptic commented 4 years ago

We added some checks to the git-annex build workflow to spot some cases which could lead to slow(er) operation of the standalone build, but overall we do not have a good way to detect when git-annex slows down. We only see it reflected when we try a new snapshot build while sweeping through our datalad tests, and then it becomes an archaeological expedition to find which change brought the pessimization.

It would be nice to establish automated and consistent benchmarking of git-annex builds as pertinent to datalad.

Proposal:

WDYT @mih @kyleam @bpoldrack @jwodder

FYI @joeyh

joeyh commented 4 years ago

There is git-annex benchmark, which does a good job of benchmarking a git-annex command or sequence of commands you choose.

It can output to json or csv, which lets benchmarks be compared and a regression be flagged. At least in theory; I don't have anything doing that. Output of git-annex benchmark whereis --csv foo.csv:

Name,Mean,MeanLB,MeanUB,Stddev,StddevLB,StddevUB
whereis,5.076051109441738e-2,4.914089405704959e-2,5.4101610224266704e-2,4.234978773428508e-3,2.050397769413241e-3,6.8122220186021265e-3
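
To illustrate how two such CSV files might be compared to flag a regression, here is a minimal Python sketch. It is not an existing tool. The function names (`load_means`, `find_regressions`), the 10% tolerance, and the candidate row are all assumptions made up for the example; only the baseline row is the output quoted above.

```python
import csv
import io


def load_means(csv_text):
    """Map benchmark name -> mean run time in seconds from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["Name"]: float(row["Mean"]) for row in reader}


def find_regressions(baseline, candidate, tolerance=0.10):
    """Return {name: new_mean / old_mean} for benchmarks that got slower.

    A benchmark is reported only if its mean grew by more than `tolerance`
    (an arbitrary threshold chosen for this sketch).
    """
    slower = {}
    for name, old_mean in baseline.items():
        new_mean = candidate.get(name)
        if new_mean is None or old_mean <= 0:
            continue  # benchmark missing from the new run, or unusable baseline
        ratio = new_mean / old_mean
        if ratio > 1 + tolerance:
            slower[name] = ratio
    return slower


# Baseline: the row quoted in this comment.
BASELINE_CSV = """Name,Mean,MeanLB,MeanUB,Stddev,StddevLB,StddevUB
whereis,5.076051109441738e-2,4.914089405704959e-2,5.4101610224266704e-2,4.234978773428508e-3,2.050397769413241e-3,6.8122220186021265e-3
"""

# Candidate: HYPOTHETICAL numbers invented for illustration, not a real measurement.
CANDIDATE_CSV = """Name,Mean,MeanLB,MeanUB,Stddev,StddevLB,StddevUB
whereis,6.5e-2,6.3e-2,6.7e-2,4.0e-3,2.0e-3,6.0e-3
"""

regressions = find_regressions(load_means(BASELINE_CSV), load_means(CANDIDATE_CSV))
for name, ratio in sorted(regressions.items()):
    print(f"{name}: {ratio:.2f}x slower than baseline")
```

Comparing only the means is the crudest possible check. A more careful comparison could require the candidate's MeanLB to exceed the baseline's MeanUB before flagging anything, which would reduce false alarms from noisy runs.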

(But does not include startup speed in the benchmark currently. Could add an option to include that, or maybe better a mode that only benchmarks the startup speed.)
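
Until such an option exists, startup cost could be approximated from the outside by timing whole process invocations. The sketch below is only an illustration of that idea, not part of git-annex: `time_command` is a hypothetical helper, and the demo times the Python interpreter as a stand-in so it runs anywhere.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=5):
    """Run `argv` `runs` times; return wall-clock seconds per run.

    Each sample covers the whole child process, so it includes startup
    as well as the command's actual work.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return samples


# Stand-in command so the sketch runs without git-annex installed.
# For git-annex one might pass a cheap command instead, e.g.
# ["git-annex", "version"].
samples = time_command([sys.executable, "-c", "pass"], runs=3)
print(f"mean {statistics.mean(samples):.4f}s, min {min(samples):.4f}s")
```

Wall-clock timings like these are sensitive to system load and disk caches, so the minimum over several runs is often a steadier indicator than the mean.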

-- see shy jo