astropy / astropy-benchmarks

Benchmarks for the astropy project
https://spacetelescope.github.io/bench/astropy-benchmarks/
BSD 3-Clause "New" or "Revised" License

How we Visualize Benchmarks #117

Open nstarman opened 5 months ago

nstarman commented 5 months ago

Throwing out a suggestion. Rather than ASV, I recommend https://docs.codspeed.io/. codspeed is really good, free for open-source, and integrates deeply with pytest (see pytest-benchmark). With codspeed + the pytest ecosystem it's easy for us to have a specific benchmark test suite and also have benchmarked tests in the normal test suite.
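
For illustration, a pytest-benchmark style benchmark is just an ordinary test that takes the `benchmark` fixture; the sketch below is a minimal example under that assumption (the SkyCoord workload is purely illustrative, not an existing astropy benchmark), and pytest-codspeed advertises compatibility with the same fixture:

```python
# Minimal sketch of a pytest-benchmark style benchmark; the SkyCoord workload
# below is illustrative only, not an existing astropy benchmark.
import numpy as np

import astropy.units as u
from astropy.coordinates import SkyCoord


def test_skycoord_creation(benchmark):
    # The `benchmark` fixture (from pytest-benchmark) times the callable it is
    # given; pytest-codspeed advertises compatibility with this same fixture.
    ra = np.linspace(0, 360, 1000) * u.deg
    dec = np.linspace(-90, 90, 1000) * u.deg
    benchmark(SkyCoord, ra=ra, dec=dec, frame="icrs")
```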

Cadair commented 5 months ago

You lost me at "free for open source". I will generally push back against anything which isn't actually open source, especially if it's a hosted service at the whims of VC funding (I am assuming).

nstarman commented 5 months ago

you lost me at "free for open source"

Like many of GH's tools? Or RTD? Or Pre-commit.ci? 😁 I think most of our CI is in the category "free for open source".

matteobachetti commented 5 months ago

you lost me at "free for open source"

Like many of GH's tools? Or RTD? Or Pre-commit.ci? 😁 I think most of our CI is in the category "free for open source".

Yes, but also like Travis CI 😜 I see @Cadair's point; maybe I'm not as negative about it, but "free for open source" is not really something that screams trustworthiness these days.

nstarman commented 5 months ago

I'm just not sure where we would go from there. We should definitely vet our tools. But if we fundamentally don't trust anything in industry, then we shouldn't use GH Actions under the free-for-open-source tier (we do), RTD under the free-for-open-source tier, Pre-commit under the same, CircleCI, etc., or encourage new users to use GH's hosted development environments. Our code itself is one of the few things that isn't on a free-for-open-source tier. And CodSpeed is kind of like Codecov, in that it is a user-friendly layer on top of open-source tooling like pytest-benchmark. Now, maybe we vet it and don't like it, but that hasn't happened yet.

hamogu commented 5 months ago

We actually pay for Read the Docs because we think they need the money, plus we wanted something from them, but I now forget what. Details are here: https://github.com/astropy/astropy-project/issues/105

hamogu commented 5 months ago

Personally, I'm less worried about something like codspeed.io going away. It might be helpful, but it's not critical. If it goes away, we're just where we are now, without regular benchmarks. That's different from e.g. GitHub itself; while we could move to Bitbucket or GitLab, that would be a lot more disruptive. So the question is how much effort it would be to set up. If someone can set it up in 4 h and they start charging, we only lose 4 h of work. If it's a lot more effort to set up and maintain, then we should be more careful in vetting.

nstarman commented 5 months ago

https://docs.codspeed.io/#how-long-does-it-take-to-install

If you're already benchmarking your codebase, you can plug your existing benchmarks into CodSpeed in less than 5 minutes, since CodSpeed's benchmark API is compatible with the most popular benchmarking frameworks (pytest-benchmark, bencher, criterion.rs, vitest, tinybench, benchmark.js).
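
If that claim holds, an existing pytest-style benchmark would run under CodSpeed essentially unchanged, with only the pytest-codspeed plugin added in CI. A hedged sketch under that assumption (the Table workload and test name are illustrative):

```python
# Sketch assuming the pytest-codspeed plugin is installed
# (`pip install pytest-codspeed`); benchmarks are collected when pytest is
# invoked with the plugin's `--codspeed` option.
import numpy as np
import pytest

from astropy.table import Table


@pytest.mark.benchmark
def test_table_creation():
    # pytest-codspeed documents a `@pytest.mark.benchmark` marker that measures
    # the whole test body, in addition to supporting the pytest-benchmark
    # `benchmark` fixture shown earlier.
    Table({"a": np.arange(1000), "b": np.arange(1000)})
```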

pllim commented 5 months ago

I don't think this replaces the part where we run the benchmarks for every commit on main, etc.? It looks like it only does PR continuous integration. I would like a more technical blog post on the pros and cons, what projects have used this service to date, and what lessons were learned.

nstarman commented 5 months ago

https://news.ycombinator.com/item?id=36682012 https://blog.pydantic.dev/blog/2022/07/10/pydantic-v2-plan/#performance

codspeed was one of the tools used by Pydantic v2 to achieve their 17x speedup.