jump-dev / JuMP.jl

Modeling language for Mathematical Optimization (linear, mixed-integer, conic, semidefinite, nonlinear)
http://jump.dev/JuMP.jl/

continuous performance testing #42

Open mlubin opened 11 years ago

mlubin commented 11 years ago

Codespeed?

IainNZ commented 11 years ago

Would be nice! More to detect errant Julia changes than our own, perhaps

joehuchette commented 10 years ago

Could we incorporate this into the Travis builds somehow?

mlubin commented 10 years ago

Not really; Travis runs on shared VMs, so it would be hard to get consistent results.

mlubin commented 9 years ago

Ping @jrevels, JuMP would benefit a lot from this

jrevels commented 9 years ago

Literally was just talking to folks at Julia Central about CI perf testing today, going to be experimenting with writing webhooks to do this in the coming week(s). I'll definitely keep you posted.

pkofod commented 7 years ago

Pinging @mlubin @jrevels: did you ever figure out how to do this in a clever way?

mlubin commented 7 years ago

@pkofod, there was never any substantial effort put into this

odow commented 2 years ago

This came up on Gitter today, so I did some investigating:

I don't think we want to run the benchmarks on every commit; that'd get a bit painful. We probably just want a run on each commit to master, plus the ability to run on demand for a PR.
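Something like the following could drive the on-demand run. This is only a sketch using PkgBenchmark.jl, and the branch name is a placeholder:

```julia
# Sketch: compare a PR branch against master with PkgBenchmark.jl.
# Assumes a standard benchmark/benchmarks.jl suite; "my-pr-branch"
# is a placeholder ref.
using PkgBenchmark

# Benchmark both refs and flag regressions/improvements.
results = judge("JuMP", "my-pr-branch", "master")

# Write a human-readable report that CI could post to the PR.
export_markdown("judgement.md", results)
```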

For the benchmarks, we probably want:

This could all sit in a new repository (JuMPBenchmarks.jl) and push to a GitHub page with plots like

So in summary, I think we have a lot of what is needed. It just needs some plumbing to put together. There is also the question of dedicated hardware for this. But I can probably be persuaded to get a small PC to sit in the corner of my office as a space-heater during winter.
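For concreteness, here is a sketch of what one entry in such a suite might look like with BenchmarkTools.jl; the model below is a stand-in for whatever problems the real suite would cover:

```julia
# Sketch of one benchmark suite entry using BenchmarkTools.jl.
using BenchmarkTools, JuMP

const SUITE = BenchmarkGroup()
SUITE["model"] = BenchmarkGroup()

# A stand-in problem: build (but don't solve) a simple LP of size n.
function build_model(n)
    model = Model()
    @variable(model, x[1:n] >= 0)
    @constraint(model, sum(x) <= n)
    @objective(model, Max, sum(i * x[i] for i in 1:n))
    return model
end

# Time model construction only; no solver is attached.
SUITE["model"]["build_1000"] = @benchmarkable build_model(1000)
```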

ericphanson commented 2 years ago

https://github.com/jump-dev/Convex.jl/tree/master/benchmark

This may have bitrotted, unfortunately; we used to run the benchmarks in CI, but I never remembered to look at the results (hidden in the Travis logs at the time), so I removed it (or perhaps just didn't replace it when we switched to GitHub Actions). It also slowed down CI a lot. That code was based off @tkf's, and he likely has better versions these days (maybe https://github.com/JuliaFolds/Transducers.jl/tree/master/benchmark).

So I also agree with not running it per commit. It could be useful for it to be runnable on demand in a PR, like nanosoldier for Julia Base, so that if you suspect a change could cause a regression you can trigger it.

It might be useful to look at how SciML does their benchmarks, too: https://github.com/SciML/SciMLBenchmarks.jl. It also looks like there's some “juliaecosystem” hardware; perhaps JuMP can get access too: https://github.com/SciML/SciMLBenchmarks.jl/blob/bda2ca650fd4fbd25e3bcdc0ddb4b43535bcd7b6/.buildkite/run_benchmark.yml#L50 (I've got no idea, though).

tkf commented 2 years ago

FYI, there's a setting to run the benchmarks only when a label is applied. Take a look at the setting with `if: contains(github.event.pull_request.labels.*.name, 'run benchmark')` in https://github.com/tkf/BenchmarkCI.jl#create-a-workflow-file-required (thanks to @johnnychen94; ref https://github.com/tkf/BenchmarkCI.jl/pull/65).
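For reference, the Julia side of that workflow step is tiny; a sketch based on the BenchmarkCI.jl README:

```julia
# The workflow's benchmark step boils down to these two calls
# (see the BenchmarkCI.jl README for the full workflow file).
using BenchmarkCI

BenchmarkCI.judge()      # benchmark the PR head against the base branch
BenchmarkCI.postjudge()  # post the comparison as a PR comment
```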

As for my recent approach, I've mostly moved to setting up the benchmark suite as a smoke test (e.g., taking only one sample) and invoking it from the test suite. It's not actually continuous performance testing, but rather a way to avoid breaking the benchmark code. I still find it useful, though.
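A sketch of that smoke-test pattern, assuming a benchmark/benchmarks.jl file that defines SUITE:

```julia
# Sketch: run each benchmark once from the tests, so CI catches benchmarks
# that no longer execute, without spending time on real measurements.
# Assumes benchmark/benchmarks.jl defines a BenchmarkGroup named SUITE.
using Test, BenchmarkTools

include(joinpath(@__DIR__, "..", "benchmark", "benchmarks.jl"))

@testset "benchmark suite still runs" begin
    run(SUITE; samples = 1, evals = 1, verbose = false)
    @test true  # reaching this point means no benchmark threw
end
```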

odow commented 2 years ago

Ideally, once JuMP 1.0 is released, we won't have to worry about breaking any benchmarks. (And if we do, that's an indication that we've done something wrong!)

There are some Julia servers for the GPU and SciML stuff that host jobs on Buildkite (we use one for running the SCS GPU tests). Their benchmarks are pretty heavy, though. I'm envisaging much smaller runs, so we don't need a beefy machine.

odow commented 2 years ago

Made progress here: https://github.com/jump-dev/benchmarks

Dashboard is available at https://jump.dev/benchmarks/