voxpupuli / json-schema

Ruby JSON Schema Validator
MIT License
1.54k stars 243 forks source link

Performance regression tests #218

Open treppo opened 9 years ago

treppo commented 9 years ago

I think this project would benefit from one or multiple tests, that catch performance regressions. At work we use this gem in a few services and plan to use it even more. Unfortunately we have to use certain commits and cannot just use the current head, because we discovered, that often pull requests introduce significant performance issues.

My idea was that in order to be able to catch a regression, we would need a test, that compares revisions with benchmarks for representative operations – validations with non–trivial schemata with current master and the commits in the pull request.

What do you all think?

iainbeeston commented 9 years ago

Yeah it's been discussed, but now that we're using minitest there's no reason why we couldn't add some performance tests using minitest/benchmark (that would be my preference rather than comparing the performance between different branches).

Can you share the commits that cause performance problems for you? That would be helpful in resolving your issues, and it might also help us to compose tests to stop it happening again.

iainbeeston commented 9 years ago

Also, any examples of operations that perform badly in the latest release would be helpful.

treppo commented 9 years ago

Using Minitest::Benchmark sounds like the right tool for this kind of testing. I suggested the comparison of branches as a means to catch performance regressions. The only alternative approach I can imagine would be to store the benchmark results in a file for reference by later tests. Then it might become a problem to get comparable results as these depend on the performance of the system, that is running the tests. I think it is important to compare two versions (e.g. master vs. branch) at the same time on the same machine to get a meaningful result and catch regressions.

We ran into performance problems with schema validations in our live systems and as a result of performance analysis I opened a pull request for this commit. A week later we wanted to update the gem but discovered that performance had decreased again. We didn't really investigate further, but fixed the bundle to the commit instead (it was days before Christmas, so no time as usual).

I plan on spending some time to implement this benchmark suite and ideally also some logic (most likely a rake task) that allows to dig deeper into performance profiling – using stackprof and flamegraphs – so that it gets easier to find slow code.

iainbeeston commented 9 years ago

That sounds great. If I get a chance I'll look into what changes might have caused a slowdown (but unfortunately my time is limited at the moment)

treppo commented 9 years ago

I got fired from my company yesterday for the most stupid reasons (i can't recommend to work for wimdu.com), so I will not be able to continue on this at work. I will try to continue on this once I find another job.

iainbeeston commented 9 years ago

Sorry to hear that.

If you can give a rough description of any schemas that performed badly, we can probably make a start on reproducing it and diagnosing the problem for you.

iainbeeston commented 9 years ago

A couple more things that might help with benchmarking:

  1. https://github.com/Muscula/json-schema-benchmark/
  2. https://github.com/evanphx/benchmark-ips