rust-lang / rustc-perf

Website for graphing performance of rustc
https://perf.rust-lang.org

Better benchmarks for match checking? #792

Open Nadrieril opened 3 years ago

Nadrieril commented 3 years ago

The benchmark suite includes two benchmarks that stress match checking (and I assume match codegen too): unicode_normalization and match-stress-enum. The problem with these two is that they really are not representative of matches in the wild: they both have a billion branches and a single non-nested pattern per branch, whereas real-life matches have only a couple of branches and nested patterns. When I'm working on the exhaustiveness algorithm I would love to have a benchmark that has more representative examples, but I wouldn't know how to gather that. Like maybe every ? generates a match and that's the most common form? Or maybe derives or other macros? I was wondering if anyone here would know how to collect those kinds of tests, or would have some time to look into that.
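[Editor's note: a minimal, hypothetical sketch of the contrast described above — the existing stress tests' shape (many flat arms, one non-nested pattern each) versus the "real-life" shape (a couple of arms with nested patterns, like the match that `?` desugars to). All names are invented for illustration.]

```rust
// Stress-test shape: one flat arm per variant, no nesting
// (imagine hundreds of variants instead of four).
enum Big { V0, V1, V2, V3 }

fn stress_shape(b: &Big) -> u32 {
    match b {
        Big::V0 => 0,
        Big::V1 => 1,
        Big::V2 => 2,
        Big::V3 => 3,
    }
}

// Real-world shape: few arms, nested patterns, roughly what
// `?` or a derive might generate.
fn real_shape(r: Result<Option<u32>, String>) -> u32 {
    match r {
        Ok(Some(n)) => n,
        Ok(None) => 0,
        Err(_) => u32::MAX,
    }
}

fn main() {
    assert_eq!(stress_shape(&Big::V2), 2);
    assert_eq!(real_shape(Ok(Some(5))), 5);
}
```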

Mark-Simulacrum commented 3 years ago

It might work well to add a test that (for example) has 1,000 small matches in separate functions, or something like that. Ultimately, I suspect the reason the current stress test is like this is that I would (knowing essentially nothing about the match code) expect the code's runtime to be some function of the number of patterns -- so the more patterns, the hotter we make the interesting bits, and the easier it is to detect an accidental exponential 2^n loop introduction or something like that.
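[Editor's note: a hypothetical sketch, not an actual rustc-perf benchmark — one way to get "1,000 small matches in separate functions" without writing them by hand is to stamp them out with a macro. Names are invented for illustration.]

```rust
// Each generated function contains one small, realistic match.
macro_rules! small_matches {
    ($($name:ident),*) => {
        $(fn $name(x: Option<u8>) -> u8 {
            match x {
                Some(0) => 0,
                Some(n) => n,
                None => 255,
            }
        })*
    };
}

// A real benchmark would list many more names here
// (or generate the file with a script).
small_matches!(m0, m1, m2, m3);

fn main() {
    assert_eq!(m0(Some(7)), 7);
    assert_eq!(m3(None), 255);
}
```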

That said I'm quite open to modifying/adding new benchmarks, though I don't really know what those would necessarily look like :)

nnethercote commented 3 years ago

> That said I'm quite open to modifying/adding new benchmarks, though I don't really know what those would necessary look like :)

I suggest basing benchmarks on real-world code wherever possible.

Nadrieril commented 3 years ago

As an interesting data point, https://github.com/rust-lang/rust/pull/79394 enables a feature that slows down match exhaustiveness noticeably. The resulting perf run therefore gives a good idea of which benchmarks use the algorithm the most. Knowing the algorithm, I'd say this slowdown is representative of how many times it actually goes through the "main loop", with type/pattern depth having a quadratic impact. That's good because I wanted to benchmark deeper patterns. So it looks like the syn and style-servo benches could be milked for representative deep matches. If anyone finds the time... :angel: EDIT: did it myself :)
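[Editor's note: a hypothetical illustration of what "deeper patterns" means here — a single match arm nesting several levels of constructors, which is the depth the comment says has quadratic impact on the exhaustiveness algorithm. The types are invented for the example.]

```rust
// Three levels of nesting in one pattern.
enum Inner { X(u32), Y }
enum Mid { P(Inner, Inner), Q }
enum Outer { M(Mid), N }

fn deep(o: &Outer) -> u32 {
    match o {
        // The algorithm must recurse through Outer::M, Mid::P,
        // and Inner::X to check this arm.
        Outer::M(Mid::P(Inner::X(n), _)) => *n,
        Outer::M(Mid::P(Inner::Y, _)) => 1,
        Outer::M(Mid::Q) => 2,
        Outer::N => 3,
    }
}

fn main() {
    assert_eq!(deep(&Outer::M(Mid::P(Inner::X(9), Inner::Y))), 9);
    assert_eq!(deep(&Outer::N), 3);
}
```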

dralley commented 3 months ago

HTML predefined entity resolution is a decent representative benchmark for this, I think. With this one function enabled, compilation takes 30 seconds, compared to 3 seconds without it.

https://github.com/tafia/quick-xml/issues/763

hanna-kruppe commented 3 months ago

The match in question is here: https://github.com/tafia/quick-xml/blob/5629ac75164793aeb8ab3bf8bb15d13ea9b29e00/src/escape.rs#L335

It's certainly a real-world example of the "huge match statement takes forever to compile" class of problems, but I'm not sure it's exactly what this issue is looking for. It's huge, but its exhaustiveness is trivially amenable to fast heuristics (it matches on &[u8] and has a top-level catch-all arm); the real problem is LLVM passes being quadratic or worse on large CFGs, more so than anything in rustc. For a check build of quick-xml v0.32.0, -Zself-profile attributes ca. 5% (14.25ms) to check_match, which I assume does match checking. Of course, it might still be interesting for guarding against regressions or for reasons unrelated to match checking (e.g., MIR passes that can also struggle with huge CFGs).
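[Editor's note: a miniature, hypothetical version of the shape described above — a match on `&[u8]` with a top-level catch-all, like quick-xml's entity table but with a handful of arms instead of thousands. The function name is invented for illustration.]

```rust
// Byte-string literal patterns match &[u8] slices on stable Rust.
fn resolve_entity(name: &[u8]) -> Option<&'static str> {
    match name {
        b"amp" => Some("&"),
        b"lt" => Some("<"),
        b"gt" => Some(">"),
        b"quot" => Some("\""),
        // The catch-all makes exhaustiveness checking trivial; the
        // compile-time cost in the real crate comes from LLVM and MIR
        // passes handling the resulting huge CFG, not from rustc's
        // match checking.
        _ => None,
    }
}

fn main() {
    assert_eq!(resolve_entity(b"amp"), Some("&"));
    assert_eq!(resolve_entity(b"bogus"), None);
}
```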

Kobzol commented 3 months ago

@Nadrieril What do you think? Maybe we could add more cases to match-stress, if this situation is meaningfully different from what is already there.

Nadrieril commented 3 months ago

Actually yeah, the two stress tests we have are on patterns with 0 depth (fieldless enum variants and chars, respectively), whereas this one is on slices, so it tests the actual insides of the algorithm. Not what I was asking for, but interesting to have; for match lowering too, I'd guess.

Nadrieril commented 3 months ago

If we change match-stress, I'd request we also reduce the size of the enum match. It's ridiculously big and hits weird edge cases that aren't reproducible from machine to machine. A few hundred cases should be more than enough.

Kobzol commented 3 months ago

We will probably do a large benchmark update in the upcoming months, since it has almost been three years since the last update, and the current benchmarks generate a lot of linting errors. We can do that as a part of that update.