Confirm new JFR profiling is adding minimal overhead

mikemccand commented 3 years ago

I have largely trusted that enabling JFR does not harm performance much, as long as you don't configure overly aggressive sampling.

But I've seen internal evidence (Amazon product search, closed source) that JFR might hurt red-line QPS non-trivially, even when using the same .jfc configuration as we use here in luceneutil.

Let's run the benchmark with JFR enabled and disabled and confirm the overhead is not too bad (<= 1%?).
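
One way to make the "<= 1%?" check concrete is to compare median QPS across a few runs with and without JFR. The following is only a sketch: the class name, method names, and the QPS numbers in `main` are placeholders, not luceneutil code or real measurements.

```java
import java.util.Arrays;

public class JfrOverhead {
    // Median of a set of QPS measurements (less sensitive to one noisy run than the mean).
    public static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // Relative QPS loss, in percent, of the JFR-enabled runs versus the baseline runs.
    // A positive result means the JFR runs were slower.
    public static double overheadPercent(double[] baselineQps, double[] jfrQps) {
        double base = median(baselineQps);
        return (base - median(jfrQps)) / base * 100.0;
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only, NOT real benchmark results.
        double[] withoutJfr = {1000.0, 1010.0, 990.0};
        double[] withJfr = {985.0, 995.0, 990.0};
        double overhead = overheadPercent(withoutJfr, withJfr);
        System.out.printf("JFR overhead: %.2f%% (budget: 1%%)%n", overhead);
    }
}
```

Run-to-run noise in red-line QPS can easily exceed 1%, so a comparison like this probably needs several iterations per configuration before the difference means anything.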

rmuir commented 3 years ago

Agreed this is important to do.

The stuff in the profiling.jfc from the apache repo was geared at testing, where you opt in with the tests.profile parameter. But I have a 2-core machine so I wanted to keep overhead low :) Still, it may not be appropriate for benchmarks.

So, for example, I tweaked jdk.ExecutionSample and jdk.NativeMethodSample from their defaults of 10/20ms to 1ms. I also spent some time (maybe not enough?) trying to see if finer-grained sampling, such as microseconds, was possible :)
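
For experimenting with different sampling periods without editing the .jfc, the same two events can also be configured through the `jdk.jfr` API. This is a sketch, not what luceneutil does: `SamplingPeriods` and `newSamplingRecording` are hypothetical names, and the 20 ms in `main` is just an example of a more relaxed value.

```java
import java.time.Duration;
import jdk.jfr.Recording;

public class SamplingPeriods {
    // Creates a (not yet started) recording that samples Java and native
    // method execution at the given period. Only these two events are enabled.
    public static Recording newSamplingRecording(Duration period) {
        Recording recording = new Recording();
        recording.enable("jdk.ExecutionSample").withPeriod(period);
        recording.enable("jdk.NativeMethodSample").withPeriod(period);
        return recording;
    }

    public static void main(String[] args) {
        // 20 ms is an illustrative, more relaxed period than the 1 ms mentioned above.
        try (Recording recording = newSamplingRecording(Duration.ofMillis(20))) {
            // Print the effective settings, e.g. the "#period" entry for each event.
            recording.getSettings().forEach(
                (key, value) -> System.out.println(key + " = " + value));
        }
    }
}
```

Comparing QPS at a few different periods would show whether the 1ms setting is what costs us, if anything does.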

If there are perf issues, maybe try to see if these periods can be relaxed. For the benchmark, I think we could improve the profiling quality in other ways instead?

In particular, it would help if we could separate the profiling output "by-query". Maybe I am confused, but if I look at the booleanquery profile, I see vectors stuff? Or is that some up-front shenanigans in the benchmark engine unrelated to the test https://github.com/mikemccand/luceneutil/issues/77#issuecomment-758752817 ? Either way, it's confusing :)

msokolov commented 3 years ago

If the "vectors stuff" is related to VectorDictionary, then yeah it is setup code. We could improve the situation by using a better (on-disk) format like FST for that, then we wouldn't have to load into RAM during setup. If it's not VectorDictionary, then it's a bug though. Also, is there a way to enable sampling only the query threads? Or only analyzing samples from those threads?
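
On "only analyzing samples from those threads": the recording can be filtered after the fact with the `jdk.jfr.consumer` API. A rough sketch follows, assuming the query threads can be recognized by a name prefix; the prefix and the class and method names are hypothetical, and I have not checked how the benchmark actually names its threads.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordedFrame;
import jdk.jfr.consumer.RecordedStackTrace;
import jdk.jfr.consumer.RecordedThread;
import jdk.jfr.consumer.RecordingFile;

public class QueryThreadSamples {
    // True if the sampled thread looks like a query thread (the prefix is an assumption).
    public static boolean isQueryThread(String javaThreadName, String prefix) {
        return javaThreadName != null && javaThreadName.startsWith(prefix);
    }

    // Counts the top stack frame of each jdk.ExecutionSample event whose
    // sampled thread name starts with the given prefix.
    public static Map<String, Integer> topFrames(Path jfrFile, String prefix) throws IOException {
        Map<String, Integer> counts = new HashMap<>();
        // readAllEvents loads the whole file into memory; fine for a sketch.
        for (RecordedEvent event : RecordingFile.readAllEvents(jfrFile)) {
            if (!event.getEventType().getName().equals("jdk.ExecutionSample")) {
                continue;
            }
            RecordedThread thread = event.getThread("sampledThread");
            if (thread == null || !isQueryThread(thread.getJavaName(), prefix)) {
                continue;
            }
            RecordedStackTrace stack = event.getStackTrace();
            if (stack == null || stack.getFrames().isEmpty()) {
                continue;
            }
            RecordedFrame top = stack.getFrames().get(0);
            String key = top.getMethod().getType().getName() + "." + top.getMethod().getName();
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java QueryThreadSamples <recording.jfr> <thread-name-prefix>
        topFrames(Path.of(args[0]), args[1])
            .forEach((frame, count) -> System.out.println(count + "\t" + frame));
    }
}
```

This only changes what gets analyzed, not what gets sampled, so it would not reduce any overhead from the recording itself.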

mikemccand commented 3 years ago

We might also be able to only turn on JFR after all the initialization is done so we only see the "long running queries" type of hot spots.
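
A minimal sketch of that idea, using the `jdk.jfr.Recording` API to start recording only once setup has finished. The `DelayedJfr` class, the `recordAfterInit` method, and the two `Runnable` phases are hypothetical stand-ins for the benchmark's real initialization and query loop, and this sketch enables only `jdk.ExecutionSample` rather than loading our .jfc.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import jdk.jfr.Recording;

public class DelayedJfr {
    // Runs init without JFR, then records only the measured phase and
    // writes the recording to the given file.
    public static Path recordAfterInit(Runnable init, Runnable measured, Path out) {
        init.run(); // setup work (index opening, dictionary loading, ...) is not recorded

        try (Recording recording = new Recording()) {
            // Assumption: only execution samples are wanted here. A .jfc file could
            // instead be loaded with jdk.jfr.Configuration.create(Path) and passed
            // to the Recording constructor.
            recording.enable("jdk.ExecutionSample");
            recording.start();
            measured.run();
            recording.stop();
            recording.dump(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        Path out = Path.of("after-init.jfr");
        recordAfterInit(
            () -> System.out.println("initializing (not recorded)"),
            () -> System.out.println("running queries (recorded)"),
            out);
        System.out.println("wrote " + out.toAbsolutePath());
    }
}
```

If the JVM is started without `-XX:StartFlightRecording`, nothing is recorded until `recording.start()` is called, so setup code like the VectorDictionary loading should not show up in the profile.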

mikemccand / luceneutil

Confirm new JFR profiling is adding minimal overhead #105