Document how to run benchmarks

christian-esken commented 7 years ago

I wanted to run the Caffeine benchmarks to check the latest changes in the triava cache. I have some feedback about the Caffeine documentation. I did not know where you would like to put it, so I did not open a PR (yet). I can do that if I get directions on how to proceed.

The required gradle version should be documented, as it is hard to find the correct one. I ran with gradle3, then gradle2, then the newest gradle4.1 without success. I was successful with gradle 4.0.2.
A bit of information on how to run the benchmarks would be helpful. I have not found any in the README or the BENCHMARK page. I figured out that I have to run something like this:

gradle-4.0.2/bin/gradle '-PincludePattern=.*' jmh

Two questions:

I understood that -PincludePattern is the "include" parameter for the jmh gradle plugin. Right?
Is there a way to run the test for a specific set of Caches, e.g. Caffeine + triava ? (edit: It looks like the tested caches are a parameter in GetPutBenchmark and EvictionBenchmark)

ben-manes commented 7 years ago

This is the magic of the Gradle wrapper: you don't need to figure it out yourself. As the JavaDoc for GetPutBenchmark states,

./gradlew jmh -PincludePattern=GetPutBenchmark

The plugin settings are in jmh.gradle. It states that you can use benchmarkParameters as...

Benchmark parameters: Separated by '&' for parameter types, and ',' for multiple values

I'm usually lazy and edit the Java file locally when exploring, though.
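
To make the format concrete, here is a rough sketch of how a value such as cacheType=TCache_Lru,TCache_Lfu breaks down. This is not the plugin's code; the class and method names (BenchmarkParametersSketch, parse) are made up for illustration, and it only assumes the '&' and ',' separators described above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; the real handling lives in the jmh Gradle setup.
final class BenchmarkParametersSketch {

  // Splits "a=1,2&b=x" into {a=[1, 2], b=[x]}, keeping the parameter order.
  public static Map<String, List<String>> parse(String spec) {
    Map<String, List<String>> params = new LinkedHashMap<>();
    for (String entry : spec.split("&")) {        // '&' separates parameter types
      int eq = entry.indexOf('=');
      if (eq <= 0) {
        throw new IllegalArgumentException("Expected name=value[,value...]: " + entry);
      }
      String name = entry.substring(0, eq);
      List<String> values = new ArrayList<>();
      for (String value : entry.substring(eq + 1).split(",")) {  // ',' separates values
        values.add(value);
      }
      params.put(name, values);
    }
    return params;
  }

  public static void main(String[] args) {
    // prints {cacheType=[TCache_Lru, TCache_Lfu]}
    System.out.println(parse("cacheType=TCache_Lru,TCache_Lfu"));
  }
}
```

Each name corresponds to one parameter field in the benchmark class, and each listed value results in a separate run of that benchmark.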

Some of the benchmarks are for checking for performance issues or for experimentation, rather than for outside use. Since benchmarks are often wrong, misunderstood, manipulated, etc., I prefer minimalism in those that I describe, so as not to overstate (or imply) anything or misreport by overcomplicating. That's why you'll see just the two, as I can explain them fairly, even though there are many other details that might be of interest. Those can be more difficult to explain without fear of misunderstandings, so they are best left to development rather than confusing users. For users, showing whether it is fast enough to meet their performance budgets is the primary goal.

christian-esken commented 7 years ago

I am mainly interested in core benchmarks that can be run for most caches, like GetPutBenchmark. Once you know the name of the class (GetPutBenchmark) it gets easier. I am not yet familiar enough with the gradle jmh plugin to spot that benchmarkParameters is applied to @Param. Thanks for clarifying, this makes it very easy. What do you think of mentioning the gradle 4.0 requirement?

ben-manes commented 7 years ago

I don't think it needs to be mentioned if you use the wrapper. That installs the version for the project defined in gradle/wrapper/gradle-wrapper.properties. Since that is the idiomatic way to run Gradle, I think it's better since upgrades are fairly regular. Of course sometimes that breaks plugins and I don't always check jmh when updating, so if I broke that let me know and I'll dig into a fix.

christian-esken commented 7 years ago

I tried to run it within IntelliJ IDEA via "use default gradle wrapper" and that failed. Only when I pointed IntelliJ to a local gradle 4.0.2 did I get it working. It's not really worth keeping this issue open for that, though I personally tend to run gradle on server machines directly instead of the wrapper. I will be offline for a couple of days, and if you like you can close this issue. I will try to reproduce the gradle trouble when I get back, and will come back to you if I can reproduce it reliably.

ben-manes commented 7 years ago

Okay, I use Eclipse but try to keep things working for IntelliJ folks. I run tasks at the command line. Since I don't check jmh or IntelliJ regularly, if it's busted then I can look into a fix. I guess we can close, but not a big deal either way.

Also feel free to fork the benchmarks to bootstrap your own suite, e.g. as cache2k and collisions did.

ben-manes commented 7 years ago

I did a quick test on the train and saw that I broke the benchmarkParams at some point. That should be fixed now, sorry. I gave it a non-scientific run so the numbers are rough as I am on a laptop, using battery, with other stuff running.

./gradlew jmh -PincludePattern=GetPutBenchmark \
    -PbenchmarkParameters=cacheType=TCache_Lru,TCache_Lfu

Benchmark                                (cacheType)   Mode  Cnt         Score         Error  Units
GetPutBenchmark.read_only                 TCache_Lru  thrpt   10  30780838.627 ± 4638449.427  ops/s
GetPutBenchmark.read_only                 TCache_Lfu  thrpt   10  27041235.665 ±  853354.312  ops/s
GetPutBenchmark.readwrite                 TCache_Lru  thrpt   10  23615279.925 ± 3828354.093  ops/s
GetPutBenchmark.readwrite:readwrite_get   TCache_Lru  thrpt   10  19492469.512 ± 3034092.917  ops/s
GetPutBenchmark.readwrite:readwrite_put   TCache_Lru  thrpt   10   4122810.413 ±  877353.105  ops/s
GetPutBenchmark.readwrite                 TCache_Lfu  thrpt   10  28617429.557 ± 2738865.133  ops/s
GetPutBenchmark.readwrite:readwrite_get   TCache_Lfu  thrpt   10  24185446.587 ± 2056324.626  ops/s
GetPutBenchmark.readwrite:readwrite_put   TCache_Lfu  thrpt   10   4431982.970 ±  727174.167  ops/s
GetPutBenchmark.write_only                TCache_Lru  thrpt   10  17466518.407 ± 5662496.764  ops/s
GetPutBenchmark.write_only                TCache_Lfu  thrpt   10  19071051.318 ± 2420626.652  ops/s

Given that your approach is a high watermark with async evaluation, in the past I remember thinking the numbers looked like TCache might be thrashing on a CAS field somewhere. The score looked like that type of contention (lock contention is similar but different numbers). I never dug in to find where, but using a profiler should find it readily. JMH's stacktrace might work, or I have CacheProfiler as a little testing tool for attaching YourKit to. I'm sure if you fix that it will perform closer to a raw ConcurrentHashMap as expected.

christian-esken commented 7 years ago

You ran it with triava v1.0.3, right? There have been 2 performance issues that I fixed in v1.0.4. Still, there are issues in the GET that I cannot explain, even using YourKit. I will try to see what your CacheProfiler is about. Performance should be close to ConcurrentHashMap at least for GETs, and I will benchmark that. Also, I want to plug in another backing Map; for some fun and insight I thought I might try ConcurrentLinkedHashMap.

ben-manes commented 7 years ago

v1.0.4, so maybe jmh profilers will help. My ad hoc profiler is just for YourKit, but jmh's are more low level.

CLHM decorates CHM so it might not be too insightful, unfortunately.

christian-esken commented 7 years ago

FYI: Today I tested plenty in YourKit with "CPU, sampling" and "CPU, tracing", but the results were misleading. Then I simply scanned the code, and you are right. I am thrashing on an AtomicLong in the cache statistics. I moved to LongAdder, boosting the read_only benchmark from 30 million ops/s to 80 million ops/s. That is going in the right direction. :-)
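
The change is roughly of this shape. This is a minimal sketch, not the actual triava code: the class and method names (HitCounterSketch, recordHitAtomic, recordHitAdder, run) are invented for illustration. It only shows that both counter types give the same total while differing in how concurrent updates are handled; it is not a benchmark.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical statistics holder; names are illustrative only.
final class HitCounterSketch {
  // Contended: every thread updates the same memory location atomically.
  private final AtomicLong atomicCounter = new AtomicLong();
  // Striped: under contention, threads update separate cells; sum() adds them up on read.
  private final LongAdder adderCounter = new LongAdder();

  void recordHitAtomic() { atomicCounter.incrementAndGet(); }
  void recordHitAdder() { adderCounter.increment(); }
  long atomicHits() { return atomicCounter.get(); }
  long adderHits() { return adderCounter.sum(); }

  // Records hits from several threads into both counters and waits for completion.
  static HitCounterSketch run(int threads, int hitsPerThread) {
    HitCounterSketch stats = new HitCounterSketch();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int t = 0; t < threads; t++) {
      pool.execute(() -> {
        for (int i = 0; i < hitsPerThread; i++) {
          stats.recordHitAtomic();
          stats.recordHitAdder();
        }
      });
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
        throw new IllegalStateException("workers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return stats;
  }

  public static void main(String[] args) {
    HitCounterSketch stats = run(4, 100_000);
    // prints 400000 400000
    System.out.println(stats.atomicHits() + " " + stats.adderHits());
  }
}
```

The trade-off is that LongAdder.sum() is not an atomic snapshot while updates are in flight, which is usually acceptable for statistics that are written often and read rarely.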

ben-manes commented 7 years ago

Nice! I didn't intend to enable statistics, and it wasn't obvious when I wrote the code that they were on by default. We should probably disable them in these benchmarks to be fair/comparable to others. Glad the default behavior is better now!

ben-manes / caffeine

Document how to run benchmarks #185