Add benchmarking tests (close #300)

mscwilson commented 2 years ago

For issue #300.

Java comes with a benchmarking tool called Java Microbenchmarking Harness, JMH. It allows relatively accurate measuring of parts of the code. I integrated it as part of the main project build.gradle, and was able to measure the average time for Tracker.track(PageView) to complete, in the local 0.12.0-alpha.0 version. It takes a very consistent 1 microsecond.

That's not the time taken to send the event to the mocked HttpClientAdapter - that happens asynchronously, so Futures would be needed to measure that. Since we have had no reports of performance problems, that is almost certainly overkill.

I then wanted to measure the speed of Tracker.track(PageView) in the previous version 0.10.1 and 0.11.0. This has been problematic in multiple ways.

Problem 1: to interrupt the tracker's threads at the end of each test, jmh calls tracker.close(). This is a new method added in #298. Therefore to test previous versions, this has to be changed to emitter.close().

Problem 2: I had set up JMH manually as a gradle task, defining the tracker as a dependency (jmhImplementation 'com.snowplowanalytics:snowplow-java-tracker:0.11.0'. This dependency was required - removing it caused jmh to fail because it couldn't resolve any of the Snowplow components. I added this print statement at the beginning of the test run.

@Setup
public void doSetUp() {
    System.out.println("Using tracker version: " + tracker.getTrackerVersion());
}

It was always tracker version: 0.12.0-alpha.0 i.e. my local version currently being worked on. Presumably the Java compiler is somehow using both versions - local and from mavenCentral - but preferentially using the local one. So it was impossible to measure an old version.

Therefore, I created a brand new Gradle project inside the examples folder, and set up JMH again, this time using the popular JMH gradle plugin. I can now use mavenLocal and mavenCentral to resolve the snowplow-java-tracker version of my choice. Btw, Gradle strongly recommends never using mavenLocal.

Problem 3: When running the benchmarking on version 0.10.1 or 0.11.0, it gradually gets slower and slower for each test, until I have to manually cancel the run after a few minutes wait. So the first few iterations will take e.g. 500 ns, which will increase to e.g. 0.4 ms before I stop it.

Perhaps it's not worth measuring the old versions after all? We could just start benchmarking from this version onwards.

mscwilson commented 2 years ago

Currently, the JMH additions are present in both the main project and the separate project. I will delete one set once we decide what to do.

mscwilson commented 2 years ago

I've improved the JMH code so that it recreates and then close()s the tracker/emitter for every iteration. This has fixed a large part of the memory problems (Problem 3). I can now run the benchmarking test successfully for version 0.11.0, with intermittent "out of heap space" errors for version 0.10.1.

Edit: to be clear, all the following tests were performed on my MacBook Pro M1.

The benchmarking test measures the average time to complete `Tracker.track(pageView)`, over 500ms of testing. The results:	Tracker version	Time (mean), ns/op	S.D.
0.10.1	390.138	32.9	60
0.11.0	375.125	27.2	100
0.12.0-alpha.0	972.818	28.2	100

There were only 60 tests for version 0.10.1 because two of the forks were dropped due to "out of heap space".

I also tested using JMH's singleShotTime setting, which is supposed to measure the time for a single operation. The error was huge - some standard deviation results bigger than the mean. It claimed that a single attempt in version 0.12.0-alpha.0 took ~200 +- 70 microseconds. This obviously is at odds with the averageTime ns/operation result of ~1 microsecond.

I also tested manually using `System.nanoTime()`. The code is not uploaded here. I adapted the simple-console demo to create an ArrayList containing two of each of the six event types. I then looped over the list 200 times, calling `Tracker.track()` on each event (24000 events tracked in total), and measured how long it took. NB these are ms, not ns.	Tracker version	Time (mean), ms	S.D.
0.10.1	69.2	19.2	5
0.11.0	49.2	12.8	5
0.12.0-alpha.0	134.6	18.7	5

I don't think it's worth reading anything into the specific times measured: we can only compare the relative differences between the tracker versions. It's clear that the new version of the tracker is twice as slow as previous versions.

AlexBenny commented 2 years ago

These numbers are interesting. The fact that the new version is slower is quite expected. Being slower 2 or 3 times might not be a problem, after all we are talking of a microsecond. Mostly depends on how/where this tracker is used. I think much of the slowness is due to the work needed to instance a thread for each event. If we want to reduce this time we can adopt the same buffer approach before the tracker threadpool. In practice the track method puts the event in a LinkedBlockingQueue and the tracker threadpool consumes the queue. It should make the time spent on the track method almost the same as the v0.11.

@paulboocock, thoughts on this?

mscwilson commented 2 years ago

We've decided to merge this in now, and revisit the discussion of a producer/consumer queue for the Tracker later.

snowplow / snowplow-java-tracker

Add benchmarking tests (close #300) #301