Add benchmark for 3.0.0-rc1

Added benchmark test for the latest version 3.0.0-rc1.

The introduction of the new `optimizeForShortStreams` parameter in `BeginSessionFor` made it impossible (or at least highly non-trivial) to compare 3.0.0-rc1 directly with the earlier 3.0.0-alphaX or 2.x versions within the same BenchmarkDotNet project.

Benchmark results for the latest version, 3.0.0-rc1, can be cross-referenced manually with the results for previous versions.

Results

`3.0.0-rc1`

3.0.0-rc1 results

Full report

Previous versions

Major differences in 3.0.0-rc1 compared to previous versions:

- 3.0.0-rc1 lets you explicitly optimise for the desired mode of operation via the `optimizeForShortStreams` parameter
- 3.0.0-alpha24 was optimised for long streams
- 3.0.0-alpha21 was optimised for short streams

Note that 3.0.0-rc1 and 3.0.0-alpha24 include some other optimisations, e.g. a reduced number of network calls to the EventStore server. These optimisations, however, may not directly affect the time spent loading a stream or the memory allocated, so they are unlikely to show up in the BenchmarkDotNet results.

The results can be interpreted using the following categories:

Short event streams, `optimizeForShortStreams=true`

In this category 3.0.0-rc1 performance is very close to 3.0.0-alpha24: memory allocation numbers are very close, and time spent is slightly better in 3.0.0-rc1 (1.3ms) than in 3.0.0-alpha24 (1.5ms).

Short event streams, `optimizeForShortStreams=false`

In this category 3.0.0-rc1 performs better than 3.0.0-alpha21: e.g. for stream length 20, time spent drops from 2.0ms to around 1.2ms, and memory allocations drop from 270KB to 210KB.

Long event streams, `optimizeForShortStreams=true`

In this category 3.0.0-rc1 performance is very close to 3.0.0-alpha21, both in terms of time spent and memory allocations.

It is worth noting that, as with 3.0.0-alpha21, loading long streams triggers a large number of GC Gen1 and Gen2 collections, so memory pressure is high.

Long event streams, `optimizeForShortStreams=false`

In this category 3.0.0-rc1 performance is very close to 3.0.0-alpha24: memory allocations are practically identical, while the time spent loading a stream is slightly longer, at 813ms for 3.0.0-rc1 vs 756ms for 3.0.0-alpha24.
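
Since the comparison across versions is manual, it may help to express the figures quoted in the four categories above as relative changes. The snippet below is only an illustrative sketch: the helper names (`relative_change`, `summarise`) and the `FIGURES` table are made up for this example and are not part of the benchmark project, and the only inputs are the numbers already quoted in this description (with "around 1.2ms" taken as exactly 1.2).

```python
def relative_change(before, after):
    """Return the change from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100.0


# (label, baseline value, 3.0.0-rc1 value, unit)
# Figures are copied from the category notes above; nothing here is newly measured.
FIGURES = [
    ("short streams, time vs 3.0.0-alpha24", 1.5, 1.3, "ms"),
    ("short streams (length 20), time vs 3.0.0-alpha21", 2.0, 1.2, "ms"),
    ("short streams (length 20), allocations vs 3.0.0-alpha21", 270.0, 210.0, "KB"),
    ("long streams, time vs 3.0.0-alpha24", 756.0, 813.0, "ms"),
]


def summarise(figures):
    """Return (label, percentage change rounded to one decimal) for each row."""
    return [
        (label, round(relative_change(before, after), 1))
        for label, before, after, _unit in figures
    ]


if __name__ == "__main__":
    for label, pct in summarise(FIGURES):
        # Negative values mean 3.0.0-rc1 is lower (better) than the baseline.
        print(f"{label}: {pct:+.1f}%")
```

Negative percentages mean 3.0.0-rc1 spends less time or allocates less than the version it is compared with; the single positive row corresponds to the slightly longer load time for long streams.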

Comparing versions prior to `3.0.0-rc1`

3.0.0-alpha24 results

Full report for previous versions

BullOak / BullOak.Repositories.EventStore