My own implementation of the 1BRC (One Billion Row Challenge), but in Rust.
Tests are run under WSL2 (so using Hyper-V).
Hardware:
Before running anything: there is a huge performance difference depending on whether the data is in the memory cache or not. So for every result, I will separate the first (cold-cache) run from the real benchmark.
NOTE: I am cleaning the memory cache using the following command as root:
sync; echo 1 > /proc/sys/vm/drop_caches
A few utilities to deal with this can be found under the `utils` directory.
I do not have GraalVM, so I will only run the best pure-Java implementation.
# First run:
❯ \time -f 'Elapsed=%E' ./calculate_average_yourwass.sh > /dev/null
Elapsed=0:03.14
# After a few runs (so the cache is now warm):
❯ \time -f 'Elapsed=%E' ./calculate_average_yourwass.sh > /dev/null
Elapsed=0:00.93
# With a RAM disk, performance is constant but a little slower than with a pure memory cache.
❯ \time -f 'Elapsed=%E' ./calculate_average_yourwass.sh > /dev/null
Elapsed=0:01.07
# First run:
❯ \time -f 'Elapsed=%E' target/release/one-brc-rs > /dev/null
Inside main total duration: 2.781221427s
Elapsed=0:02.96
# After a few runs (so the cache is now warm):
❯ \time -f 'Elapsed=%E' target/release/one-brc-rs > /dev/null
Inside main total duration: 705.814095ms
Elapsed=0:00.81
# With a RAM disk, performance is constant but a little slower than with a pure memory cache.
❯ \time -f 'Elapsed=%E' target/release/one-brc-rs > /dev/null
Inside main total duration: 892.661661ms
Elapsed=0:00.97
What works:

- Compute `fxhash::hash64(...)` only once and use a raw `hashbrown::HashTable`.
- `.munmap` the data and let the OS release the memory.

What does not work:
- Swapping the allocator: tests have been done with mimalloc and jemalloc, but with `strace` we can check that no allocations are done after the file is memory mapped.
- Replacing the `rayon` crate: it does the job pretty well.
- `mmap` flags.

Ideas:

- `HashMap`?