gunnarmorling / 1brc

1️⃣🐝🏎️ The One Billion Row Challenge -- A fun exploration of how quickly 1B rows from a text file can be aggregated with Java
https://www.morling.dev/blog/one-billion-row-challenge/
Apache License 2.0

improve hard disk access locality, another 8% #591

Closed abeobk closed 7 months ago

abeobk commented 7 months ago

Last version:

Benchmark 1: ./calculate_average_abeobk.sh
  Time (mean ± σ):     611.7 ms ±  10.8 ms    [User: 1.6 ms, System: 6.3 ms]
  Range (min … max):   597.6 ms … 638.7 ms    10 runs

This version:

Benchmark 1: ./calculate_average_abeobk.sh
  Time (mean ± σ):     561.8 ms ±   6.6 ms    [User: 2.1 ms, System: 1.7 ms]
  Range (min … max):   544.4 ms … 569.8 ms    10 runs

This version runs about 8% faster on my machine (561.8 ms vs. 611.7 ms mean), but I'm not sure it will show the same gain on the evaluation machine. It processes the data in small chunks of 4 MB each, which improves hard disk access locality.
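The chunking idea can be illustrated with a minimal sketch. The PR description only states that data is processed in 4 MB chunks. The rest is an assumption made for illustration: the shared chunk counter, the newline alignment, the use of a plain `byte[]` instead of a memory-mapped file with Unsafe, and the names `ChunkedScan`, `alignToRowStart`, and `countRows` are all hypothetical. The sketch only counts rows rather than aggregating temperatures. It is not the code from `CalculateAverage_abeobk.java`.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration, not the submission's actual code.
final class ChunkedScan {
    // 4 MB chunk size, as stated in the PR description.
    static final int CHUNK_SIZE = 4 * 1024 * 1024;

    // Returns the offset of the first row that starts at or after 'from'
    // (a row starts at offset 0 or right after a '\n'), or data.length if none.
    static int alignToRowStart(byte[] data, long from) {
        if (from <= 0) return 0;
        if (from >= data.length) return data.length;
        int i = (int) (from - 1);
        while (i < data.length && data[i] != '\n') i++;
        return Math.min(i + 1, data.length);
    }

    // Counts rows. Workers claim consecutive chunks from a shared counter, so
    // the regions being read at any moment lie close together in the input.
    static long countRows(byte[] data, int chunkSize, int threads) {
        int chunkCount = (int) (((long) data.length + chunkSize - 1) / chunkSize);
        AtomicInteger nextChunk = new AtomicInteger();
        AtomicLong rows = new AtomicLong();
        Runnable worker = () -> {
            int chunk;
            while ((chunk = nextChunk.getAndIncrement()) < chunkCount) {
                long start = (long) chunk * chunkSize;
                // A chunk owns every row that starts inside it, so both ends
                // are moved forward to the next row boundary.
                int begin = alignToRowStart(data, start);
                int end = alignToRowStart(data, Math.min(start + chunkSize, data.length));
                long local = 0;
                for (int i = begin; i < end; i++) {
                    if (data[i] == '\n') local++;
                }
                // A final row without a trailing newline still counts.
                if (end == data.length && end > begin && data[end - 1] != '\n') local++;
                rows.addAndGet(local);
            }
        };
        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            pool[t] = new Thread(worker);
            pool[t].start();
        }
        for (Thread t : pool) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return rows.get();
    }

    public static void main(String[] args) {
        byte[] sample = "Hamburg;12.0\nBulawayo;8.9\nPalembang;38.8\n"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(countRows(sample, CHUNK_SIZE, 4));
    }
}
```

The contrast is with giving each thread one large contiguous segment, where the threads read from regions that are far apart. Whether a shared counter is how the submission hands out chunks is not stated in the PR.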

gunnarmorling commented 7 months ago

Nice, a tad faster:

Benchmark 1: timeout -v 300 ./calculate_average_abeobk.sh 2>&1
  Time (mean ± σ):      2.146 s ±  0.005 s    [User: 0.002 s, System: 0.004 s]
  Range (min … max):    2.136 s …  2.155 s    10 runs

Summary
  abeobk: trimmed mean 2.1465672555650004, raw times 2.13622836594,2.1448431559400003,2.14909065494,2.1479434199400003,2.14607579794,2.1437134649400003,2.1473469029400003,2.14786259294,2.1456620549400003,2.1554339599400003
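The trimmed mean in the summary is consistent with dropping the fastest and the slowest of the ten runs and averaging the remaining eight. A minimal sketch of that arithmetic follows. The class and method names are hypothetical, and this is not the evaluation script itself.

```java
import java.util.Arrays;

// Hypothetical helper that reproduces the trimmed mean from the raw times.
final class TrimmedMean {
    // Drops one minimum and one maximum sample, then averages the rest.
    static double trimmedMean(double[] times) {
        if (times.length < 3) {
            throw new IllegalArgumentException("need at least 3 samples");
        }
        double[] sorted = times.clone();
        Arrays.sort(sorted);
        double sum = 0;
        for (int i = 1; i < sorted.length - 1; i++) {
            sum += sorted[i];
        }
        return sum / (sorted.length - 2);
    }

    public static void main(String[] args) {
        // Raw times copied from the summary line above.
        double[] raw = {
            2.13622836594, 2.1448431559400003, 2.14909065494, 2.1479434199400003,
            2.14607579794, 2.1437134649400003, 2.1473469029400003, 2.14786259294,
            2.1456620549400003, 2.1554339599400003
        };
        System.out.println(trimmedMean(raw));
    }
}
```

With the raw times listed, the discarded samples are 2.136 s (fastest) and 2.155 s (slowest), and the mean of the other eight rounds to the 00:02.146 shown on the leaderboard.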

Leaderboard

| # | Result (m:s.ms) | Implementation     | JDK | Submitter     | Notes     |
|---|-----------------|--------------------|-----|---------------|-----------|
|   | 00:02.146 | [link](https://github.com/gunnarmorling/1brc/blob/main/src/main/java/dev/morling/onebrc/CalculateAverage_abeobk.java)| 21.0.2-graal | [Van Phu DO](https://github.com/abeobk) | GraalVM native binary, uses Unsafe |