gunnarmorling / 1brc

1️⃣🐝🏎️ The One Billion Row Challenge -- A fun exploration of how quickly 1B rows from a text file can be aggregated with Java
https://www.morling.dev/blog/one-billion-row-challenge/
Apache License 2.0
6.08k stars 1.83k forks source link

last perf papercut #613

Closed iziamos closed 7 months ago

iziamos commented 7 months ago

Check List:

This was quite fun overall, learned alot. I'm just pushing this last tweak because I suspect its the lowest of the hanging fruit.

Farewell!

gunnarmorling commented 7 months ago

Doesn't make a significant difference on the eval env: 00:04.167. More importantly though, the entry doesn't finish for the 10K key set test (see create_measurements3.sh and results here). Could you take a look at this one?

iziamos commented 7 months ago

Hmm worth a look, I suspect its the hashcode I'd switched to, for context I wanted something where only the position in the string mattered to see if it would be trivially vectorisable. It turned out that vectors were to slow though :(

iziamos commented 7 months ago

@gunnarmorling I've updated the hashcode. Probably passes the 10k within a min (4 sec on my machine). But is probably slower overall :)

iziamos commented 7 months ago

Just did a squash, probably not worth merging if it performs worse on the eval box but just to demonstrate that its a hash collision issue on the 10k. :)

gunnarmorling commented 7 months ago

@iziamos, could you clarify: should this pass for 10K keys now, or is there still a potential hash collision? Correctness is the most important thing, so even if it is slower than what's on main right now but actually is more correct, I'd still merge it. Thx for clarifying.

iziamos commented 7 months ago

@gunnarmorling I believe it was always correct, this version is significantly faster in the 10k keys (4 sec vs sth like 4 min), but slower (on my machine) for the main challenge. I just made the change to demonstrate :)

In general I've found that the best way to evaluate solutions for hash collisions isn't to stress test the key count but to replace the hashcode implementation with one that returns a constant e.g. 1, and see that it eventually returns correct results.

gunnarmorling commented 7 months ago

Gotcha. So it's minimally slower for the standard key set (00:04.230), but indeed much faster for the 10K key set (00:06.712), so I'm gonna merge it, as this entry is amongst the top 20 or so for which I occassionally publish the 10K leaderboard and it DNF-ed there so far. Thx!