buybackoff / 1brc

1BRC in .NET among fastest on Linux
https://hotforknowledge.com/2024/01/13/1brc-in-dotnet-among-fastest-on-linux-my-optimization-journey/
MIT License

Add implementation from Cameron Aavik #9

Closed CameronAavik closed 8 months ago

CameronAavik commented 8 months ago

As per your recommendation on Twitter, I'm opening an issue here to ask whether you can add my solution to your .NET results comparison table: https://github.com/CameronAavik/1brc

I'm using Windows, but according to the total process time my solution seems to be ~0.87s faster (2.22s -> 1.35s), and according to a Stopwatch inside the program it is ~0.11s faster (1.45s -> 1.34s).

buybackoff commented 8 months ago

I will update the results soon with your numbers. It's an impressive result, but more details later.

I accidentally deleted the measurement.txt file when running evaluate2.sh from the Java repo. After regenerating the file, I get a segmentation fault from your binary, so there must be a bug. I tried regenerating it twice.

Fixing that bug could change the timing slightly, even if it only means adding a single if somewhere, so I cannot proceed for now.

CameronAavik commented 8 months ago

Someone else reported a similar issue on my repo, and I suspect it was because they were running it on a file smaller than 16MB. Is the segfault happening on a 1 billion row input? I have now tried running against 10 different 1 billion row inputs, and none of them cause a segfault.

CameronAavik commented 8 months ago

FYI, I just noticed I had a bug in how I was printing measurements and have updated the repo; it shouldn't have any impact on performance. Unfortunately, I can't reproduce the segfault. I also built my project on WSL Ubuntu in case it was a Linux issue and have not been able to reproduce it there either, even with an input that uses 10,000 cities from CreateMeasurements3. On a side note, on WSL Ubuntu it somehow runs in 0.8s on my machine, much faster than my Windows timings!

buybackoff commented 8 months ago

It just does not like symlinks.

CameronAavik commented 8 months ago

I have now pushed a fix that makes it work with symlinks. It turns out that new FileInfo(filePath).Length for a symlink returns the size of the symlink itself, not the size of the file it links to.
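To illustrate the symlink behaviour described in this comment, here is a minimal sketch of one way to get the size of the linked file rather than the link. It is not the fix that was pushed to the repo, which this thread does not show. The class and method names are hypothetical, and it assumes .NET 6 or later, where FileSystemInfo.ResolveLinkTarget is available.

```csharp
using System.IO;

static class SymlinkAwareLength
{
    // Hypothetical helper (not from the repo): returns the size in bytes of the
    // file that `path` ultimately refers to, following symlinks if present.
    public static long GetTargetLength(string path)
    {
        var info = new FileInfo(path);

        // ResolveLinkTarget (.NET 6+) returns null when `path` is not a link.
        // returnFinalTarget: true follows a chain of links to the final target.
        var target = info.ResolveLinkTarget(returnFinalTarget: true);

        // For a link, read the length from the resolved target; otherwise the
        // FileInfo for `path` already describes the real file.
        return target is FileInfo resolved ? resolved.Length : info.Length;
    }
}
```

Another option, also only a sketch, is to open the file first with File.OpenHandle and query RandomAccess.GetLength on the resulting handle, since opening a path follows the link to the real file.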

CameronAavik commented 8 months ago

I've added one more small improvement to reuse the file handle; it now runs in 0.68s on WSL Ubuntu for me.

buybackoff commented 8 months ago

That was great work from you!

It was so interesting to see completely different code come so close in perf 😄