CameronAavik closed this issue 8 months ago
I will update the results soon with your numbers. It's an impressive result, but more details later.
I accidentally deleted the measurement.txt file when running evaluate2.sh from the Java repo. After regenerating the file, I get a segmentation fault from your binary, so there must be a bug somewhere. I tried regenerating the file twice.
Fixing that bug could change the timing slightly, even if you only add a single if somewhere, so I cannot proceed for now.
Someone else reported a similar issue on my repo, and I suspect it was because they were running it on a file smaller than 16MB. Is the segfault happening on a 1 billion row input? I have now tried running against 10 different 1 billion row inputs and none of them cause a segfault.
FYI, I just noticed I had a bug in how I was printing measurements and have updated the repo. It shouldn't have any impact on performance. Unfortunately I still can't reproduce the segfault. I also built my project on WSL Ubuntu in case it was a Linux issue and could not reproduce it there either, even with an input that uses 10,000 cities from CreateMeasurements3. On a side note, on WSL Ubuntu it somehow runs in 0.8s on my machine, much faster than my Windows timings!
It just does not like symlinks.
I have pushed a fix now that makes it work with symlinks. It turns out that new FileInfo(filePath).Length for a symlink returns the size of the symlink and not the size of the file it links to.
I have also added one more small improvement to reuse the file handle; it now runs in 0.68s on WSL Ubuntu for me.
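For anyone curious about the symlink behaviour described above, here is a minimal sketch of the same filesystem distinction. It is written in Python rather than C# and is not the fix that was pushed to the repo; the helper names `sizes` and `demo` and the file names are made up for illustration. It assumes a platform where creating symlinks is permitted (e.g. Linux or WSL).

```python
import os
import tempfile


def sizes(link_path):
    """Return (size of the link entry itself, size of the file it resolves to)."""
    link_size = os.lstat(link_path).st_size    # lstat does NOT follow the symlink
    target_size = os.stat(link_path).st_size   # stat follows the symlink to the real file
    return link_size, target_size


def demo():
    """Create a 100,000-byte file plus a symlink to it and compare both sizes."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target.txt")  # hypothetical file name
        link = os.path.join(tmp, "link.txt")      # hypothetical link name
        with open(target, "wb") as f:
            f.write(b"x" * 100_000)
        os.symlink(target, link)
        return sizes(link)


if __name__ == "__main__":
    link_size, target_size = demo()
    print("link entry size:", link_size)
    print("target file size:", target_size)
```

If a program sizes its buffers or memory mapping from the link's own size, it ends up with far too few bytes, which would be consistent with the crash reported here. Resolving the link first, or taking the length from an already opened file handle, avoids that.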
That was great work from you!
It was so interesting to see completely different code come so close in perf 😄
As per your recommendation on Twitter, I'm opening an issue here to see if you can add my solution to your .NET results comparison table: https://github.com/CameronAavik/1brc
I'm using Windows, but according to the total process time my solution seems to be ~0.87s faster (2.22s -> 1.35s), and according to Stopwatch inside the program it is ~0.11s faster (1.45s -> 1.34s).
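The two numbers above measure different things: total process time includes startup and teardown, while an in-program stopwatch only covers the timed region. Here is a rough sketch of that difference in Python, not the .NET measurement setup used in this thread. The `CHILD` script and the `measure` helper are hypothetical stand-ins.

```python
import subprocess
import sys
import time

# Child program: times only its own work, like a Stopwatch inside the program.
CHILD = (
    "import time\n"
    "t0 = time.perf_counter()\n"
    "sum(range(200_000))\n"
    "print(time.perf_counter() - t0)\n"
)


def measure():
    """Return (total wall time seen by the parent, time reported by the child)."""
    start = time.perf_counter()
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=True,
    )
    total = time.perf_counter() - start    # includes process startup and exit
    internal = float(result.stdout.strip())  # only the timed region in the child
    return total, internal


if __name__ == "__main__":
    total, internal = measure()
    print(f"total process time: {total:.4f}s")
    print(f"in-program time:    {internal:.4f}s")
```

The gap between the two values is work done outside the timed region, so a solution can look much closer on the in-program number than on total process time.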