gunnarmorling / 1brc

1️⃣🐝🏎️ The One Billion Row Challenge -- A fun exploration of how quickly 1B rows from a text file can be aggregated with Java
https://www.morling.dev/blog/one-billion-row-challenge/
Apache License 2.0
6k stars 1.8k forks

Gamlerhart Last Update: Disabling GC #636

Closed gamlerhart closed 6 months ago

gamlerhart commented 7 months ago

I'll stop here now =).

Thanks again for the learning experience. I've summarized my learnings here: https://www.gamlor.info/posts-output/2024-01-12-one-billion-row-challenge/en/

gunnarmorling commented 7 months ago

Runs out of heap space for the 10K keyset on 32 cores (see create_measurements3.sh):

Terminating due to java.lang.OutOfMemoryError: Java heap space

gamlerhart commented 7 months ago

Aha... I tried to simulate a 32/64-thread machine. It should pass the 10K set now =).

gunnarmorling commented 7 months ago

Have you tried with 10K keys? It shows differences:

+ timeout -v 300 ./test.sh gamlerhart measurements_10K_1B.txt
Validating calculate_average_gamlerhart.sh -- measurements_10K_1B.txt
WARNING: Using incubator modules: jdk.incubator.vector
227c227
< Burl;-20.6;9.0;48.2
---
> Burl;-20.6;8.6;37.9
376c376
< Edmo;-17.5;10.7;40.4
---
> Edmo;-17.5;10.6;40.4
700c700
< Lian;-17.4;12.6;44.2
---
> Lian;-15.3;13.1;44.2
871c871
< Noga;-17.8;18.6;49.1
---
> Noga;-10.1;19.9;49.1
1306c1306
< Y;-14.2;15.4;44.6
---
> Y;-14.2;15.4;44.7
1315c1315
< Yogy;-16.5;14.8;47.4
---
> Yogy;-16.5;14.7;47.4
1326c1326
< Zagr;-15.7;14.2;45.8
---
> Zagr;-15.7;14.0;43.2
1769,1770c1769,1770
< afang;-19.7;10.3;38.1
< afangd;-17.2;13.5;48.1
---
> afang;-19.7;10.3;41.5
> afangd;-12.7;15.5;48.1
1890c1890
< akaJ;-18.3;14.4;47.8
---
> akaJ;-13.3;15.6;47.8
2011c2011
< am;-9.5;19.1;47.8
---
> am;-9.5;19.1;50.1
2021c2021
< amPūnchLaurelTavernySui;-19.8;12.6;50.1
---
> amPūnchLaurelTavernySui;-19.8;11.0;41.1
2129c2129
< anSlidellBayan HotGreenburghShibat;-17.4;15.5;47.5
---
> anSlidellBayan HotGreenburghShibat;-17.4;15.3;47.5
2688c2688
< b;-14.8;13.5;42.4
---
> b;-16.5;13.5;43.4
2791c2791
< bovyanColcapirhuaBagneuxClondalkinNorth BethesdaAzemmourSuttonGbadoliteWest Bridgfor;-16.5;18.7;49.7
---
> bovyanColcapirhuaBagneuxClondalkinNorth BethesdaAzemmourSuttonGbadoliteWest Bridgfor;-11.2;19.2;49.7
3128c3128
< dmaRagusaLebanonSomer;-19.7;16.8;47.3
---
> dmaRagusaLebanonSomer;-12.9;17.1;47.3
3311c3311
< eMasrakhBiarritzR;-14.9;13.4;43.5
---
> eMasrakhBiarritzR;-14.9;13.6;43.5
3324c3324
< ePes;-21.6;9.2;46.2
---
> ePes;-21.6;8.9;36.5
3334c3334
< eRangsitBeledw;-14.7;15.2;43.2
---
> eRangsitBeledw;-14.7;15.3;43.2
3752c3752
< esJa;-12.7;17.0;46.7
---
> esJa;-12.7;16.8;45.5
4175c4175
< hHng;-19.4;13.6;50.6
---
> hHng;-19.4;12.6;41.5
4455c4455
< i;-13.7;16.6;46.7
---
> i;-14.4;16.6;46.7
4838c4838
< inUm;-18.0;15.0;44.0
---
> inUm;-18.0;14.7;41.9
5177c5177
< k;-9.3;19.1;47.7
---
> k;-9.3;19.1;49.9
5247c5247
< kbyC;-19.0;12.8;49.9
---
> kbyC;-19.0;12.2;45.5
5338c5338
< l;-9.3;19.8;49.4
---
> l;-9.3;19.8;49.8
5462c5462
< landsIsioloAnte;-21.4;8.8;41.2
---
> landsIsioloAnte;-21.4;8.7;41.2
5718,5719c5718,5719
< lpr;-20.1;14.0;49.8
< lpu;-14.2;17.2;49.0
---
> lpr;-20.1;11.6;40.2
> lpu;-14.2;17.0;45.7
6090,6091c6090,6091
< nMach;-17.4;14.6;45.4
< nMachengAkur;-14.3;18.6;49.5
---
> nMach;-17.4;14.6;48.7
> nMachengAkur;-9.1;19.8;49.5
6168c6168
< na;-13.9;15.8;44.4
---
> na;-14.0;15.8;44.4
6181c6181
< naSo;-22.9;8.5;39.0
---
> naSo;-22.9;8.3;37.5
6295c6295
< neSoissonsMirganjAïn;-17.0;14.5;48.8
---
> neSoissonsMirganjAïn;-14.8;14.7;48.8
6884c6884
< olSarıyerSumqayıtŌts;-20.4;9.9;41.3
---
> olSarıyerSumqayıtŌts;-20.4;9.6;41.3
7083,7084c7083,7084
< org;-9.6;20.0;48.5
< orge;-13.3;19.3;49.1
---
> org;-9.6;20.0;48.6
> orge;-13.3;19.1;49.1
7186c7186
< ouTây Ni;-14.5;19.0;48.3
---
> ouTây Ni;-13.7;19.6;48.3
7511c7511
< raS;-18.6;11.4;41.1
---
> raS;-18.6;11.3;41.1
7598c7598
< rdh;-21.1;12.7;44.1
---
> rdh;-21.1;12.6;42.6
7600c7600
< rdu;-23.9;10.3;44.9
---
> rdu;-23.9;9.8;38.3
7603c7603
< re;-8.7;19.9;47.4
---
> re;-10.3;19.9;50.1
7616c7616
< ree;-18.9;14.7;46.3
---
> ree;-18.9;14.2;43.3
7622c7622
< ren;-12.9;19.2;51.7
---
> ren;-12.9;19.0;51.7
7629c7629
< rer;-16.0;14.4;44.9
---
> rer;-16.0;14.3;42.2
7667c7667
< rha;-23.5;7.4;43.6
---
> rha;-23.5;7.0;33.7
7797c7797
< roPointe-à-PitreKuala TerengganuCampecheByatarayanpurMaw;-11.7;17.2;47.6
---
> roPointe-à-PitreKuala TerengganuCampecheByatarayanpurMaw;-11.7;17.4;47.6
7998c7998
< sCa;-15.0;19.5;51.7
---
> sCa;-9.6;19.9;51.7
8009c8009
< sFa;-13.0;19.1;49.6
---
> sFa;-13.0;19.0;49.6
8105c8105
< seLikasiDengtacunLembokZhijiangChengjiaoBeipiaoSuoluntunStaten IslandKota BharuCiudad LópezM;-15.3;17.9;47.8
---
> seLikasiDengtacunLembokZhijiangChengjiaoBeipiaoSuoluntunStaten IslandKota BharuCiudad LópezM;-9.6;18.0;47.8
8858c8858
< uenR;-15.9;14.2;47.1
---
> uenR;-15.9;14.3;47.1
9007c9007
< upaLji;-15.2;16.6;51.3
---
> upaLji;-15.2;16.9;51.3
9047c9047
< uraCiudad del Car;-14.6;14.7;45.8
---
> uraCiudad del Car;-14.6;14.2;44.0
9051c9051
< uramCa;-24.7;9.0;43.3
---
> uramCa;-24.7;8.5;43.3
9542c9542
< zinoAcámbaroMaisons-AlfortLechangBolgatangaMa;-10.6;17.6;47.5
---
> zinoAcámbaroMaisons-AlfortLechangBolgatangaMa;-10.6;17.7;47.5
9732c9732
< ābādBa;-11.9;18.8;49.0
---
> ābādBa;-11.9;19.1;49.0

FAILURE: ./test.sh gamlerhart measurements_10K_1B.txt failed

gamlerhart commented 7 months ago

Thanks for the command. I wasn't aware of the new 10K test set. I had a bug indeed. It should finally be fixed.

The performance on the 10K, 1-billion-row set is bad =). 50+ seconds on my machine. Most likely because I was lazy in the comparison function and did not vectorize it for larger keys. And my horrible hash may produce too many collisions with more keys as well =)
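For readers wondering what the comparison remark refers to, here is a minimal sketch contrasting a byte-at-a-time key comparison with a ranged `Arrays.equals` call (Java 9+), which lets the JDK compare more than one byte per step. This is not the code from this submission: the class name `KeyCompareSketch` and both method names are hypothetical, and the sample bytes are only borrowed from the diff output above for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not taken from the submission in this PR.
public class KeyCompareSketch {

    // Byte-at-a-time comparison: simple, but the work grows with the key length.
    static boolean sameKeyScalar(byte[] a, int aOff, int aLen, byte[] b, int bOff, int bLen) {
        if (aLen != bLen) {
            return false; // keys of different length can never match
        }
        for (int i = 0; i < aLen; i++) {
            if (a[aOff + i] != b[bOff + i]) {
                return false;
            }
        }
        return true;
    }

    // Ranged Arrays.equals compares the two sub-ranges in one library call.
    // It returns false when the range lengths differ.
    static boolean sameKeyWide(byte[] a, int aOff, int aLen, byte[] b, int bOff, int bLen) {
        return Arrays.equals(a, aOff, aOff + aLen, b, bOff, bOff + bLen);
    }

    public static void main(String[] args) {
        byte[] longKey = "nMachengAkur".getBytes(StandardCharsets.UTF_8);
        byte[] shortKey = "nMach".getBytes(StandardCharsets.UTF_8);

        // A key that is only a prefix of another key must not be treated as equal.
        System.out.println(sameKeyScalar(longKey, 0, longKey.length, shortKey, 0, shortKey.length));
        System.out.println(sameKeyWide(longKey, 0, longKey.length, shortKey, 0, shortKey.length));

        // The same bytes over the same length do match.
        System.out.println(sameKeyWide(longKey, 0, shortKey.length, shortKey, 0, shortKey.length));
    }
}
```

Both methods take the key lengths explicitly, so a stored key that is a prefix of the probed key is rejected before or during the comparison. How much this helps in practice depends on the key length distribution and on how often the hash table has to fall back to a full comparison.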

Anyway, I don't plan to fix that and will stop here.

Thanks for all the effort.

gunnarmorling commented 6 months ago

| # | Result (m:s.ms) | Implementation | JDK | Submitter | Notes |
|---|-----------------|----------------|-----|-----------|-------|
|   | 00:04.474 | link | 21.0.1-open | Roman Stoffel | |

Nice!

gunnarmorling commented 6 months ago

Passes the 10K test too.