gunnarmorling / 1brc

1️⃣🐝🏎️ The One Billion Row Challenge -- A fun exploration of how quickly 1B rows from a text file can be aggregated with Java
https://www.morling.dev/blog/one-billion-row-challenge/
Apache License 2.0

Validation script to handle unsorted results and all results #623

Closed ericxiao251 closed 6 months ago

ericxiao251 commented 7 months ago

I wrote a validation script before realizing that one already exists upstream (https://github.com/gunnarmorling/1brc/blob/main/test.sh). However, I noticed that the current test requires the keys in a fork's final result map to appear in the same order as in the output from main.

If this is desirable and my understanding is correct, I can incorporate my script into the existing test.sh. It might help determine the correctness of solutions that do not use an ordered map to store results. Additionally, the script checks the full output and not just a sample.
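For illustration, an order-insensitive check of the full output could look roughly like the sketch below. This is not the script from this PR. It assumes the output is a single line of the form `{Station=min/mean/max, ...}` and that station names do not contain the sequence `, `. The function names `parse_results` and `same_results` are hypothetical.

```python
def parse_results(output: str) -> dict[str, str]:
    """Parse '{A=min/mean/max, B=...}' into {station: 'min/mean/max'}.

    Assumption: entries are separated by ', ' and no station name
    contains that sequence.
    """
    body = output.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("expected output wrapped in braces")
    body = body[1:-1]
    results: dict[str, str] = {}
    if not body:
        return results
    for entry in body.split(", "):
        # Split on the last '=' so a name containing '=' still parses.
        station, sep, stats = entry.rpartition("=")
        if not sep:
            raise ValueError(f"malformed entry: {entry!r}")
        if station in results:
            raise ValueError(f"duplicate station: {station!r}")
        results[station] = stats
    return results


def same_results(expected: str, actual: str) -> bool:
    """Compare every station's values while ignoring entry order."""
    return parse_results(expected) == parse_results(actual)
```

Because the comparison is between dictionaries, every station is checked and the order in which a solution prints its entries does not matter.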

gunnarmorling commented 6 months ago

Hey @ericxiao251, thanks a lot for looking into this and sorry for the late reply. Results actually must be ordered by key as per the requirements ("sorted alphabetically by station name"), hence the existing test.sh script doesn't do any sorting, as it expects the asserted output to be sorted correctly. In that light, I think we can close this PR. Thx again!
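As a rough illustration of the ordering requirement, the sketch below checks whether the station names in an output line are in ascending order. It is not part of test.sh. It assumes the `{Station=min/mean/max, ...}` format with `, ` between entries, and it compares names by Unicode code point, which may differ from the challenge's notion of "alphabetically" for some non-ASCII names. The function names are hypothetical.

```python
def station_names(output: str) -> list[str]:
    """Extract station names in the order they appear in the output.

    Assumption: output looks like '{A=..., B=...}' with ', ' separators.
    """
    body = output.strip()[1:-1]  # drop the surrounding braces
    if not body:
        return []
    # Split on the last '=' of each entry to isolate the station name.
    return [entry.rpartition("=")[0] for entry in body.split(", ")]


def is_sorted_by_station(output: str) -> bool:
    """Return True if names are strictly ascending (no duplicates).

    Assumption: plain code point ordering of Python strings.
    """
    names = station_names(output)
    return all(a < b for a, b in zip(names, names[1:]))
```

A check like this only validates ordering. The values themselves would still need to be compared against the expected output.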