Closed JaimePolidura closed 7 months ago
I forgot to mention that I'm using GraalVM native image to run it.
Please run mvn verify and append the changes made by the formatter to this PR.
Done
Please make the shell scripts executable. All these things are part of the PR template and you checked them as done. Sorry to say, but this creates a bit of a frustrating experience :(
Sorry, my bad :(
I have made the shell scripts executable. I have also fixed some rounding issues that I didn't know I had.
I think it should be ok now.
Hum, one of the tests is failing now on the eval machine (but interestingly not on CI):
+ timeout -v 300 ./test.sh JaimePolidura
Using java version 21.0.2-graal in this shell.
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-10000-unique-keys.txt
4a5
> id10000;1.0;1.0;1.0
9998a10000
> id9999;1.0;1.0;1.0
FAILURE: ./test.sh JaimePolidura failed
Summary
JaimePolidura: command failed or output did not match
I can't figure out what is causing that problem. The test passes correctly on my machine:
jaime@linux:~/otro/1brc$ ./test.sh JaimePolidura
Using java version 21.0.2-graal in this shell.
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-10000-unique-keys.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-10.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-1.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-20.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-2.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-3.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-boundaries.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-complex-utf8.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-dot.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-rounding.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-shortest.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-short.txt
jaime@linux:~/otro/1brc$ ./test.sh JaimePolidura src/test/resources/samples/measurements-10000-unique-keys.txt
Using java version 21.0.2-graal in this shell.
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-10000-unique-keys.txt
Is it correct?
If the CI and the eval machine are executing the same program with the same test, it should produce the same output. Maybe the eval machine is using an outdated buggy version of my program?
I have changed the shell interpreter of the prepare_JaimePolidura.sh script to #!/bin/bash instead of #!/bin/sh. I'm unsure if that is relevant to the problem.
Tests run on 32 cores on the eval machine; I suppose this messes up your chunking logic somehow. It passes when running on only 8 cores:
numactl --physcpubind=0-7 ./test.sh JaimePolidura
Using java version 21.0.2-graal in this shell.
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-10000-unique-keys.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-10.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-1.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-20.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-2.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-3.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-boundaries.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-complex-utf8.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-dot.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-rounding.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-shortest.txt
Validating calculate_average_JaimePolidura.sh -- src/test/resources/samples/measurements-short.txt
Yes you are right, I had a bug in the chunking logic.
I have committed the changes.
Still fails on 32 cores. You should be able to verify locally by hard-coding 32 chunks rather than relying on processor count.
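For anyone wanting to reproduce this locally, here is a minimal sketch of the idea, assuming a splitter that takes the chunk count as a parameter instead of reading it from the processor count. It is not the code from this PR: ChunkSplitter, Chunk, and split are hypothetical names, and it works on an in-memory byte array rather than a memory-mapped file to keep the example short. The requested count is treated as a hint, since every boundary is pushed forward to the next newline, so a tiny input with 32 requested chunks simply yields fewer chunks.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ChunkSplitter {

    /** Half-open byte range [start, end) that ends just after a '\n' or at end of input. */
    public record Chunk(int start, int end) {}

    /**
     * Splits data into line-aligned chunks. requestedChunks is only a hint:
     * the result may contain fewer (or slightly more) chunks than requested.
     */
    public static List<Chunk> split(byte[] data, int requestedChunks) {
        List<Chunk> chunks = new ArrayList<>();
        if (data.length == 0) {
            return chunks;
        }
        // Guard against a target size of 0 when there are more chunks than bytes.
        int target = Math.max(1, data.length / Math.max(1, requestedChunks));
        int start = 0;
        while (start < data.length) {
            int end = Math.min(data.length, start + target);
            // Extend to the end of the current line so no row is cut in half.
            while (end < data.length && data[end - 1] != '\n') {
                end++;
            }
            chunks.add(new Chunk(start, end));
            start = end;
        }
        return chunks;
    }

    public static void main(String[] args) {
        byte[] data = "id1;1.0\nid2;1.0\nid3;1.0\n".getBytes(StandardCharsets.UTF_8);
        // Hard-code the chunk counts instead of using availableProcessors().
        for (int n : new int[] {1, 8, 32}) {
            List<Chunk> chunks = split(data, n);
            int rows = 0;
            for (Chunk c : chunks) {
                String text = new String(data, c.start(), c.end() - c.start(), StandardCharsets.UTF_8);
                rows += text.split("\n").length;
            }
            System.out.println(n + " requested chunks -> " + chunks.size() + " actual, " + rows + " rows");
        }
    }
}
```

Running main with the counts 1, 8, and 32 against the same input should report the same number of rows each time; a mismatch points at the boundary handling rather than at the machine.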
Sorry again :( I had only tested it on the 10k-entry file. It should work now.
Ok, looking good now. 00:05.069.
Check List:
[x] Tests pass (./test.sh <username> shows no differences between expected and actual outputs)
[x] All formatting changes by the build are committed
[x] Your launch script is named calculate_average_<username>.sh (make sure to match casing of your GH user name) and is executable
[x] Output matches that of calculate_average_baseline.sh
[x] For new entries, or after substantial changes: When implementing custom hash structures, please point to where you deal with hash collisions (line number)
Execution time: 2 sec - 5 sec
Execution time of reference implementation: 2min 22 sec
My PC configuration: WSL2 & Windows, Intel Core i7-1260P 2.1 GHz, 32GB RAM, 12 cores