gunnarmorling / 1brc

1️⃣🐝🏎️ The One Billion Row Challenge -- A fun exploration of how quickly 1B rows from a text file can be aggregated with Java
https://www.morling.dev/blog/one-billion-row-challenge/
Apache License 2.0
6k stars, 1.8k forks

martin2038: first submission #665

Closed: martin1847 closed this 6 months ago

martin1847 commented 7 months ago

Check List:

martin1847 commented 7 months ago

Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-1.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-10.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-10000-unique-keys.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-2.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-20.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-3.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-boundaries.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-complex-utf8.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-dot.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-rounding.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-short.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-shortest.txt

martin1847 commented 6 months ago

The mvn formatting is done! Please approve, thanks~

gunnarmorling commented 6 months ago

Getting this error when running:

Fatal error: Failed to leave the current IsolateThread context and to detach the current thread. (code 12)

We've had it before; I think it masks another, actual exception, though I don't know what that is. It occurs when running the test on 32 cores; maybe this trips up your chunking logic somehow?

martin1847 commented 6 months ago

Maybe. I have switched to JVM mode instead of native. Please review it again~

gunnarmorling commented 6 months ago

Can run it now, but it produces incorrect output for the 10K keyset test (see create_measurements_3.sh).

martin1847 commented 6 months ago

I have fixed round(mean), which made the average differ in the last digit. Please try again, thanks. I hope to make it in before the closing time~
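The thread does not show what the round(mean) fix looked like. As a minimal sketch of how a last-digit mismatch can be avoided, the example below assumes measurements are accumulated as integer tenths and that halves round toward positive infinity, which is what Java's Math.round does. The class and method names are hypothetical and not taken from the submission.

```java
// Hypothetical helper for illustration; not the submission's code.
class MeanRounding {

    /** Mean of measurements stored as integer tenths, rounded to one decimal place. */
    static double roundedMean(long sumTenths, long count) {
        // Math.round rounds halves toward positive infinity: 7.5 -> 8, -7.5 -> -7.
        return Math.round((double) sumTenths / count) / 10.0;
    }

    public static void main(String[] args) {
        System.out.println(roundedMean(15, 2));  // mean 0.75, prints 0.8
        System.out.println(roundedMean(-15, 2)); // mean -0.75, prints -0.7
    }
}
```

Rounding once, at the end, on the mean expressed in tenths keeps intermediate floating-point error from leaking into the printed digit.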

gunnarmorling commented 6 months ago

Still failing. Getting this exception for the 1B-row data set with 10K keys:

Exception in thread "main" java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE
    at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1293)
    at dev.morling.onebrc.CalculateAverage_martin2038.lambda$main$0(CalculateAverage_martin2038.java:96)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:212)
    at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1709)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:556)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:546)
    at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:960)
    at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:934)
    at java.base/java.util.stream.AbstractTask.compute(AbstractTask.java:327)
    at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:759)
    at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:507)
    at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:676)
    at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateParallel(ReduceOps.java:927)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:264)
    at java.base/java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:677)
    at dev.morling.onebrc.CalculateAverage_martin2038.main(CalculateAverage_martin2038.java:117)
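The exception comes from the documented limit of FileChannel.map: a single mapping may be no larger than Integer.MAX_VALUE bytes. The following sketch reproduces that limit on an empty temporary file; the class and method names are hypothetical, and it is only meant to illustrate the failure mode, not the submission's code path.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class for illustration.
class MapLimitDemo {

    /** Returns the message of the exception thrown for an oversized mapping, or null if none was thrown. */
    static String tryOversizedMap() throws IOException {
        Path tmp = Files.createTempFile("map-limit", ".txt"); // throwaway file
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
            // Request one byte more than a single mapping is allowed to cover.
            ch.map(FileChannel.MapMode.READ_ONLY, 0, Integer.MAX_VALUE + 1L);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(tryOversizedMap());
    }
}
```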

martin1847 commented 6 months ago

Previously I used availableProcessors to split the file. When the number of CPUs is too low (<8), the chunk size becomes greater than Integer.MAX_VALUE (2 GB; the 10K-key, 1B-row file is about 16 GB, and my M2 Mac has 10 CPUs). This is fixed now! I hope you can give me one last chance!
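The fix itself is not shown in the thread. One way to keep every chunk mappable, sketched here under the assumption that chunks are planned from the file size and a worker count, is to raise the chunk count whenever a per-worker share would exceed Integer.MAX_VALUE. ChunkPlanner and plan are hypothetical names; a real solution would also need to move each boundary to the next newline, which is omitted here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not taken from the submission.
class ChunkPlanner {

    // Upper bound for a single FileChannel.map call.
    static final long MAX_CHUNK = Integer.MAX_VALUE;

    /** Returns {start, length} pairs covering [0, fileSize), each at most MAX_CHUNK bytes long. */
    static List<long[]> plan(long fileSize, int workers) {
        // Smallest chunk count that keeps every chunk within the mapping limit (ceiling division).
        long minChunks = (fileSize + MAX_CHUNK - 1) / MAX_CHUNK;
        // Use one chunk per worker, or more if the limit requires it.
        long chunks = Math.max(Math.max(workers, 1), minChunks);
        long base = fileSize / chunks;
        long remainder = fileSize % chunks;

        List<long[]> result = new ArrayList<>();
        long start = 0;
        for (long i = 0; i < chunks; i++) {
            // Spread the remainder over the first chunks, one extra byte each.
            long length = base + (i < remainder ? 1 : 0);
            if (length > 0) {
                result.add(new long[]{ start, length });
            }
            start += length;
        }
        return result;
    }

    public static void main(String[] args) {
        // A 16 GiB file with 4 workers needs more than 4 chunks to stay under the limit.
        System.out.println(plan(16L << 30, 4).size());
    }
}
```

With more chunks than workers, the extra chunks can simply be queued and processed as workers become free.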

gunnarmorling commented 6 months ago

Looking good now. 00:09.725.