Closed: tiredenzo closed this issue 2 years ago
I made a script to make it clearer what is going on, please see https://gist.github.com/eregon/93523d087c283ac88d13a75845e5785e
So TruffleRuby doesn't use the full RAM by default, because the underlying VM has a default max heap which is smaller than the full RAM. However, it is possible to make it use all the RAM with e.g. `--vm.Xmx240g`.
Could you try with `--vm.Xmx240g`, and also try both `--native` (the default) and `--jvm`?
RVM doesn't support the `--jvm` mode, so for that use ruby-build: `ruby-build truffleruby+graalvm-22.0.0.2 ~/.rubies/truffleruby+graalvm-22.0.0.2`.
I think the `--jvm` mode is going to be significantly better here due to the better GC.
Hi, thanks for your reply.
I installed TruffleRuby on a clean setup with ruby-build as you asked, and added your script to mine; it is indeed better ;)
- For `--native` with `--vm.Xmx240g` I did not see any difference:
# runs until line 110M
.rubies/truffleruby+graalvm-22.0.0.2/bin/ruby --vm.Xmx240g --native test_no_csv.rb
heap: 40574 MB used, 43427 MB committed, ? max, RSS: 39452 MB, allocated: ?
Fatal error: java.lang.OutOfMemoryError: Could not allocate an aligned heap chunk
https://gist.github.com/tiredenzo/9eb69eb064280f46a3a5639958940f9e
- For `--jvm` with `--vm.Xmx240g` it sadly did not work either, but it went much further. It was also blazingly fast compared to `--native`, roughly a couple of times the speed I reached in Go for this task.
.rubies/truffleruby+graalvm-22.0.0.2/bin/ruby --vm.Xmx240g --jvm test_no_csv.rb
heap: 119760 MB used, 192928 MB committed, 245760 MB max, RSS: 162246 MB, allocated: ?
truffleruby: an internal exception escaped out of the interpreter, please report it to https://github.com/oracle/truffleruby/issues.
-2147483604 (java.lang.NegativeArraySizeException)
from org.truffleruby.core.hash.library.BucketsHashStore.resize(BucketsHashStore.java:128)
from org.truffleruby.core.hash.library.BucketsHashStore.set(BucketsHashStore.java:264)
from org.truffleruby.core.hash.library.BucketsHashStoreGen$HashStoreLibraryExports$Cached.set(BucketsHashStoreGen.java:141)
from org.truffleruby.core.hash.HashNodes$SetIndexNode.set(HashNodes.java:214)
from org.truffleruby.core.hash.HashNodesFactory$SetIndexNodeFactory$SetIndexNodeGen.execute(HashNodesFactory.java:911)
from org.truffleruby.language.RubyCoreMethodRootNode.execute(RubyCoreMethodRootNode.java:48)
from jdk.internal.vm.compiler/org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:655)
test.rb:29:in `[]='
from test.rb:35:in `block in <main>'
from <internal:core> core/integer.rb:in `times'
from test.rb:22:in `<main>'
Here is my code for this test https://gist.github.com/tiredenzo/cd97f74a054d5a1c4b1b99473318920b
I kept my allocation method, this might not have been a good choice for you, I can re-run with yours if you wish
This looks like a bug in our hash implementation when we try to allocate very large buckets. We have an array of capacities, and then allocate an array of that capacity multiplied by a constant, and for the largest defined capacity this overflows the 32 bit array offset.
I will try to reproduce on native (on a machine with enough RAM), that early OutOfMemoryError sounds like a bug.
On JVM, the NegativeArraySizeException we see is due to trying to create a very big Hash.
That error could be improved to be clearer.
An array in Java has a (signed) `int` size, and so the maximum is 2**31 - 1 = 2 147 483 647 elements.
The Hash would have 700_000_000 pairs at the end, and the max load factor is 0.75 (otherwise there is too much chaining). So it'd need at a strict minimum 700_000_000 / 0.75 = 933_333_334 elements. That still fits in a Java array.
The current logic on resize, based on CRuby IIRC, is to pick the next capacity (1 073 741 909 here), and multiply it by 4 (OVERALLOCATE_FACTOR) so we don't have to resize too soon again. 1 073 741 909 * 4 = 4 294 967 636 which is close to 2**32 and bigger than the max Java array size. But we could probably use a smaller OVERALLOCATE_FACTOR in that case, i.e., use an array size close to the max array size. As long as the load factor would still be < 0.75 it seems fine. If it then needs to resize up again, then we'd fail with an error.
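The arithmetic above can be checked with a quick sketch (the capacity and over-allocation values are taken from this discussion, not read from the TruffleRuby sources):

```ruby
# Java array sizes are signed 32-bit ints, so this is the hard ceiling.
MAX_JAVA_ARRAY_SIZE = 2**31 - 1 # 2_147_483_647

pairs = 700_000_000
max_load_factor = 0.75

# Strict minimum number of buckets to stay under the load factor.
min_buckets = (pairs / max_load_factor).ceil
puts min_buckets                         # 933333334
puts min_buckets <= MAX_JAVA_ARRAY_SIZE  # true: this alone would fit

# The resize logic picks the next capacity and over-allocates by 4x.
next_capacity = 1_073_741_909
overallocate_factor = 4
requested = next_capacity * overallocate_factor
puts requested                           # 4294967636
puts requested > MAX_JAVA_ARRAY_SIZE     # true: too big for a Java array
```

So the over-allocated size, not the strictly needed size, is what overflows the 32-bit array length.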
> I kept my allocation method, this might not have been a good choice for you, I can re-run with yours if you wish
Could you try with my script as well? On JVM we should see it able to use all the RAM. On native it's probably the same error, but worth a try still.
I got access to a machine with 251GB of RAM. I used the latest truffleruby. On my script:
$ time graalvm-ce-java11-22.2.0-dev/bin/ruby --vm.Xmx240g --vm.XX:+PrintGC use_memory.rb
...
heap: 238602 MB used, 238602 MB committed, ? max, RSS: 238172 MB, allocated: 237568 MB
heap: 239626 MB used, 239626 MB committed, ? max, RSS: 239196 MB, allocated: 238592 MB
[Full GC (CollectOnAllocation) 244328451K->244327427K, 0.0276189 secs]
[Full GC (CollectOnAllocation) 245376003K->245376003K, 0.0199192 secs]
heap: 240650 MB used, 240650 MB committed, ? max, RSS: 240220 MB, allocated: 239616 MB
heap: 241674 MB used, 241674 MB committed, ? max, RSS: 241244 MB, allocated: 240640 MB
[Full GC (CollectOnAllocation) 246425603K->246424579K, 0.0263307 secs]
[Full GC (CollectOnAllocation) 247473155K->247473155K, 0.3591485 secs]
heap: 242698 MB used, 242698 MB committed, ? max, RSS: 242196 MB, allocated: 241664 MB
Killed
real 10m37.484s
So it does reach 240GB before being killed. It takes a while though.
So maybe the issue that `--native` didn't use all available memory was resolved already?
Using your `test_no_csv.rb` script:
$ time graalvm-ce-java11-22.2.0-dev/bin/ruby --vm.Xmx240g --vm.XX:+PrintGC test_no_csv.rb
...
heap: 114185 MB used, 118051 MB committed, ? max, RSS: 123257 MB, allocated: ?
[Full GC (CollectOnAllocation) 121923584K->121920512K, 637.3958737 secs]
395000000
heap: 127044 MB used, 140611 MB committed, ? max, RSS: 133253 MB, allocated: ?
396000000
heap: 127335 MB used, 140611 MB committed, ? max, RSS: 133253 MB, allocated: ?
397000000
heap: 127626 MB used, 140611 MB committed, ? max, RSS: 133253 MB, allocated: ?
398000000
heap: 127917 MB used, 140611 MB committed, ? max, RSS: 133253 MB, allocated: ?
399000000
heap: 128209 MB used, 140611 MB committed, ? max, RSS: 133253 MB, allocated: ?
400000000
heap: 128500 MB used, 140611 MB committed, ? max, RSS: 133253 MB, allocated: ?
401000000
heap: 128791 MB used, 140611 MB committed, ? max, RSS: 133253 MB, allocated: ?
402000000
heap: 129082 MB used, 140611 MB committed, ? max, RSS: 133253 MB, allocated: ?
-2147483604 (java.lang.NegativeArraySizeException)
from org.truffleruby.core.hash.library.BucketsHashStore.resize(BucketsHashStore.java:128)
from org.truffleruby.core.hash.library.BucketsHashStore.set(BucketsHashStore.java:264)
from org.truffleruby.core.hash.library.BucketsHashStoreGen$HashStoreLibraryExports$Cached.set(BucketsHashStoreGen.java:141)
from org.truffleruby.core.hash.HashNodes$SetIndexNode.set(HashNodes.java:238)
from org.truffleruby.core.hash.HashNodesFactory$SetIndexNodeFactory$SetIndexNodeGen.execute(HashNodesFactory.java:981)
from org.truffleruby.language.RubyCoreMethodRootNode.execute(RubyCoreMethodRootNode.java:48)
from org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:659)
test_no_csv.rb:29:in `[]='
from test_no_csv.rb:29:in `block in <main>'
from <internal:core> core/integer.rb:148:in `times'
from test_no_csv.rb:22:in `<main>'
real 203m12.921s
user 104m1.443s
sys 97m23.765s
The Native Image CE GC is not good enough for this. It would likely work quite a bit better with Native Image EE's G1 GC.
JVM CE:
$ time graalvm-ce-java11-22.2.0-dev/bin/ruby --jvm --vm.Xmx240g --vm.XX:+PrintGC use_memory.rb
...
heap: 237651 MB used, 240736 MB committed, 245760 MB max, RSS: 240079 MB, allocated: 230400 MB
[558.105s][info ][gc] GC(432) Pause Young (Concurrent Start) (G1 Humongous Allocation) 237683M->237651M(240736M) 11.251ms
[558.105s][info ][gc] GC(433) Concurrent Cycle
[558.596s][info ][gc] GC(433) Pause Remark 238707M->238707M(241792M) 3.945ms
[560.520s][info ][gc] GC(433) Pause Cleanup 238739M->238739M(241792M) 1.576ms
[560.705s][info ][gc] GC(433) Concurrent Cycle 2599.997ms
heap: 238707 MB used, 241792 MB committed, 245760 MB max, RSS: 241139 MB, allocated: 231424 MB
[562.767s][info ][gc] GC(434) Pause Young (Concurrent Start) (G1 Humongous Allocation) 238739M->238707M(241792M) 11.377ms
[562.767s][info ][gc] GC(435) Concurrent Cycle
[563.795s][info ][gc] GC(435) Pause Remark 239763M->239763M(242848M) 15.780ms
[565.747s][info ][gc] GC(435) Pause Cleanup 239795M->239795M(242848M) 2.219ms
[566.210s][info ][gc] GC(435) Concurrent Cycle 3442.631ms
heap: 239763 MB used, 242848 MB committed, 245760 MB max, RSS: 242176 MB, allocated: 232448 MB
[568.430s][info ][gc] GC(436) Pause Young (Concurrent Start) (G1 Humongous Allocation) 239795M->239763M(242848M) 24.967ms
[568.430s][info ][gc] GC(437) Concurrent Cycle
Killed
real 10m26.169s
user 6m50.722s
sys 22m17.315s
$ time graalvm-ce-java11-22.2.0-dev/bin/ruby --jvm --vm.Xmx240g --vm.XX:+PrintGC test_no_csv.rb
...
399000000
heap: 123664 MB used, 160800 MB committed, 245760 MB max, RSS: 151366 MB, allocated: ?
[994.426s][info ][gc] GC(49) Pause Young (Concurrent Start) (G1 Evacuation Pause) 123856M->124448M(160800M) 4334.572ms
[994.426s][info ][gc] GC(50) Concurrent Cycle
400000000
heap: 124544 MB used, 160800 MB committed, 245760 MB max, RSS: 158411 MB, allocated: ?
401000000
heap: 124832 MB used, 160800 MB committed, 245760 MB max, RSS: 158411 MB, allocated: ?
402000000
heap: 125120 MB used, 160800 MB committed, 245760 MB max, RSS: 158411 MB, allocated: ?
-2147483604 (java.lang.NegativeArraySizeException)
from org.truffleruby.core.hash.library.BucketsHashStore.resize(BucketsHashStore.java:128)
from org.truffleruby.core.hash.library.BucketsHashStore.set(BucketsHashStore.java:264)
from org.truffleruby.core.hash.library.BucketsHashStoreGen$HashStoreLibraryExports$Cached.set(BucketsHashStoreGen.java:141)
from org.truffleruby.core.hash.HashNodes$SetIndexNode.set(HashNodes.java:238)
from org.truffleruby.core.hash.HashNodesFactory$SetIndexNodeFactory$SetIndexNodeGen.execute(HashNodesFactory.java:981)
from org.truffleruby.language.RubyCoreMethodRootNode.execute(RubyCoreMethodRootNode.java:48)
from jdk.internal.vm.compiler/org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:659)
test_no_csv.rb:29:in `[]='
from test_no_csv.rb:29:in `block in <main>'
from <internal:core> core/integer.rb:148:in `times'
from test_no_csv.rb:22:in `<main>'
real 17m45.616s
user 91m7.772s
sys 44m4.004s
I've run on JVM too, and `test_no_csv.rb` on `--native`. There was no problem on native; it's the same result as on `--jvm`. Both end up with the `NegativeArraySizeException`, so I'll look into that next.
I've got a fix for the `NegativeArraySizeException`, and I could run a slight variant of the script to the end (https://gist.github.com/eregon/70787386138f339b964fb76496a3f1d9: it uses a little bit less memory):
$ ruby --jvm --vm.Xmx240g --vm.XX:+PrintGC test_no_csv.rb |& tee out.log
...
697
heap: 78752 MB used, 125472 MB committed, 245760 MB max, RSS: 105461 MB, allocated: ?
698
heap: 78848 MB used, 125472 MB committed, 245760 MB max, RSS: 105461 MB, allocated: ?
699
heap: 78912 MB used, 125472 MB committed, 245760 MB max, RSS: 105462 MB, allocated: ?
thanks 👍
So it should work fine now. I didn't compare to CRuby (I don't have a recent CRuby on that machine).
If you want to optimize this workload to make it faster, it's probably best to avoid such a large memory usage/heap (because the GC becomes a lot slower, and growing the Hash takes a lot of time), and process e.g. line-by-line if that's possible.
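Line-by-line processing could look like the sketch below, where a tiny generated CSV stands in for the real file; `CSV.foreach` from the standard library reads one row at a time instead of materializing everything:

```ruby
require 'csv'
require 'tempfile'

# A tiny CSV standing in for the real file on disk.
file = Tempfile.new(['sample', '.csv'])
file.write("1,foo\n2,bar\n3,baz\n")
file.close

count = 0
CSV.foreach(file.path) do |id, value|
  # Process each row here; nothing is retained, so memory stays flat.
  count += 1
end
puts count # => 3
```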
Also, an Array would store the entries much more compactly (and make much better use of the cache) than a Hash, and might still allow searching entries fairly quickly via their `id` using `bsearch`, assuming the ids are in increasing order in the .csv file.
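A sketch of that idea, assuming rows sorted by id (the `[id, value]` pair layout here is hypothetical, not the reporter's actual schema):

```ruby
# Rows stored compactly as [id, value] pairs, sorted by id.
rows = [[10, "a"], [25, "b"], [42, "c"], [99, "d"]]

# Binary search: O(log n) lookup without a Hash's per-entry overhead.
def find_by_id(rows, wanted)
  rows.bsearch { |(id, _value)| wanted <=> id }
end

p find_by_id(rows, 42)  # => [42, "c"]
p find_by_id(rows, 7)   # => nil
```

The block returns negative, zero, or positive, which is `bsearch`'s find-any mode; it only works because the rows are sorted by id.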
Dear TruffleRuby team,
While trying to load a large CSV file into a Hash with TruffleRuby, I encountered some issues.
The CSV file weighs about 39 GB and has 700 million lines. The machine I run on has 240 GB of RAM. The TruffleRuby version, installed via RVM, is:
truffleruby 22.0.0.2, like ruby 3.0.2, GraalVM CE Native [x86_64-linux]
The script runs and loads until line ~129M, with RAM usage at 66 GB, when it crashes with the following error:
Fatal error: java.lang.OutOfMemoryError: Could not allocate an aligned heap chunk
Full log here: https://gist.github.com/tiredenzo/088f754a716adfada26144ca3b1baff2
The issue is that TruffleRuby is not able to use the whole machine's 240 GB of RAM and crashes at around 66 GB loaded.
Please note that the same script works with Ruby 3: it loads completely and takes about 140-150 GB of RAM. On a side note, I also tried "sharding" the data across 10k Hashes to see if it worked better; the program behavior was the same.
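The sharding attempt mentioned above could be sketched like this (the shard count and key scheme are illustrative; since each shard still grows with the data, total memory use is unchanged, which matches the observed behavior):

```ruby
SHARD_COUNT = 10_000
shards = Array.new(SHARD_COUNT) { {} }

# Route each id to a shard so no single Hash has to grow huge.
def shard_for(shards, id)
  shards[id % shards.size]
end

shard_for(shards, 123_456)[123_456] = "some value"
p shard_for(shards, 123_456)[123_456]  # => "some value"
```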
To reproduce, run one of the following.
Faster (directly in RAM):
ruby test_no_csv.rb
Slower (uses the disk to generate a CSV before loading it):
ruby generate_test_csv.rb && ruby load_csv.rb
Thank you