oracle / truffleruby

A high performance implementation of the Ruby programming language, built on GraalVM.
https://www.graalvm.org/ruby/
Other
3.02k stars 185 forks source link

Java exceptions from CompactHashStore #3447

Open rwstauner opened 8 months ago

rwstauner commented 8 months ago

We know that the Hash implementations are not thread-safe, but in a short period we saw several related errors coming from CompactHashStore. It seems to error more often than PackedHashStore and BucketsHashStore. It might be worth switching the default back to BucketsHashStore for now to reduce java exceptions.

These are the exceptions I've seen:

Caused by: java.lang.NullPointerException: Null receiver values are not supported by libraries.
        at org.graalvm.truffle/com.oracle.truffle.api.library.LibraryFactory.dispatch(LibraryFactory.java:537)
        at org.graalvm.truffle/com.oracle.truffle.api.library.LibraryFactory.create(LibraryFactory.java:294)
        at org.graalvm.ruby/org.truffleruby.core.basicobject.ReferenceEqualNodeGen$Inlined.executeAndSpecialize(ReferenceEqualNodeGen.java:291)
Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 16 out of bounds for length 16
        at org.graalvm.ruby/org.truffleruby.core.hash.library.CompactHashStore$SetKvAtNode.insertIntoKv(CompactHashStore.java:609)
        at org.graalvm.ruby/org.truffleruby.core.hash.library.CompactHashStore$SetKvAtNode.keyDoesntExist(CompactHashStore.java:581)
        at org.graalvm.ruby/org.truffleruby.core.hash.library.CompactHashStoreFactory$SetKvAtNodeGen$Inlined.execute(CompactHashStoreFactory.java:339)
        at org.graalvm.ruby/org.truffleruby.core.hash.library.CompactHashStore.set(CompactHashStore.java:197)
        at org.graalvm.ruby/org.truffleruby.core.hash.library.CompactHashStoreGen$HashStoreLibraryExports$Cached.set(CompactHashStoreGen.java:366)
        at org.graalvm.ruby/org.truffleruby.core.hash.HashNodes$SetIndexNode.set(HashNodes.java:244)
        at org.graalvm.ruby/org.truffleruby.core.hash.HashNodesFactory$SetIndexNodeFactory$SetIndexNodeGen.execute(HashNodesFactory.java:1189)
        at org.graalvm.ruby/org.truffleruby.language.RubyCoreMethodRootNode.execute(RubyCoreMethodRootNode.java:58)
Caused by:
Index -1 out of bounds for length 2048 (java.lang.ArrayIndexOutOfBoundsException)
  from org.truffleruby.core.hash.library.CompactHashStore.deleteKvAndGetV(CompactHashStore.java:410)
  from org.truffleruby.core.hash.library.CompactHashStore.delete(CompactHashStore.java:213)
  from org.truffleruby.core.hash.library.CompactHashStoreGen$HashStoreLibraryExports$Cached.delete(CompactHashStoreGen.java:420)
  from org.truffleruby.core.hash.HashNodes$DeleteNode.delete(HashNodes.java:334)
  from org.truffleruby.core.hash.HashNodesFactory$DeleteNodeFactory$DeleteNodeGen.execute(HashNodesFactory.java:2427)
  from org.truffleruby.language.RubyCoreMethodRootNode.execute(RubyCoreMethodRootNode.java:58)

I reduced about a dozen exceptions I collected down to 4 distinct ones and put the full traces in this gist. The files are a bit long which makes it hard to look at, so here are links to the individual files:

Looking at the code quickly it seems like there is probably a race condition between deleting https://github.com/oracle/truffleruby/blob/b35d756b5252dc5767cf9826e911358187af22df/src/main/java/org/truffleruby/core/hash/library/CompactHashStore.java#L412-L415 and inserting https://github.com/oracle/truffleruby/blob/b35d756b5252dc5767cf9826e911358187af22df/src/main/java/org/truffleruby/core/hash/library/CompactHashStore.java#L608-L612

When the error occurs at deletion it would be easy to add checks that the integer is within bounds and return null if not, but for insertion the decision of how to handle an exception is less clear. It might be worth trying to identify what CRuby does.

What keys remain in the hash may be undefined when using a Hash across multiple threads, but in CRuby it doesn't crash (or at least not nearly as often as this seems to).

eregon commented 8 months ago

I changed the default to BucketsHashStore in https://github.com/oracle/truffleruby/pull/3456 until thread-safe Hash lands.