facebook / rocksdb

A library that provides an embeddable, persistent key-value store for fast storage.
http://rocksdb.org
GNU General Public License v2.0

SIGSEGV (0xb) rocksdb::LRUHandleTable::~LRUHandleTable()+0x40 #5487

Closed (javeme closed 3 years ago)

javeme commented 5 years ago

Actual behavior

Error info:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x000000012953c370, pid=53958, tid=0x0000000000002803
#
# JRE version: Java(TM) SE Runtime Environment (8.0_121-b13) (build 1.8.0_121-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.121-b13 mixed mode bsd-amd64 compressed oops)
# Problematic frame:
# C  [librocksdbjni3689689085661840086.jnilib+0x3c370]  rocksdb::LRUHandleTable::~LRUHandleTable()+0x40
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

Stack: [0x000070000db7c000,0x000070000dc7c000], sp=0x000070000dc79110, free space=1012k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)

C  [librocksdbjni3689689085661840086.jnilib+0x3c370]  rocksdb::LRUHandleTable::~LRUHandleTable()+0x40
C  [librocksdbjni3689689085661840086.jnilib+0x3e4fd]  rocksdb::LRUCache::~LRUCache()+0x5d
C  [librocksdbjni3689689085661840086.jnilib+0x266c5]  Java_org_rocksdb_BlockBasedTableConfig_newTableFactoryHandle+0x7a5
C  [librocksdbjni3689689085661840086.jnilib+0x1f8728]  rocksdb::NewBlockBasedTableFactory(rocksdb::BlockBasedTableOptions const&)+0xa8
C  [librocksdbjni3689689085661840086.jnilib+0x4a0a]  Java_org_rocksdb_ColumnFamilyHandle_disposeInternal+0xc0a
C  [librocksdbjni3689689085661840086.jnilib+0x568fb]  rocksdb::ColumnFamilyHandleImpl::~ColumnFamilyHandleImpl()+0x13b
C  [librocksdbjni3689689085661840086.jnilib+0x56b1e]  rocksdb::ColumnFamilyHandleImpl::~ColumnFamilyHandleImpl()+0xe
j  org.rocksdb.ColumnFamilyHandle.disposeInternal(J)V+0
j  org.rocksdb.ColumnFamilyHandle.disposeInternal()V+15
J 2584 C1 org.rocksdb.AbstractImmutableNativeReference.close()V (17 bytes) @ 0x00000001172ccb3c [0x00000001172cc8e0+0x25c]
j  com.baidu.hugegraph.backend.store.rocksdb.RocksDBStdSessions.dropTable(Ljava/lang/String;)V+19
j  com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore.dropTable(Lcom/baidu/hugegraph/backend/store/rocksdb/RocksDBSessions;Ljava/lang/String;)V+5
j  com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore.clear()V+39
j  com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore.truncate()V+5
j  com.baidu.hugegraph.backend.store.AbstractBackendStoreProvider.truncate()V+39
j  com.baidu.hugegraph.HugeGraph.truncateBackend()V+8
j  com.baidu.hugegraph.tinkerpop.TestGraph.truncateBackend()V+4
j  com.baidu.hugegraph.tinkerpop.TestGraph.clearAll(Ljava/lang/String;)V+40
j  com.baidu.hugegraph.tinkerpop.TestGraphProvider.clear(Lorg/apache/tinkerpop/gremlin/structure/Graph;Lorg/apache/commons/configuration/Configuration;)V+130
j  org.apache.tinkerpop.gremlin.GraphManager$ManagedGraphProvider.clear(Lorg/apache/tinkerpop/gremlin/structure/Graph;Lorg/apache/commons/configuration/Configuration;)V+6
j  org.apache.tinkerpop.gremlin.algorithm.generator.CommunityGeneratorTest$DifferentDistributionsTest.shouldGenerateSameGraph()V+223
v  ~StubRoutines::call_stub

Steps to reproduce the behavior

This error occurs every time we run dropTable() in the TinkerPop tests with RocksDB 6.0.1.

    public void dropTable(String table) throws RocksDBException {
        this.checkValid();

        ColumnFamilyHandle cfh = cf(table);
        this.rocksdb.dropColumnFamily(cfh);
        cfh.close(); // <<<<<<<<<<<<<<<<<<<< the JVM crashes (SIGSEGV) here!!!
        this.cfs.remove(table);
    }

It does not occur with the same code in versions 5.8.6, 5.14.2, 5.17.2, or 5.18.3.

rocksdb version: https://mvnrepository.com/artifact/org.rocksdb/rocksdbjni/6.0.1

full log: hs_err_pid53958.log

miasantreble commented 5 years ago

Hi, thanks for reporting. Would it be possible to reproduce the error in a standalone test, in either C++ or Java? Right now all I can tell is that the block_cache ownership got messed up, but there isn't enough here to investigate how that happened.

javeme commented 5 years ago

Ok, I will try to reproduce in a standalone test.

marcdk commented 3 years ago

Hi @javeme @miasantreble

Did you guys perhaps figure out the issue? I'm getting a similar error:

# Problematic frame:
# C  [librocksdbjni-linuxs390x.so+0x25360e]  Java_org_rocksdb_BlockBasedTableConfig_newTableFactoryHandle+0xae6

The build is taking place in IBM Cloud with OpenJDK8:

>> git tag --points-at HEAD
v6.12.7
>> uname -a
Linux df826417d65b 4.15.0-109-generic #110-Ubuntu SMP Tue Jun 23 02:37:28 UTC 2020 s390x s390x s390x GNU/Linux
>> java -version
openjdk version "11.0.8" 2020-07-14
OpenJDK Runtime Environment (build 11.0.8+10-post-Ubuntu-0ubuntu118.04.1)
OpenJDK 64-Bit Server VM (build 11.0.8+10-post-Ubuntu-0ubuntu118.04.1, mixed mode)

Thanks for any help :)

hs_err_pid31862.log

adamretter commented 3 years ago

@marcdk Could you please resolve the line number around the problematic frame in your hs_err_pid file for us? See https://github.com/facebook/rocksdb/wiki/JNI-Debugging#interpreting-hs_err_pid-files

Along with that, we need to know exactly which version of the source code you are using.

marcdk commented 3 years ago

Thank you @adamretter, that was a really helpful link. Both master and your fork point to the same line.

adamretter:feature/travis-s390x e9569b7e8d1904426246da1fadb6e713ec88dedb

# Problematic frame:
# C  [librocksdbjni-linuxs390x.so+0x27d346]  Java_org_rocksdb_BlockBasedTableConfig_newTableFactoryHandle+0xae6
$ addr2line -e ~/librocksdbjni-linux64.so-adam-java11 0x27d346
/usr/include/c++/7/bits/shared_ptr_base.h:697

facebook:rocksdb/master b4cd51d847bb41bb80df999393b2acac4f67219d

# Problematic frame:
# C  [librocksdbjni-linuxs390x.so+0x29065e]  Java_org_rocksdb_BlockBasedTableConfig_newTableFactoryHandle+0xae6
$ addr2line -e ~/librocksdbjni-linux64.so-master-java11 0x29065e
/usr/include/c++/7/bits/shared_ptr_base.h:697

So the offending line is: 697: _Sp_counted_base<_Lp>* __tmp = __r._M_pi;

shared-header-and-logs.tar.gz

EDIT: Both built with the default DEBUG_LEVEL, and DEBUG_LEVEL=1 yields the same offending line.
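
For readers repeating the lookup above, here is a minimal Java sketch that pulls the library name and offset out of a "Problematic frame" line and prints the matching `addr2line` command. The class name `ProblematicFrame`, its method names, and the library path in `main` are hypothetical; the regular expression only assumes the `[library+0xoffset]` shape seen in the frames quoted in this thread.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ProblematicFrame {
    // Matches the "[library+0xoffset]" part of a frame line, e.g.
    // "# C  [librocksdbjni-linuxs390x.so+0x27d346]  Java_org_...+0xae6"
    private static final Pattern FRAME =
            Pattern.compile("\\[([^\\]+]+)\\+(0x[0-9a-fA-F]+)\\]");

    /** Returns {library, offset}, or empty if the line has no "[lib+0xoff]" part. */
    public static Optional<String[]> parse(String line) {
        Matcher m = FRAME.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new String[] { m.group(1), m.group(2) });
    }

    /** Builds the addr2line invocation for a frame line and a local copy of the library. */
    public static Optional<String> addr2lineCommand(String line, String libraryPath) {
        return parse(line).map(p -> "addr2line -e " + libraryPath + " " + p[1]);
    }

    public static void main(String[] args) {
        String line = "# C  [librocksdbjni-linuxs390x.so+0x27d346]  "
                + "Java_org_rocksdb_BlockBasedTableConfig_newTableFactoryHandle+0xae6";
        // The path below is a placeholder; point it at the library that was actually loaded.
        System.out.println(
                addr2lineCommand(line, "/path/to/librocksdbjni-linuxs390x.so")
                        .orElse("no frame found"));
    }
}
```

The result is only meaningful when the library passed to `-e` is the same binary that crashed and was built with debug info.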

marcdk commented 3 years ago

I'm still looking and trying the other debug methods and will let you know if I have more information.

adamretter commented 3 years ago

@marcdk Your line numbers don't correspond with the problematic frame. Did you build from source with DEBUG_LEVEL=2 to preserve symbols and debug info?

marcdk commented 3 years ago

I did not... I've rebuilt with DEBUG_LEVEL=2, but I'm again led to a C++ header file.

Stack: [0x000003ffb9580000,0x000003ffb9680000],  sp=0x000003ffb967d3a8,  free space=1012k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [librocksdbjni-linuxs390x.so+0xd0aa3a]  std::enable_if<std::__sp_compatible_with<rocksdb::FilterPolicy*, rocksdb::FilterPolicy const*>::value, std::__shared_ptr<rocksdb::FilterPolicy const, (__gnu_cxx::_Lock_policy)2>&>::type std::__shared_ptr<rocksdb::FilterPolicy const, (__gnu_cxx::_Lock_policy)2>::operator=<rocksdb::FilterPolicy>(std::__shared_ptr<rocksdb::FilterPolicy, (__gnu_cxx::_Lock_policy)2> const&)+0x22
$ addr2line -e /usr/lib/jvm/java-11-openjdk-s390x/lib/librocksdbjni-linuxs390x.so 0xd0aa3a
/usr/include/c++/7/bits/shared_ptr_base.h:1195

Again thanks for your help @adamretter and I'm continuing to look myself as well.

adamretter commented 3 years ago

Hmm, that's a different problematic frame than before. I'm travelling at the moment, but I can try to take a look tomorrow.

marcdk commented 3 years ago

Again, thanks @adamretter! Safe travels.

marcdk commented 3 years ago

@adamretter, I've figured out the error of my ways! The built .jar file wasn't being used ... Thank you so much for your support!! Appreciate it a lot :)

adamretter commented 3 years ago

@marcdk okay great. Have you solved your issue now?

marcdk commented 3 years ago

Yes, thanks @adamretter

adamretter commented 3 years ago

@marcdk Out of interest, what was the problem?

marcdk commented 3 years ago

Hi @adamretter, the application was bundled with rocksdbjni-6.8.1.jar, and I had not replaced it with the s390x build ...
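
A quick way to catch this kind of mix-up is to check which native libraries the bundled jar actually contains. Below is a minimal sketch using only the JDK; the class name `NativeLibsInJar` is hypothetical, and the `librocksdbjni` prefix filter is an assumption based on the library names in the crash frames earlier in this thread.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class NativeLibsInJar {
    /**
     * Lists jar entries whose file name starts with "librocksdbjni".
     * The prefix is an assumption taken from the frames in this thread.
     */
    public static List<String> rocksdbNativeLibs(String jarPath) throws IOException {
        List<String> found = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                // Compare against the file name only, ignoring any directory part.
                String base = name.substring(name.lastIndexOf('/') + 1);
                if (base.startsWith("librocksdbjni")) {
                    found.add(name);
                }
            }
        }
        Collections.sort(found);
        return found;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java NativeLibsInJar <path-to-rocksdbjni.jar>");
            return;
        }
        // Print the JVM's architecture next to the bundled libraries for comparison.
        System.out.println("os.arch = " + System.getProperty("os.arch"));
        for (String lib : rocksdbNativeLibs(args[0])) {
            System.out.println(lib);
        }
    }
}
```

Run it against the jar that the application really loads, not the one that was just built, since the two can differ as they did here.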