apache / incubator-hugegraph

A graph database that supports more than 100 billion data items, with high performance and scalability (includes OLTP engine, REST API, and backends)
https://hugegraph.apache.org
Apache License 2.0

A fatal error has been detected by the Java Runtime Environment #966

Open suesunss opened 4 years ago

suesunss commented 4 years ago

Expected behavior

HugeGraph reported the following error while running:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f5250a54955, pid=1300, tid=0x00007f5261d40700
#
# JRE version: OpenJDK Runtime Environment (8.0_242-b08) (build 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08)
# Java VM: OpenJDK 64-Bit Server VM (25.242-b08 mixed mode linux-amd64 )
# Problematic frame:
# C  [librocksdbjni5664892800863681445.so+0x414955]  rocksdb::CachableEntry<rocksdb::Block>::ReleaseCacheHandle(void*, void*)+0x5
#
# Core dump written. Default location: /runnable/hugegraph-0.10.4/core or core.1300
#
# An error report file with more information is saved as:
# /runnable/hugegraph-0.10.4/hs_err_pid1300.log
Compiled method (nm) 496844379 10997     n 0       org.rocksdb.RocksIterator::disposeInternal (native)
 total in heap  [0x00007f5a76b33450,0x00007f5a76b33790] = 832
 relocation     [0x00007f5a76b33578,0x00007f5a76b335c0] = 72
 main code      [0x00007f5a76b335c0,0x00007f5a76b33788] = 456
 oops           [0x00007f5a76b33788,0x00007f5a76b33790] = 8
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

The HugeGraph process stopped immediately afterwards.

I checked the logs. When this error was thrown, a data backup was running with HugeGraph-tools (1.4.0):

Server log:

com.baidu.hugegraph.server.RestServer [] - Graph [xxx] get vertex shards with split size '1048576'
com.baidu.hugegraph.server.RestServer [] - Graph [xxx] query vertices by shard(start: 0, end: 69273666, page: )
com.baidu.hugegraph.server.RestServer [] - Graph [xxx] query vertices by shard(start: 69273666, end: 138547332, page: )

Client log:

INFO: I/O exception (org.apache.http.NoHttpResponseException) caught when processing request to {}->http://host:port: The target server failed to respond

The page field appears to be empty in the server log; I am not sure whether that is related.

A core file and the corresponding error log appeared in the HugeGraph installation directory, and the crash looks related to RocksDB.

Specifications of environment

javeme commented 4 years ago

@suesunss Did you run a clear operation? If so, it probably triggered a RocksDB bug (which has since been fixed). For reference, see: https://github.com/hugegraph/hugegraph/issues/833

suesunss commented 4 years ago

@javeme It looks somewhat different. I did not run a clear operation; the crash happened while backing up with tools. The problematic frame is also different.

Mine:

# Problematic frame:
# C  [librocksdbjni5664892800863681445.so+0x414955]  rocksdb::CachableEntry<rocksdb::Block>::ReleaseCacheHandle(void*, void*)+0x5

From #833:

# Problematic frame:
# C  [librocksdbjni3759649349396566752.so+0x2f4a03]  rocksdb::MemTable::ApproximateMemoryUsage()+0xb3
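
Comparing the two frames above by eye is easy to get wrong, so here is a minimal Java sketch that pulls the library name, library offset, and native symbol out of a "Problematic frame" line. The class name `ProblematicFrame` and its fields are hypothetical helpers for illustration, and the regular expression only assumes the line layout visible in the two frames quoted in this thread.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of HugeGraph or RocksDB): parses one
// "Problematic frame" line from a JVM fatal error report.
class ProblematicFrame {
    // Assumed layout: "# C  [<library>+0x<offset>]  <symbol>+0x<offset>"
    private static final Pattern FRAME = Pattern.compile(
            "^#\\s+(\\S)\\s+\\[(.+?)\\+0x([0-9a-fA-F]+)\\]\\s+(.*?)\\+0x([0-9a-fA-F]+)$");

    public final String library;      // e.g. librocksdbjni...so
    public final long libraryOffset;  // offset of the crash inside the library
    public final String symbol;       // demangled native function

    private ProblematicFrame(String library, long libraryOffset, String symbol) {
        this.library = library;
        this.libraryOffset = libraryOffset;
        this.symbol = symbol;
    }

    // Returns an empty Optional when the line does not match the assumed layout.
    public static Optional<ProblematicFrame> parse(String line) {
        Matcher m = FRAME.matcher(line.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new ProblematicFrame(
                m.group(2), Long.parseLong(m.group(3), 16), m.group(4)));
    }

    public static void main(String[] args) {
        String mine = "# C  [librocksdbjni5664892800863681445.so+0x414955]  "
                + "rocksdb::CachableEntry<rocksdb::Block>::ReleaseCacheHandle(void*, void*)+0x5";
        String other = "# C  [librocksdbjni3759649349396566752.so+0x2f4a03]  "
                + "rocksdb::MemTable::ApproximateMemoryUsage()+0xb3";
        ProblematicFrame a = parse(mine).get();
        ProblematicFrame b = parse(other).get();
        System.out.println(a.symbol);
        System.out.println(b.symbol);
        System.out.println("same symbol: " + a.symbol.equals(b.symbol));
    }
}
```

Only the symbol is compared here because the library file names carry different numeric suffixes in the two reports, so comparing the library names directly would not be meaningful.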

javeme commented 4 years ago

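@suesunss The symptoms can differ somewhat (the error is an access through an invalid pointer, so how it shows up is fairly random), but the core dump basically always happens at disposeInternal time. See RocksDB issue 5982 for reference.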
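
As a general illustration of why dispose-time crashes tend to involve invalid pointers, the sketch below shows one common defensive pattern for wrappers around native resources: the owner tracks its open iterators, closes them before releasing itself, and makes `close()` idempotent. This is an assumption about the general class of problem, not a description of the RocksDB bug or of how HugeGraph or RocksDB fixed it. `NativeStore` and `StoreIterator` are hypothetical plain-Java stand-ins and do not use the RocksDB API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a store that owns native memory.
class NativeStore implements AutoCloseable {
    private final List<StoreIterator> openIterators = new ArrayList<>();
    private boolean closed = false;

    synchronized StoreIterator newIterator() {
        if (closed) {
            throw new IllegalStateException("store is closed");
        }
        StoreIterator it = new StoreIterator(this);
        openIterators.add(it);
        return it;
    }

    // Called by an iterator once it has released its own resources.
    synchronized void release(StoreIterator it) {
        openIterators.remove(it);
    }

    @Override
    public synchronized void close() {
        if (closed) {
            return; // idempotent: a second close is a no-op
        }
        // Dispose children first, so no iterator outlives its owner.
        // Iterate over a copy because close() removes from the list.
        for (StoreIterator it : new ArrayList<>(openIterators)) {
            it.close();
        }
        closed = true;
    }
}

// Hypothetical stand-in for an iterator backed by a native handle.
class StoreIterator implements AutoCloseable {
    private final NativeStore owner;
    private boolean disposed = false;

    StoreIterator(NativeStore owner) {
        this.owner = owner;
    }

    boolean isDisposed() {
        synchronized (owner) {
            return disposed;
        }
    }

    @Override
    public void close() {
        synchronized (owner) {
            if (disposed) {
                return; // never release the (simulated) handle twice
            }
            disposed = true;
            owner.release(this);
        }
    }
}

class LifecycleDemo {
    public static void main(String[] args) {
        NativeStore store = new NativeStore();
        StoreIterator it = store.newIterator();
        store.close(); // closes the iterator before the store itself
        it.close();    // safe: already disposed
        System.out.println("iterator disposed: " + it.isDisposed());
    }
}
```

In real code the same ordering is usually expressed with try-with-resources, declaring the store first and the iterator second so that the iterator is closed first.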