apache / incubator-hugegraph

A graph database that supports 100+ billion records, with high performance and scalability (includes OLTP engine, REST API, and backends)
https://hugegraph.apache.org
Apache License 2.0

[Bug] tx leak when stopping server #2310

Open VGalaxies opened 11 months ago

VGalaxies commented 11 months ago

Bug Type

server status (startup/runtime exception)

Before submit

Related PRs:

Environment

Expected & Actual behavior

When I try to stop the server (by script or in IDEA), the following error occurs:

2023-09-14 14:34:00 [gremlin-server-stop] [WARN] o.a.t.g.s.GremlinServer - Exception while closing Graph instance [hugegraph]
java.lang.IllegalStateException: Ensure tx closed in all threads when closing graph 'hugegraph'
    at com.google.common.base.Preconditions.checkState(Preconditions.java:531) ~[guava-25.1-jre.jar:?]
    at org.apache.hugegraph.util.E.checkState(E.java:64) ~[hugegraph-common-1.0.0.jar:1.0.0]
    at org.apache.hugegraph.StandardHugeGraph.close(StandardHugeGraph.java:971) ~[hugegraph-core-1.0.0.jar:1.0.0]
    at org.apache.tinkerpop.gremlin.server.GremlinServer.lambda$null$7(GremlinServer.java:307) ~[gremlin-server-3.5.1.jar:3.5.1]
    at java.util.concurrent.ConcurrentHashMap$KeySetView.forEach(ConcurrentHashMap.java:4696) ~[?:?]
    at org.apache.tinkerpop.gremlin.server.GremlinServer.lambda$stop$8(GremlinServer.java:304) ~[gremlin-server-3.5.1.jar:3.5.1]
    at java.lang.Thread.run(Thread.java:829) [?:?]

This happens because the refs counter in TinkerPopTransaction is not back at zero when the graph is closed, so closed() returns false:

public boolean closed() {
    int refs = this.refs.get();
    assert refs >= 0 : refs;
    return refs == 0;
}

By logging the value of refs, I noticed that it depends on the script configured for org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin in conf/gremlin-server.yaml: with scripts/empty-sample.groovy, refs is 1; with scripts/example.groovy, refs is 2. 🤔
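
To make the failure mode concrete, here is a minimal sketch of how a reference-counted transaction can still report "not closed" at shutdown. RefCountedTx, open(), close() and refs() are hypothetical names used only for illustration; this is not HugeGraph's actual TinkerPopTransaction code. It assumes that each open increments the counter and each close decrements it, so any open without a matching close leaves refs above zero.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a reference-counted transaction.
// This is NOT HugeGraph's TinkerPopTransaction; it only mirrors the closed() check.
final class RefCountedTx {

    private final AtomicInteger refs = new AtomicInteger(0);

    // Assumption: every open adds one reference.
    void open() {
        this.refs.incrementAndGet();
    }

    // Assumption: every close drops one reference, never going below zero.
    void close() {
        this.refs.updateAndGet(r -> Math.max(0, r - 1));
    }

    int refs() {
        return this.refs.get();
    }

    // Same shape as the closed() method quoted in this issue.
    boolean closed() {
        int refs = this.refs.get();
        assert refs >= 0 : refs;
        return refs == 0;
    }

    public static void main(String[] args) {
        RefCountedTx tx = new RefCountedTx();
        tx.open();   // e.g. opened while a startup script runs
        tx.open();   // opened a second time
        tx.close();  // only one matching close

        // One reference is still held, so a shutdown check on closed() would fail.
        System.out.println("refs=" + tx.refs() + ", closed=" + tx.closed());
    }
}
```

In this sketch a single unmatched open() is enough to keep closed() returning false, which is the condition the "Ensure tx closed in all threads" check rejects. Whether the startup scripts are what leave the references open in HugeGraph is only a guess based on the refs values above.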

Vertex/Edge example

No response

Schema [VertexLabel, EdgeLabel, IndexLabel]

No response

javeme commented 11 months ago

This is a known issue, but currently there is no perfect solution. @zyxxoo can you take a look?