apache / incubator-hugegraph

A graph database that supports over 100 billion data records, with high performance and scalability (includes OLTP Engine, REST API, and Backends)
https://hugegraph.apache.org
Apache License 2.0
2.65k stars 518 forks

[Bug] hugegraph worker fail to sync vertex and edge schemas from master #2311

Open dongma opened 1 year ago

dongma commented 1 year ago

Bug Type

logic (logic design issue)

Before submit

Environment

HugeGraph cluster configuration (RocksDB backend, master/worker mode implemented with Raft), 3 nodes in total. 1) master node, core settings in rest-server.properties:

restserver.url=http://0.0.0.0:8080
gremlinserver.url=http://0.0.0.0:8182
# rpc client configs (like enable to keep cache consistency)
rpc.remote_url=master:8091,worker_1:8091,worker_2:8091
rpc.client_connect_timeout=20
# raft group initial peers
raft.group_peers=master:8091,worker_1:8091,worker_2:8091
# lightweight load balancing (beta)
server.id=server-46
server.role=master

hugegraph.properties settings:

backend=rocksdb
serializer=binary
store=hugegraph
raft.mode=true

2) worker_1 node, core settings in rest-server.properties:

restserver.url=http://0.0.0.0:8080
gremlinserver.url=http://0.0.0.0:8186
rpc.server_host=worker_1
rpc.server_port=8091
rpc.server_timeout=60
# rpc client configs (like enable to keep cache consistency)
rpc.remote_url=master:8091,worker_1:8091,worker_2:8091
# raft group initial peers
raft.group_peers=master:8091,worker_1:8091,worker_2:8091
server.id=worker-45
server.role=worker

hugegraph.properties settings:

backend=rocksdb
serializer=binary
store=hugegraph
raft.mode=true

3) worker_2 node, core settings in rest-server.properties:

restserver.url=http://0.0.0.0:8089
gremlinserver.url=http://0.0.0.0:8182
# rpc server configs for multi graph-servers or raft-servers
rpc.server_host=worker_2
rpc.server_port=8091
rpc.server_timeout=60
# rpc client configs (like enable to keep cache consistency)
rpc.remote_url=master:8091,worker_1:8091,worker_2:8091
raft.group_peers=master:8091,worker_1:8091,worker_2:8091
# lightweight load balancing (beta)
server.id=worker-44
server.role=worker

hugegraph.properties settings:

backend=rocksdb
serializer=binary
store=hugegraph
raft.mode=true
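
The three excerpts differ in small ways (for instance, the master excerpt shows no rpc.server_host or rpc.server_port). A minimal sketch of a consistency check, assuming each node's own RPC address is expected to appear in raft.group_peers; that expectation and the helper names `parse_properties` and `check_node` are illustrative assumptions, not part of HugeGraph. Because the issue lists only "core" settings, a missing key may simply have been omitted from the excerpt.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#'/'!' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_node(name, props):
    """Return findings about this node's RPC address versus the Raft peer list."""
    peers = [p.strip() for p in props.get("raft.group_peers", "").split(",") if p.strip()]
    host, port = props.get("rpc.server_host"), props.get("rpc.server_port")
    if host is None or port is None:
        return [f"{name}: rpc.server_host/rpc.server_port not shown"]
    if f"{host}:{port}" not in peers:
        return [f"{name}: {host}:{port} missing from raft.group_peers"]
    return []


# Settings copied from the excerpts above (comment lines omitted).
MASTER = """
rpc.remote_url=master:8091,worker_1:8091,worker_2:8091
raft.group_peers=master:8091,worker_1:8091,worker_2:8091
server.id=server-46
server.role=master
"""
WORKER_1 = """
rpc.server_host=worker_1
rpc.server_port=8091
rpc.remote_url=master:8091,worker_1:8091,worker_2:8091
raft.group_peers=master:8091,worker_1:8091,worker_2:8091
server.id=worker-45
server.role=worker
"""

nodes = {"master": parse_properties(MASTER), "worker_1": parse_properties(WORKER_1)}
findings = [f for name, props in nodes.items() for f in check_node(name, props)]
same_peers = len({props.get("raft.group_peers") for props in nodes.values()}) == 1

for finding in findings:
    print(finding)
print("peer lists identical:", same_peers)
```

Running the same check against the complete files on each node, rather than these excerpts, would show whether the gap on the master is real or only an artifact of the excerpt.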

Expected & Actual behavior

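The HugeGraph cluster is built on RocksDB with one master and two workers. After startup, the nodes begin communicating and syncing data. The master is a pre-existing node and the 2 workers are newly added; the master already holds a graph instance and data.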

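Expected:
1) The graph instance, its data, and its schema on the master are all synced to the worker nodes.
2) Data on the master can be queried from a worker just as it can from the master.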

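Actual (items 2 and 3 are the problems):
1) The hugegraph graph instance on each worker was created by copying hugegraph.properties from the master and then running ./bin/init-store.sh on both workers.
2) The RocksDB data appears to have been synced: the hugegraph data directory is present under the RocksDB data folder. Queries on the workers fail, however. For example, a simple g.V() in Hubble returns no data, and the worker's hugegraph-server.log contains errors.
3) The graph schema on the two workers was not synced from the master.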
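
Item 3 can be confirmed by comparing the property keys each server reports. A minimal sketch, assuming the schema is exposed at `/graphs/{graph}/schema/propertykeys` and returns a JSON object with a `propertykeys` list whose entries carry `id` and `name`; the payloads and the names `example_a` and `example_b` below are hypothetical stand-ins, not data from this cluster.

```python
import json
import urllib.request


def fetch_property_keys(base_url, graph="hugegraph"):
    """Fetch property keys from one server (endpoint shape is an assumption)."""
    url = f"{base_url}/graphs/{graph}/schema/propertykeys"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def index_by_id(payload):
    """Map property-key id -> name from a {'propertykeys': [...]} payload."""
    return {pk["id"]: pk["name"] for pk in payload.get("propertykeys", [])}


def diff_schema(master_payload, worker_payload):
    """Return (ids missing on the worker, ids whose names differ)."""
    master = index_by_id(master_payload)
    worker = index_by_id(worker_payload)
    missing = {i: n for i, n in master.items() if i not in worker}
    renamed = {i: (master[i], worker[i])
               for i in master.keys() & worker.keys()
               if master[i] != worker[i]}
    return missing, renamed


# Hypothetical payloads standing in for real responses.
master_payload = {"propertykeys": [{"id": 7, "name": "example_a"},
                                   {"id": 8, "name": "example_b"}]}
worker_payload = {"propertykeys": []}

missing, renamed = diff_schema(master_payload, worker_payload)
print("missing on worker:", missing)
print("renamed on worker:", renamed)
```

Against a live cluster the payloads would come from calls such as `fetch_property_keys("http://master:8080")` and `fetch_property_keys("http://worker_1:8080")`; the host names and ports are taken from the configuration above, and authentication, if enabled, is not handled in this sketch.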

Full error stack from querying data on a worker:

2023-09-14 11:44:41 [gremlin-server-exec-2] [ERROR] o.a.h.b.t.AbstractTransaction - Failed to parse entry: 0x92393a3133313032343139393030383136363636: [�9:13102419900816666=
       131024199008166666否7张六8 0b269d0c8537c52bde454b556e3f39d2913102419900816666:��Ƕ�P;<��㴹P=h-实体人表2-日期>张六 E00572727]
java.lang.IllegalArgumentException: Undefined property key with id: '8'
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:163) ~[guava-25.1-jre.jar:?]
        at org.apache.hugegraph.util.E.checkArgument(E.java:52) ~[hugegraph-common-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.StandardHugeGraph.propertyKey(StandardHugeGraph.java:735) ~[hugegraph-core-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.backend.serializer.BinarySerializer.parseProperty(BinarySerializer.java:200) ~[hugegraph-core-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.backend.serializer.BinarySerializer.parseProperties(BinarySerializer.java:235) ~[hugegraph-core-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.backend.serializer.BinarySerializer.parseVertex(BinarySerializer.java:308) ~[hugegraph-core-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.backend.serializer.BinarySerializer.readVertex(BinarySerializer.java:477) ~[hugegraph-core-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.backend.tx.GraphTransaction.parseEntry(GraphTransaction.java:1917) ~[hugegraph-core-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.iterator.MapperIterator.fetch(MapperIterator.java:42) ~[hugegraph-common-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.iterator.WrappedIterator.hasNext(WrappedIterator.java:38) ~[hugegraph-common-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.iterator.FilterIterator.fetch(FilterIterator.java:40) ~[hugegraph-common-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.iterator.WrappedIterator.hasNext(WrappedIterator.java:38) ~[hugegraph-common-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.iterator.FilterIterator.fetch(FilterIterator.java:40) ~[hugegraph-common-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.iterator.WrappedIterator.hasNext(WrappedIterator.java:38) ~[hugegraph-common-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.iterator.LimitIterator.fetch(LimitIterator.java:40) ~[hugegraph-common-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.iterator.WrappedIterator.hasNext(WrappedIterator.java:38) ~[hugegraph-common-1.0.0.jar:1.0.0]
        at org.apache.tinkerpop.gremlin.process.traversal.step.map.GraphStep.processNextStart(GraphStep.java:149) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:150) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:55) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.process.traversal.step.filter.FilterStep.processNextStart(FilterStep.java:37) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:150) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:222) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils.fill(IteratorUtils.java:62) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils.list(IteratorUtils.java:85) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils.asList(IteratorUtils.java:382) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.server.handler.HttpGremlinEndpointHandler.lambda$channelRead$1(HttpGremlinEndpointHandler.java:221) ~[gremlin-server-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.util.function.FunctionUtils.lambda$wrapFunction$0(FunctionUtils.java:36) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:278) ~[gremlin-groovy-3.5.1.jar:3.5.1]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_181]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_181]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_181]
        at org.apache.hugegraph.auth.HugeGraphAuthProxy$ContextTask.run(HugeGraphAuthProxy.java:1860) ~[hugegraph-api-1.0.0.jar:1.0.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_181]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_181]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
2023-09-14 11:44:41 [gremlin-server-exec-2] [ERROR] o.a.h.b.t.AbstractTransaction - Failed to parse entry: 0x92393a3133313032343139393030383139393939: [�9:13102419900819999=
       131024199008199996否7张九8 f39de30ce0beb713902c3554d35e5f58913102419900819999:��Â� ;<��߀� =h-实体人表2-日期>张九 E00572727]
java.lang.IllegalArgumentException: Undefined property key with id: '8'
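
The trace passes through BinarySerializer.parseProperty and StandardHugeGraph.propertyKey, which suggests that stored vertices reference property keys by numeric id and that the id is resolved against the local schema at read time. A toy model of that lookup, not HugeGraph's code: `SchemaRegistry`, `resolve_properties`, the property names, and the worker registry that lacks id 8 are all hypothetical.

```python
class SchemaRegistry:
    """Toy stand-in for a node's local schema: property-key id -> name."""

    def __init__(self, property_keys=None):
        self._by_id = dict(property_keys or {})

    def property_key(self, key_id):
        if key_id not in self._by_id:
            raise ValueError(f"Undefined property key with id: '{key_id}'")
        return self._by_id[key_id]


def resolve_properties(stored, schema):
    """Turn stored (id, value) pairs into a name -> value mapping."""
    return {schema.property_key(key_id): value for key_id, value in stored}


# A stored record that references property keys 7 and 8 by id only.
stored_vertex = [(7, "value-7"), (8, "value-8")]

master_schema = SchemaRegistry({7: "prop_seven", 8: "prop_eight"})
worker_schema = SchemaRegistry({7: "prop_seven"})  # hypothetical: id 8 absent

resolved = resolve_properties(stored_vertex, master_schema)
print("master:", resolved)

try:
    resolve_properties(stored_vertex, worker_schema)
    worker_error = None
except ValueError as err:
    worker_error = str(err)
print("worker:", worker_error)
```

Under this model the replicated bytes are readable only on a node whose schema holds the same ids, which is consistent with problem 3 in the report.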

dongma commented 1 year ago

After writing data on a worker node, querying on the master node also fails with the following error, again a parse failure:

2023-09-14 17:37:59 [gremlin-server-exec-2] [ERROR] o.a.h.b.t.AbstractTransaction - Failed to parse entry: 0x93313a313533353431313839383138313831383235: [�1:153541189818181825153541189818181825 钱书芹女153541189818181825]
java.lang.IllegalArgumentException: Unexpected varint -1840483783 with too many bytes(6)
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:163) ~[guava-25.1-jre.jar:?]
        at org.apache.hugegraph.util.E.checkArgument(E.java:52) ~[hugegraph-common-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.backend.serializer.BytesBuffer.readVInt(BytesBuffer.java:453) ~[hugegraph-core-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.backend.serializer.BinarySerializer.parseProperties(BinarySerializer.java:234) ~[hugegraph-core-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.backend.serializer.BinarySerializer.parseVertex(BinarySerializer.java:308) ~[hugegraph-core-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.backend.serializer.BinarySerializer.readVertex(BinarySerializer.java:477) ~[hugegraph-core-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.backend.tx.GraphTransaction.parseEntry(GraphTransaction.java:1917) ~[hugegraph-core-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.iterator.MapperIterator.fetch(MapperIterator.java:42) ~[hugegraph-common-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.iterator.WrappedIterator.hasNext(WrappedIterator.java:38) ~[hugegraph-common-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.iterator.FilterIterator.fetch(FilterIterator.java:40) ~[hugegraph-common-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.iterator.WrappedIterator.hasNext(WrappedIterator.java:38) ~[hugegraph-common-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.iterator.FilterIterator.fetch(FilterIterator.java:40) ~[hugegraph-common-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.iterator.WrappedIterator.hasNext(WrappedIterator.java:38) ~[hugegraph-common-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.iterator.LimitIterator.fetch(LimitIterator.java:40) ~[hugegraph-common-1.0.0.jar:1.0.0]
        at org.apache.hugegraph.iterator.WrappedIterator.hasNext(WrappedIterator.java:38) ~[hugegraph-common-1.0.0.jar:1.0.0]
        at org.apache.tinkerpop.gremlin.process.traversal.step.map.GraphStep.processNextStart(GraphStep.java:149) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:150) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:55) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.process.traversal.step.filter.FilterStep.processNextStart(FilterStep.java:37) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:150) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:222) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils.fill(IteratorUtils.java:62) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils.list(IteratorUtils.java:85) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils.asList(IteratorUtils.java:382) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.server.handler.HttpGremlinEndpointHandler.lambda$channelRead$1(HttpGremlinEndpointHandler.java:221) ~[gremlin-server-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.util.function.FunctionUtils.lambda$wrapFunction$0(FunctionUtils.java:36) ~[gremlin-core-3.5.1.jar:3.5.1]
        at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:278) ~[gremlin-groovy-3.5.1.jar:3.5.1]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_181]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_181]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_181]
        at org.apache.hugegraph.auth.HugeGraphAuthProxy$ContextTask.run(HugeGraphAuthProxy.java:1860) ~[hugegraph-api-1.0.0.jar:1.0.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_181]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_181]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
2023-09-14 17:37:59 [gremlin-server-exec-2] [ERROR] o.a.h.b.t.AbstractTransaction - Failed to parse entry: 0x93313a313533353431313839383138313831383236: [�1:153541189818181826153541189818181826 孙幼蓉男153541189818181826]
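
One possible reading of "Unexpected varint ... with too many bytes", offered as an assumption rather than a confirmed diagnosis: the reader expects a length or id at a position where the writer stored something else, such as UTF-8 text. The sketch below is a generic base-128 varint decoder with a 5-byte limit, not HugeGraph's BytesBuffer.readVInt. It shows that multi-byte UTF-8 characters, whose bytes all have the high bit set, never terminate such a varint.

```python
def read_varint(data, offset=0, max_bytes=5):
    """Decode a base-128 varint; a set high bit means another byte follows."""
    value = 0
    for count in range(1, max_bytes + 1):
        byte = data[offset + count - 1]
        value = (value << 7) | (byte & 0x7F)
        if byte & 0x80 == 0:
            return value, count
    raise ValueError(f"varint longer than {max_bytes} bytes")


# Well-formed inputs decode normally.
small = read_varint(bytes([0x08]))
two_bytes = read_varint(bytes([0x81, 0x00]))
print(small, two_bytes)

# A name taken from the log entry above, encoded as UTF-8.
text_bytes = "钱书芹".encode("utf-8")
all_high_bit = all(b & 0x80 for b in text_bytes)

try:
    read_varint(text_bytes)
    misread_error = None
except ValueError as err:
    misread_error = str(err)
print(all_high_bit, misread_error)
```

If the two nodes disagree on the schema, and therefore on how a record's fields are laid out, a misaligned read of this kind is one way the error could arise; the source does not confirm this.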