apache / incubator-hugegraph

A graph database that supports more than 100+ billion data, high performance and scalability (Include OLTP Engine & REST-API & Backends)
https://hugegraph.apache.org
Apache License 2.0

Data import fails with "Read timed out" and "Failed to do request" #885

Closed broken-blade closed 3 years ago

broken-blade commented 4 years ago

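With RocksDB as the storage backend, I am importing graph data programmatically. After roughly 8 million vertices and 27 million edges had been imported, the following error was raised: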

> Exception in thread "main" com.baidu.hugegraph.rest.ClientException: Failed to do request
        at com.baidu.hugegraph.rest.RestClient.request(RestClient.java:92)
        at com.baidu.hugegraph.rest.RestClient.get(RestClient.java:197)
        at com.baidu.hugegraph.api.graph.VertexAPI.get(VertexAPI.java:83)
        at com.baidu.hugegraph.driver.GraphManager.getVertex(GraphManager.java:75)
        at com.znv.simulate.insert.HugeGraphInsertion.getOrCreate(HugeGraphInsertion.java:38)
        at com.znv.simulate.insert.HugeGraphInsertion.getOrCreate(HugeGraphInsertion.java:16)
        at com.znv.simulate.insert.InsertionBase.createGraph(InsertionBase.java:57)
        at com.znv.simulate.graphdatabase.HugeGraphDatabase.massiveModeLoading(HugeGraphDatabase.java:36)
        at com.znv.simulate.benchmark.MassiveInsertionBenchmark.benchmarkOne(MassiveInsertionBenchmark.java:26)
        at com.znv.simulate.benchmark.PermutingBenchmarkBase.startBenchmarkInternalOne(PermutingBenchmarkBase.java:26)
        at com.znv.simulate.benchmark.PermutingBenchmarkBase.startBenchmarkInternal(PermutingBenchmarkBase.java:20)
        at com.znv.simulate.benchmark.BenchmarkBase.startBenchmark(BenchmarkBase.java:14)
        at com.znv.simulate.main.GraphDatabaseBenchmark.runBenchmark(GraphDatabaseBenchmark.java:59)
        at com.znv.simulate.main.GraphDatabaseBenchmark.run(GraphDatabaseBenchmark.java:54)
        at com.znv.simulate.main.GraphDatabaseBenchmark.main(GraphDatabaseBenchmark.java:66)

> Caused by: javax.ws.rs.ProcessingException: java.net.SocketTimeoutException: Read timed out
        at org.glassfish.jersey.apache.connector.ApacheConnector.apply(ApacheConnector.java:481)
        at org.glassfish.jersey.client.ClientRuntime.invoke(ClientRuntime.java:255)
        at org.glassfish.jersey.client.JerseyInvocation$1.call(JerseyInvocation.java:684)
        at org.glassfish.jersey.client.JerseyInvocation$1.call(JerseyInvocation.java:681)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:228)
        at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:444)
        at org.glassfish.jersey.client.JerseyInvocation.invoke(JerseyInvocation.java:681)
        at org.glassfish.jersey.client.JerseyInvocation$Builder.method(JerseyInvocation.java:411)
        at org.glassfish.jersey.client.JerseyInvocation$Builder.get(JerseyInvocation.java:311)
        at com.baidu.hugegraph.rest.RestClient.lambda$get$4(RestClient.java:198)
        at com.baidu.hugegraph.rest.RestClient.request(RestClient.java:90)
        ... 14 more

> Caused by: java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
        at java.net.SocketInputStream.read(SocketInputStream.java:171)
        at java.net.SocketInputStream.read(SocketInputStream.java:141)
        at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:139)
        at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:155)
        at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:284)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
        at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
        at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:165)
        at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:167)
        at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
        at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271)
        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
        at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
        at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:71)
        at org.glassfish.jersey.apache.connector.ApacheConnector.apply(ApacheConnector.java:435)
        ... 26 more
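1. Following #431, I added the batch.max_write_ratio option to rest-server.properties and tried values of 80 and then 90. After re-running the import, the same error occurred again at roughly the data volume given above.

2. The import method is described in #869.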

Specifications of environment

javeme commented 4 years ago

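@userhtm Are you importing the data directly with the client? The timeout is possibly caused by slow responses under heavy I/O pressure on a mechanical disk. You can catch the SocketTimeoutException and retry; choose the retry interval to suit your situation, for example 1s or 10s.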
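
The catch-and-retry idea can be sketched in plain Java. This is a minimal sketch, not part of the HugeGraph client: the class name `TimeoutRetry`, its method names, and the fixed wait between attempts are all assumptions made for illustration. It inspects the cause chain because, in the stack trace above, the SocketTimeoutException arrives wrapped inside a ClientException.

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

// Hypothetical helper, not a HugeGraph API.
public class TimeoutRetry {

    /** Returns true if t, or any exception in its cause chain, is a SocketTimeoutException. */
    static boolean isReadTimeout(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SocketTimeoutException) {
                return true;
            }
        }
        return false;
    }

    /**
     * Runs the action, retrying only when it fails with a socket timeout.
     * Any other failure, or a timeout on the last attempt, is rethrown.
     */
    static <T> T callWithRetry(Callable<T> action, int maxAttempts, long waitMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                if (!isReadTimeout(e) || attempt >= maxAttempts) {
                    throw e;
                }
                // Give the server time to recover before trying again.
                Thread.sleep(waitMillis);
            }
        }
    }
}
```

A call such as `TimeoutRetry.callWithRetry(() -> graphManager.getVertex(id), 5, 1000L)` would retry the read up to five times with a one-second pause, where `graphManager` and `id` stand in for the caller's own objects. Retrying a read by ID is harmless; retrying a batch write may resend data the server already stored, so whether that is acceptable depends on the import logic.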

broken-blade commented 4 years ago

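> @userhtm Are you importing the data directly with the client? The timeout is possibly caused by slow responses under heavy I/O pressure on a mechanical disk. You can catch the SocketTimeoutException and retry; choose the retry interval to suit your situation, for example 1s or 10s.

@javeme Yes, I am importing the data directly with the client. Is there an existing API for this timeout retry, or do I need to write the retry strategy myself?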

CPJ-data commented 4 years ago

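@wuzigod Hello, expert

1. We are currently seeing a very slow query in HugeGraph. Does your program include any query logic? If so, how do you do it, and could you share some code for reference? On our side, the program runs its queries using Gremlin syntax.

2. I also hit the problem you reported when running my workflow. Our logic continuously queries vertex and edge data and then computes paths; after about ten minutes the error message you posted appears. The most data stored to the database at any one time is under 30,000 records, out of more than 4 million in total.

If you have a good solution, please share it. Thank you very much.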

broken-blade commented 4 years ago

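@CPJ-data Hello, I only started using HugeGraph recently and am no expert..

> 1. We are currently seeing a very slow query in HugeGraph. Does your program include any query logic? If so, how do you do it, and could you share some code for reference? On our side, the program runs its queries using Gremlin syntax.

My program has only one read operation, fetching a vertex by ID; everything else is a write. I posted the code in #869.

> 2. I also hit the problem you reported when running my workflow. Our logic continuously queries vertex and edge data and then computes paths; after about ten minutes the error message you posted appears. The most data stored to the database at any one time is under 30,000 records, out of more than 4 million in total.

As for this error, I increased the HugeClient timeout, and the program now runs without errors for the time being. You could also try the approach @javeme described above.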

CPJ-data commented 4 years ago

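@wuzigod Hello, I am also new to HugeGraph.

> My program has only one read operation, fetching a vertex by ID; everything else is a write. I posted the code in #869.

I also query by a single vertex ID, using gremlinManager.gremlin("g.V().hasLabel(" + "\"" + startEntity.getLabel().label + "\"" + ").has(" + "\"" + PropertyKey.eid + "\"" + "," + "\"" + startEntity.getEid() + "\"" + ")").execute(). The query takes fairly long and does not meet our requirements. How long do your queries take?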
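
The concatenated query string above is hard to read and breaks if a value contains a double quote. A small helper can assemble the same query text. The sketch below is only an illustration under stated assumptions: `GremlinQuery`, `quote`, and `byLabelAndProperty` are hypothetical names, not HugeGraph APIs, and the helper merely builds the string that would be passed to gremlinManager.gremlin(...). It does not address the query latency itself.

```java
// Hypothetical helper, not a HugeGraph API.
public class GremlinQuery {

    /**
     * Wraps a value in double quotes for a Gremlin-Groovy script, escaping
     * backslashes, double quotes, and '$' (which Groovy would otherwise
     * treat as the start of an interpolation in a double-quoted string).
     */
    static String quote(String s) {
        String escaped = s.replace("\\", "\\\\")
                          .replace("\"", "\\\"")
                          .replace("$", "\\$");
        return "\"" + escaped + "\"";
    }

    /** Builds g.V().hasLabel("<label>").has("<key>","<value>"). */
    static String byLabelAndProperty(String label, String key, String value) {
        return "g.V().hasLabel(" + quote(label) + ")"
             + ".has(" + quote(key) + "," + quote(value) + ")";
    }
}
```

With this helper, the call in the comment above could be written as gremlinManager.gremlin(GremlinQuery.byLabelAndProperty(label, key, eid)).execute(), where `label`, `key`, and `eid` are the caller's own string values.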

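> As for this error, I increased the HugeClient timeout, and the program now runs without errors for the time being. You could also try the approach @javeme described above.

Regarding adjusting the HugeClient timeout: from the source code I see it is set to 20. Does this HugeClient timeout need to be adjusted on the HugeServer side? Is it in the hugegraph.properties file under the conf folder? I am using the Cassandra storage backend; is it this #cassandra.read_timeout=20 setting that should be changed? None of the other timeout settings I looked at seem to fit... Could you explain how to change this timeout?

Thank you very much for your answer.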

broken-blade commented 4 years ago

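@CPJ-data 1. I have not measured vertex fetch latency specifically, but fetching a vertex by ID should not take long.

2. HugeClient hugeClient = new HugeClient("http://127.0.0.1:8080", "hugegraph", 60); The third argument is the timeout, in seconds.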
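
To illustrate what a read timeout governs, the following standard-library sketch reproduces one against a local server that accepts connections at the TCP level but never replies. `ReadTimeoutDemo` and `readTimesOut` are hypothetical names chosen for this illustration, and no HugeGraph classes are involved. A slow server looks the same from the client's point of view: if no response arrives within the timeout, the read fails with SocketTimeoutException, so a larger timeout simply gives the server more time to answer.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; it does not use the HugeGraph client.
public class ReadTimeoutDemo {

    /**
     * Connects to a local server socket that never sends a byte and tries
     * to read with the given timeout (in milliseconds, must be > 0 because
     * 0 means "wait forever"). Returns true if the read timed out.
     */
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        try (ServerSocket server =
                     new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client =
                     new Socket(server.getInetAddress(), server.getLocalPort())) {
            client.setSoTimeout(timeoutMillis);
            try {
                // Blocks until data arrives or the timeout expires.
                client.getInputStream().read();
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        }
    }
}
```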

tmljob commented 4 years ago

The same problem occurs when inserting data through the batch API on HBase. How can it be solved? Any guidance would be appreciated.

com.baidu.hugegraph.rest.ClientException: Failed to do request
        at com.baidu.hugegraph.rest.RestClient.request(RestClient.java:128)
        at com.baidu.hugegraph.rest.RestClient.post(RestClient.java:151)
        at com.baidu.hugegraph.rest.RestClient.post(RestClient.java:138)
        at com.baidu.hugegraph.api.graph.VertexAPI.create(VertexAPI.java:56)
        at com.baidu.hugegraph.driver.GraphManager.addVertices(GraphManager.java:83)

        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.lang.Thread.run(Thread.java:748)
Caused by: javax.ws.rs.ProcessingException: java.net.SocketTimeoutException: Read timed out
        at org.glassfish.jersey.apache.connector.ApacheConnector.apply(ApacheConnector.java:496)
        at org.glassfish.jersey.client.ClientRuntime.invoke(ClientRuntime.java:278)
        at org.glassfish.jersey.client.JerseyInvocation.lambda$invoke$0(JerseyInvocation.java:753)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:316)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:298)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:229)
        at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:414)
        at org.glassfish.jersey.client.JerseyInvocation.invoke(JerseyInvocation.java:752)
        at org.glassfish.jersey.client.JerseyInvocation$Builder.method(JerseyInvocation.java:445)
        at org.glassfish.jersey.client.JerseyInvocation$Builder.post(JerseyInvocation.java:351)
        at com.baidu.hugegraph.rest.RestClient.lambda$post$1(RestClient.java:153)
        at com.baidu.hugegraph.rest.RestClient.request(RestClient.java:126)
        ... 13 common frames omitted
Caused by: java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
        at java.net.SocketInputStream.read(SocketInputStream.java:171)
        at java.net.SocketInputStream.read(SocketInputStream.java:141)
        at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
        at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
        at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
        at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
        at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
        at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
        at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
        at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
        at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
        at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72)
        at org.glassfish.jersey.apache.connector.ApacheConnector.apply(ApacheConnector.java:450)
        ... 24 common frames omitted

github-actions[bot] commented 3 years ago

Due to the lack of activity, the current issue is marked as stale and will be closed after 20 days, any update will remove the stale label