Some more debugging insights: we have isolated the problem to the Jest client. Queries that take longer than 3 seconds are being timed out by the Jest client within GraphAware.
Is there any way to configure the Jest client through the Neo4j config?
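For context, here is a minimal sketch of how those timeouts are set through the standard Jest API (io.searchbox). Jest's connect/read timeouts default to 3000 ms, which lines up with the ~3 s cutoff we are seeing; the URL and timeout values below are placeholders.

```java
import io.searchbox.client.JestClient;
import io.searchbox.client.JestClientFactory;
import io.searchbox.client.config.HttpClientConfig;

public class JestTimeoutSketch {
    public static void main(String[] args) throws Exception {
        JestClientFactory factory = new JestClientFactory();
        factory.setHttpClientConfig(new HttpClientConfig.Builder("http://localhost:9200")
                .multiThreaded(true)
                .connTimeout(10_000)   // connection timeout in ms (Jest default: 3000)
                .readTimeout(10_000)   // socket read timeout in ms (Jest default: 3000)
                .build());
        JestClient client = factory.getObject();
        // ... run searches that may take longer than 3 s ...
        client.shutdownClient();
    }
}
```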
Thanks for the report @manaschoksi. While I fundamentally advise querying Elasticsearch directly, this is definitely a setting that can be shipped in one of the next versions.
Hello @ikwattro, can I open a pull request with a fix that moves the timeouts to configuration?
@itgold It would be a pleasure :)
Make sure to follow the flow in https://github.com/graphaware/neo4j-to-elasticsearch/blob/master/src/main/java/com/graphaware/module/es/ElasticSearchConfiguration.java and https://github.com/graphaware/neo4j-to-elasticsearch/blob/master/src/main/java/com/graphaware/module/es/ElasticSearchModuleBootstrapper.java, and once you open the PR you will be asked to electronically sign a document.
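Roughly, the flow looks like this. A sketch only: the setting key, holder class, and "with..." setter below are illustrative assumptions, not the plugin's actual API.

```java
import java.util.Map;

// Sketch of the bootstrapper pattern: read an optional key from neo4j.conf
// and thread it into the immutable module configuration.
public class TimeoutBootstrapSketch {

    private static final String READ_TIMEOUT = "readTimeout"; // hypothetical setting key

    static Config configure(Map<String, String> settings, Config configuration) {
        String value = settings.get(READ_TIMEOUT);
        if (value != null) {
            // each "with..." call returns a new instance, so reassignment is required
            configuration = configuration.withReadTimeout(Integer.parseInt(value));
        }
        return configuration;
    }

    // Minimal immutable configuration holder, standing in for ElasticSearchConfiguration.
    static final class Config {
        final int readTimeout;
        Config(int readTimeout) { this.readTimeout = readTimeout; }
        Config withReadTimeout(int readTimeout) { return new Config(readTimeout); }
    }
}
```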
Thanks !
Thank you, @ikwattro I'm on it!
done. Thank you!
Thank you @itgold @ikwattro !! That was fast.
@manaschoksi @itgold Released a new version with your contribution inside; can you please validate that it fixes your issue? https://products.graphaware.com/download/neo4j-to-elasticsearch/graphaware-neo4j-to-elasticsearch-3.4.0.52.10.jar
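Once on the new version, the timeouts would sit next to the other module settings in neo4j.conf, something like the following. The timeout key names here are assumptions following the usual GraphAware `com.graphaware.module.<moduleId>.<key>` pattern; verify the actual ones against ElasticSearchConfiguration.

```
# neo4j.conf -- illustrative only; timeout key names assumed,
# check ElasticSearchConfiguration for the actual settings
com.graphaware.module.ES.connTimeout=10000
com.graphaware.module.ES.readTimeout=10000
```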
@ikwattro Thanks for the release. Our Neo4j is not yet on 3.4; we will validate once we are there. For now, we have upgraded to version 3.2.6.51.7, which raises the hardcoded Jest timeout from 3s to 10s.
Once we feel confident with the fix, let's mark this issue as resolved.
@manaschoksi Great! I recommend upgrading to at least 3.3.3.52.8 (ideally the latest) as this version brings a major performance improvement to the replication part (13x faster). Commit reference: https://github.com/graphaware/neo4j-to-elasticsearch/commit/f3b81cc7ae1f2edd61a8866cb85e1af4961726ed
Recent versions also bring the ability to drive replication with Cypher query logic.
Cheers
Hi @manaschoksi
Did you have a chance to upgrade to 3.4?
Hello @ikwattro
Sorry for the delayed response, but we are finally moving to Neo4j 3.4. I just tested this, and it seems the timeout config gets overwritten when the mapping config is set in the Bootstrapper; the sketch below illustrates the pattern.
I have raised the following pull request: https://github.com/graphaware/neo4j-to-elasticsearch/pull/140
It would be great if you could review it and help us release this fix. I have also just signed the CLA.
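To make the overwrite concrete, here is a self-contained sketch of that kind of bug. All class and method names are hypothetical; it only demonstrates the immutable "with..."-config pattern, where rebuilding from the defaults when the mapping is applied silently discards a previously applied timeout.

```java
public class ConfigOverwriteSketch {

    // Minimal immutable config: every "with..." call returns a new instance.
    static final class Config {
        final int readTimeout;
        final String mapping;
        Config(int readTimeout, String mapping) { this.readTimeout = readTimeout; this.mapping = mapping; }
        static Config defaults() { return new Config(3000, "default-mapping"); }
        Config withReadTimeout(int t) { return new Config(t, mapping); }
        Config withMapping(String m) { return new Config(readTimeout, m); }
    }

    public static void main(String[] args) {
        // Buggy flow: the mapping step starts again from the defaults, dropping the timeout.
        Config buggy = Config.defaults().withReadTimeout(10_000);
        buggy = Config.defaults().withMapping("custom");
        System.out.println("buggy readTimeout = " + buggy.readTimeout);  // prints 3000

        // Fixed flow: chain both settings off the same instance so both survive.
        Config fixed = Config.defaults().withReadTimeout(10_000).withMapping("custom");
        System.out.println("fixed readTimeout = " + fixed.readTimeout);  // prints 10000
    }
}
```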
With the help of @ikwattro, we were able to get this tested and resolved.
Using:
- ElasticSearch: 2.4.3
- Neo4j: 3.2.3
- GraphAware Plugin: 3.2.1.51.14

Neo4j is a 3-node HA cluster; ElasticSearch is a 3-node cluster with one master and 2 data nodes.
We are seeing intermittent failures while running the exact same query with the same parameters: 1 out of 5 times the query will fail.
Our system is constantly receiving new data, so there is a constant sync going on between Neo4j and ElasticSearch; the search queries from Neo4j can happen while Neo4j and ES are syncing. The queries are issued directly from the Neo4j Web Console.
One more insight: we are not seeing such issues when running Neo4j queries that use ga.es.queryNode. (The node label that queryNode targets is different from the one queryNodeRaw targets.)
With respect to logs, there are none: nothing on the ElasticSearch side, nor on Neo4j. If there is any way to get more debug output, we can definitely dig in further.
The exception we get is this:

```
Neo.ClientError.Procedure.ProcedureCallFailed: Failed to invoke procedure `ga.es.queryNodeRaw`:
Caused by: java.lang.RuntimeException: Error while performing query on ElasticSearch
```

Other than that, we do see a constant "Bulk Operation Succeeded" log in Neo4j.
So we enabled DEBUG logs on ElasticSearch. Here is the stack trace:
```
[2018-07-20 17:35:47,681][DEBUG][shield.transport.netty ] [prod-es-master-01-Mona-01] Caught exception while handling client http traffic, closing connection [id: 0x8858e005, /<neo4j-ip>:56854 :> /<elasticsearch-ip>:9200]
java.nio.channels.ClosedChannelException
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.cleanUpWriteBuffer(AbstractNioWorker.java:433)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromUserCode(AbstractNioWorker.java:128)
    at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:99)
    at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:36)
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:779)
    at org.jboss.netty.channel.Channels.write(Channels.java:725)
    at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.doEncode(OneToOneEncoder.java:71)
    at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:59)
    at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:784)
    at org.elasticsearch.http.netty.pipelining.HttpPipeliningHandler.handleDownstream(HttpPipeliningHandler.java:87)
    at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
    at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:582)
    at org.elasticsearch.http.netty.NettyHttpChannel.sendResponse(NettyHttpChannel.java:147)
    at org.elasticsearch.http.HttpServer$ResourceHandlingHttpChannel.sendResponse(HttpServer.java:352)
    at org.elasticsearch.rest.action.support.RestResponseListener.processResponse(RestResponseListener.java:43)
    at org.elasticsearch.rest.action.support.RestActionListener.onResponse(RestActionListener.java:49)
    at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:89)
    at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:85)
    at org.elasticsearch.action.support.TransportAction$ResponseFilterChain.proceed(TransportAction.java:212)
    at org.elasticsearch.shield.action.ShieldActionFilter.apply(ShieldActionFilter.java:150)
    at org.elasticsearch.action.support.TransportAction$ResponseFilterChain.proceed(TransportAction.java:210)
    at org.elasticsearch.action.support.TransportAction$FilteredActionListener.onResponse(TransportAction.java:237)
    at org.elasticsearch.action.support.TransportAction$FilteredActionListener.onResponse(TransportAction.java:223)
    at org.elasticsearch.action.search.SearchQueryThenFetchAsyncAction$2.doRun(SearchQueryThenFetchAsyncAction.java:138)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
```