Closed ivangreene closed 3 years ago
Several interrelated factors seem to be at play here. Calling streams.publish does not guarantee that the message is sent out on Kafka immediately. Neo4j-streams uses a Kafka client, which is subject to whatever defaults the client has (and any of your kafka.* settings). When you publish a message, what you actually do is add it to a buffer to be published later.
That buffer accumulates up to a certain size before sending, and always sends once a timeout expires. Check the Kafka docs for the "producer configuration settings"; they are also referenced in the neo4j-streams manual.
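To make the buffering described above concrete, here is a minimal sketch in Python. It is a toy model, not the Kafka client: the class `BatchingBuffer` and its `max_batch` and `linger_s` knobs are hypothetical names made up for illustration. The real producer has settings such as `batch.size` (measured in bytes, not message count) and `linger.ms`, and its exact behavior depends on the client version and defaults.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class BatchingBuffer:
    """Toy model of a producer-side buffer (hypothetical, not the Kafka client)."""
    max_batch: int    # flush once this many messages are queued (assumed knob)
    linger_s: float   # flush once the oldest queued message is this old (assumed knob)
    pending: List[Tuple[float, str]] = field(default_factory=list)
    sent: List[Tuple[float, List[str]]] = field(default_factory=list)

    def publish(self, now: float, message: str) -> None:
        # Publishing only queues the message; nothing goes to the network yet.
        self.pending.append((now, message))
        if len(self.pending) >= self.max_batch:
            self._flush(now)

    def tick(self, now: float) -> None:
        # Called periodically: flush if the oldest message has waited long enough.
        if self.pending and now - self.pending[0][0] >= self.linger_s:
            self._flush(now)

    def _flush(self, now: float) -> None:
        # Record one simulated network send containing every queued message.
        self.sent.append((now, [msg for _, msg in self.pending]))
        self.pending.clear()


buf = BatchingBuffer(max_batch=3, linger_s=2.0)
buf.publish(0.0, "a")   # queued, not sent
buf.tick(1.0)           # too early, still queued
buf.tick(2.0)           # timeout reached, "a" goes out alone
buf.publish(3.0, "b")
buf.publish(3.1, "c")
buf.publish(3.2, "d")   # buffer is full, batch goes out right away
```

In this model a lone message waits for the full timeout, while a burst goes out as soon as the batch fills, which matches the delay pattern described in this thread.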
As for how much time Neo4j Browser reports for executing a query, that would need to be taken up with the Browser team. I'm not sure how that's measured, but I see what you're talking about.
You mentioned on Slack that this is a blocker to your adoption, but what exactly is the blocker? The incorrect amount of time given by Desktop? I'd recommend checking the producer configuration options for Kafka clients and then tuning those to your use. For example, you can lower the timeout or the buffer size (to deliver messages more quickly), but there's a tradeoff with throughput. If you send many thousands of messages/sec, you're better off with slightly longer timeouts and bigger buffers to make more efficient use of the network. If you want to send 1 message every few seconds, you may end up waiting until the timeout threshold is hit before the buffer flushes to the network. I don't think this is actually neo4j-streams; it's just how Kafka clients work.
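The tradeoff can be put into rough numbers. The sketch below assumes a perfectly steady message rate and a buffer that flushes either when it holds `max_batch` messages or when the oldest message has waited `linger_s` seconds. The function `batching_profile` and both parameter names are hypothetical, and real clients behave less regularly than this.

```python
def batching_profile(rate_per_s: float, max_batch: int, linger_s: float) -> dict:
    """Estimate what triggers a flush, the worst-case wait, and messages per send.

    Assumes evenly spaced messages; an illustration only, not a Kafka API.
    """
    # Time between the first and the last message of a full batch.
    fill_time = (max_batch - 1) / rate_per_s
    if fill_time <= linger_s:
        # The batch fills before the timeout: large sends, short waits.
        return {"trigger": "size", "max_wait_s": fill_time, "msgs_per_send": max_batch}
    # The timeout fires first: the first message waits the whole linger time.
    msgs = int(rate_per_s * linger_s) + 1
    return {"trigger": "timeout", "max_wait_s": linger_s, "msgs_per_send": msgs}


busy = batching_profile(rate_per_s=5000, max_batch=100, linger_s=2.0)
sparse = batching_profile(rate_per_s=0.2, max_batch=100, linger_s=2.0)
```

Under these assumptions the busy producer sends full batches of 100 with a wait of about 20 ms, while the sparse producer (one message every five seconds) sends single messages that each wait the full two seconds.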
@moxious re: being a blocker, I will need to do some further testing to see if this actually impacts the amount of time a query takes to execute, or whether it is simply (somehow) a visual delay of the Desktop browser.
The strangest thing, and what makes me feel that the Desktop return time and the Kafka plugin are related, is that the delay immediately disappears when I disable the Kafka plugin. As soon as it is disabled, the time Desktop takes to return matches the time it reports. But when Kafka is enabled, it seems to wait for the message to be submitted to Kafka before returning the result in Desktop. Note that the delay between executing the query and sending to Kafka is not a problem for me; that makes a lot of sense and will perhaps need some tweaking. The problem is that it seems to hold up the return of the query.
Didn't we also have an async mode for the procedures? The procs are synchronous by default.
Ah there was a guy who changed the async behavior :)
https://github.com/neo4j-contrib/neo4j-streams/pull/161/files
So we should re-add that.
Ah, that's an excellent point: this was changed to be synchronous on purpose. The idea was that if it's not synchronous, you could async publish the message, and the message could fail at the Kafka client layer and never go out. In that case it would "fail silently" and there'd be no way for the user to know. You also couldn't chain Cypher queries if it's totally async.
I mean, I think this is probably blocking on the network send; it's not like the query itself is taking a lot of time to return. The question is whether making it async is worth the downsides that we previously had and worked around to get to the sync behavior.
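The sync versus async tradeoff discussed here can be sketched generically in Python. This is not the neo4j-streams implementation: `flaky_send`, `publish_sync`, and `publish_async` are hypothetical stand-ins, and a thread pool plays the role of the client's background sender.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def flaky_send(message: str) -> str:
    """Hypothetical stand-in for the client's network send."""
    if message == "bad":
        raise ConnectionError("broker unavailable")
    return f"acked:{message}"


executor = ThreadPoolExecutor(max_workers=1)


def publish_sync(message: str) -> str:
    # Block until the send finishes; a failure propagates to the caller.
    return executor.submit(flaky_send, message).result()


def publish_async(message: str) -> Future:
    # Return immediately; the caller learns of a failure only if it
    # later inspects the returned future.
    return executor.submit(flaky_send, message)


sync_ok = publish_sync("good")

try:
    publish_sync("bad")
    sync_failure_seen = False
except ConnectionError:
    sync_failure_seen = True

pending = publish_async("bad")     # no exception is raised at this point
executor.shutdown(wait=True)       # let the background send finish
async_error = pending.exception()  # the error is visible only on inspection
```

The synchronous call pays for the send in its own latency but surfaces the error immediately. The asynchronous call returns at once, and the failure goes unnoticed unless something checks the future afterwards.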
I hadn't really gotten that far with our concept of it yet, but rolling the transaction back if a Kafka write fails would actually help out a lot (or at least a configurable option for that behavior, if not the default). The question is what to do if one or more Kafka writes succeed and others fail. I guess in that case we could transmit the start/rollback/commit of the transaction and ensure that a commit isn't implied in the absence of one.
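One way to read the start/rollback/commit idea is from the consumer's side: hold a transaction's events until an explicit commit marker arrives, and never treat a missing marker as a commit. The sketch below assumes an invented event shape of `(transaction id, kind, payload)`; nothing in it reflects an actual neo4j-streams message format.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical event shape: (transaction id, kind, payload).
Event = Tuple[str, str, str]


def committed_payloads(events: Iterable[Event]) -> List[str]:
    """Return payloads only from transactions that were explicitly committed."""
    open_tx: Dict[str, List[str]] = {}
    applied: List[str] = []
    for tx_id, kind, payload in events:
        if kind == "start":
            open_tx[tx_id] = []
        elif kind == "data":
            open_tx.setdefault(tx_id, []).append(payload)
        elif kind == "rollback":
            # Discard everything buffered for this transaction.
            open_tx.pop(tx_id, None)
        elif kind == "commit":
            # Only now do the buffered payloads take effect.
            applied.extend(open_tx.pop(tx_id, []))
    # Transactions still open here had no commit marker, so they are ignored.
    return applied


events = [
    ("tx1", "start", ""), ("tx1", "data", "n1"), ("tx1", "data", "n2"), ("tx1", "commit", ""),
    ("tx2", "start", ""), ("tx2", "data", "n3"), ("tx2", "rollback", ""),
    ("tx3", "start", ""), ("tx3", "data", "n4"),  # the commit marker never arrives
]
result = committed_payloads(events)
```

Here only `tx1` contributes payloads: `tx2` was rolled back, and `tx3` stays pending because its commit was never seen.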
Related ticket, which we're going to try to prioritize next week: https://github.com/neo4j-contrib/neo4j-streams/issues/349
@ivangreene can you please try with the last release?
@conker84 back in this area now and I can confirm that the behavior is now as expected, the queries return in the expected time without waiting on submission
Expected Behavior (Mandatory)
The 'completed after' time shown in the Neo4j Desktop browser should closely reflect the actual amount of time from pressing 'Return' to when the message appeared.
Actual Behavior (Mandatory)
Queries say they took between 1 ms and 32 ms; however, the browser did not return the result until several seconds later (only after I could see the message in the topic). The queries return almost immediately after disabling streams.source.enabled.
How to Reproduce the Problem
Steps (Mandatory)
kt consume -topic node_actions -offsets all=newest-1:
CREATE (n:RandoLabel {name: 'foo'});
Video:
Video demonstrating this behavior: https://streamable.com/wkuy2o
Specifications (Mandatory)
Relevant configuration:
Versions