Closed timrulebosch closed 1 year ago
@timrulebosch thanks for the report! I have not been to reproduce the issue just yet. What server version are you running?
Here are the steps I followed:
docker run --rm \
--env NEO4J_AUTH='neo4j/letmein!' \
--env NEO4J_ACCEPT_LICENSE_AGREEMENT=yes \
--publish=7687:7687 \
--health-cmd "cypher-shell -u neo4j -p 'letmein!' 'RETURN 1'" \
--health-interval 5s \
--health-timeout 5s \
--health-retries 5 \
neo4j:4.4-enterprise
And I ran the following program (with an invalid label):
package main
import (
"context"
"fmt"
"github.com/neo4j/neo4j-go-driver/v5/neo4j"
"strings"
)
func main() {
ctx := context.Background()
driver, err := neo4j.NewDriverWithContext("neo4j://localhost", neo4j.BasicAuth("neo4j", "letmein!", ""))
if err != nil {
panic(err)
}
defer dealWithIt(ctx, driver)
session := driver.NewSession(ctx, neo4j.SessionConfig{})
defer dealWithIt(ctx, session)
for i := 0; i < 5; i++ {
fmt.Printf("Call #%d\n", i+1)
summary, err := session.ExecuteWrite(ctx, addNodeFunc(ctx, []string{"F o o", "Bar"}, "Jane Doe"))
if err != nil {
panic(err)
}
fmt.Printf("Node(s) created: %d\n", summary.(neo4j.ResultSummary).Counters().NodesCreated())
}
}
func addNodeFunc(ctx context.Context, labels []string, name string) neo4j.ManagedTransactionWork {
return func(tx neo4j.ManagedTransaction) (any, error) {
var b strings.Builder
b.WriteString("CREATE (n:" + strings.Join(labels, ":") + " {name: $name}) ")
b.WriteString("RETURN id(n) ")
query := b.String()
result, err := tx.Run(ctx, query, map[string]any{"name": name})
if err != nil {
return nil, err
}
return result.Consume(ctx)
}
}
func dealWithIt(ctx context.Context, closer interface{ Close(context.Context) error }) {
if err := closer.Close(ctx); err != nil {
panic(err)
}
}
which gives me:
Call #1
panic: Neo4jError: Neo.ClientError.Statement.SyntaxError (Invalid input 'o': expected ")", "WHERE", "{" or a parameter (line 1, column 13 (offset: 12))
"CREATE (n:F o o:Bar {name: $name}) RETURN id(n)"
^)
goroutine 1 [running]:
main.main()
Would you be able to update the program and make it crash as in the initial report? Thanks a lot!
Hi,
this was against the latest MemGraph container (memgraph/memgraph-platform). The query was valid, including labels. First run of the query passed, second run resulted in the exception. So, actually, the code was fine. If you don't have the problem, perhaps MemGraph is a factor? It really was as simple as:
CREATE (n:foo {name: "bar"}) RETURN id(n)
... but I included the other code incase that is somehow triggering the resetResponseHandler
However its clear that the resetResponseHandler() is not correctly specified. It does not specify the onIgnored function .... and then later when/if that "not specified" (nil) function is called, the exception is the result.
That seems like something which should be fixed regardless?
I will take a look at your code in the next days. Perhaps its really an interaction with MemGraph.
According to the docs, the RESET
message can never receive IGNORED
messages: https://neo4j.com/docs/bolt/current/bolt/message/#messages-reset. The root issue is on Memgraph's side, I would say.
That being said, I will improve the driver for the next release to not immediately panic, but instead gracefully handle the error.
Thanks!
If I understand it correctly, a RESET sent to the server might respond with either SUCCESS or FAILURE. The resetResponseHandler() is thus configured with those handlers (+ unknown). So its likely that MemGraph is responding (to the RESET) with an Ignore message. And that triggers an unfortunate sequence of events.
I will take a second look, and try to confirm that by tracing the BOLT protocol. Its probably as you suggest, a problem with the MemGraph implementation.
I managed to get a BOLT trace, the C: RESET -> S: IGNORED sequence I guess causes this problem.
From my reading of the BOLT specification, an IGNORED message would be permissible ... but perhaps not expected in this case. https://neo4j.com/docs/bolt/current/bolt/message/#messages-ignored
Anyway, I should just fix my query! Thanks for the insight and assistance.
2023-05-23 10:47:12.821 BOLT C: <MAGIC> 0X6060B017
2023-05-23 10:47:12.821 BOLT C: <HANDSHAKE> 0X00020205 0X00020404 0X00000104 0X00000003
2023-05-23 10:47:12.822 BOLT S: <HANDSHAKE> 0X00000304
2023-05-23 10:47:12.822 BOLT C: HELLO {"credentials":"<redacted>","patch_bolt":["utc"],"principal":"","scheme":"basic","user_agent":"Go Driver/5.8.1"}
2023-05-23 10:47:12.822 BOLT S: SUCCESS {"connection_id":"bolt-1"}
2023-05-23 10:47:12.822 BOLT [bolt-1@localhost:7687] C: BEGIN {}
2023-05-23 10:47:12.823 BOLT [bolt-1@localhost:7687] S: SUCCESS {}
2023-05-23 10:47:12.823 BOLT [bolt-1@localhost:7687] C: RUN "CREATE (n:Model:FMU {name: $name}) RETURN id(n) AS id" {"name":"ESP10CP_sf"} null
2023-05-23 10:47:12.823 BOLT [bolt-1@localhost:7687] C: PULL {"n":1000}
2023-05-23 10:47:12.824 BOLT [bolt-1@localhost:7687] S: SUCCESS {"fields":["id"]}
2023-05-23 10:47:12.824 BOLT [bolt-1@localhost:7687] S: RECORD [1446]
2023-05-23 10:47:12.824 BOLT [bolt-1@localhost:7687] S: SUCCESS {}
2023-05-23 10:47:12.824 BOLT [bolt-1@localhost:7687] C: COMMIT
2023-05-23 10:47:12.824 BOLT [bolt-1@localhost:7687] S: SUCCESS {}
2023-05-23 10:47:12.824 BOLT [bolt-1@localhost:7687] C: BEGIN {}
2023-05-23 10:47:12.824 BOLT [bolt-1@localhost:7687] S: FAILURE {"code":"Memgraph.TransientError.MemgraphError.MemgraphError","message":"Unable to commit due to unique constraint violation on :Model(name)"}
2023-05-23 10:47:12.824 BOLT [bolt-1@localhost:7687] C: RESET
2023-05-23 10:47:12.825 BOLT [bolt-1@localhost:7687] S: IGNORED
2023-05-23 10:47:12.825 BOLT [bolt-1@localhost:7687] C: GOODBYE
Thanks for sharing that trace.
It's interesting that the constraint violation shows up right after the 2nd BEGIN
(and is classified as a transient error).
As I said earlier, IGNORED
is not a valid response to RESET
(see Valid Summary Messages
): https://neo4j.com/docs/bolt/current/bolt/message/#messages-reset. Memgraph does not follow the spec here.
The resetResponseHandler does not set an onIgnored callback. If that handler is called for a bolt message of type "ignored" then you get a SIGSEV as follows:
The callback is thus (see
onIgnored: nil
):Adding
onIgnored: onIgnoredNoOp,
to the resetResponseHandler() would likely fix the fault.https://github.com/neo4j/neo4j-go-driver/blob/0eccfa3db544f6639151bf013cb08a3dde97fc70/neo4j/internal/bolt/bolt4.go#LL1078C1-L1091C2
The scenario behind this is fairly simple (and probably stupid) ... however, I'm new to Go ... based of the example, I basically run this function twice with the same parameters, the second time fails.