neo4j / neo4j-go-driver

Neo4j Bolt Driver for Go
Apache License 2.0
490 stars 69 forks source link

Neo4j client waits forever for lock leading to goroutines getting blocked #538

Closed DaChartreux closed 11 months ago

DaChartreux commented 11 months ago

Bug

In my production environment, I am using querying neo4J for some data in a Golang service. This service is deployed on multiple instances but on one instance, we found that the server stopped responding and the last log was just before executing the neo4j query. We added a few logs and added profiling, and this issue happened again, this time in another machine. When we checked the profile, we saw that the number of goroutines kept on increasing, and the graph shows this...

Neo4j lock issue (1)

Upon taking the entire dump of the goroutine's stack, it is evident that getOrUpdateTable method is waiting for the lock.

goroutine 4374599 [sync.Mutex.Lock, 157 minutes]:
sync.runtime_SemacquireMutex(0xc0068a48b8?, 0x18?, 0xc0068a4948?)
    /usr/local/go/src/runtime/sema.go:77 +0x25
sync.(*Mutex).lockSlow(0xc00027c528)
    /usr/local/go/src/sync/mutex.go:171 +0x15d
sync.(*Mutex).Lock(...)
    /usr/local/go/src/sync/mutex.go:90
github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/router.(*Router).getOrUpdateTable(0xc00027c500, {0x1ca5630, 0xc008772dc0}, 0x189fe40?, {0x1996373, 0x5}, 0x7fc73be4e1d8?, {0x0, 0x0})
    /home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/internal/router/router.go:154 +0xb5
github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/router.(*Router).GetOrUpdateWriters(0xc00027c500, {0x1ca5630, 0xc008772dc0}, 0x40df65?, {0x1996373, 0x5}, 0x1?, {0x0, 0x0})
    /home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/internal/router/router.go:256 +0x5e
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).getOrUpdateServers(0xc006fa5e00, {0x1ca5630, 0xc008772dc0}, 0x5?)
    /home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:512 +0x125
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).getConnection(0xc006fa5e00, {0x1ca5668?, 0xc009cb1030?}, 0x0, 0x40ff49?)
    /home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:543 +0x387
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).executeTransactionFunction(0xc006fa5e00, {0x1ca5668, 0xc009cb1030}, 0x0, {0x5?, 0x0?}, 0xc006f455c0, 0xc0078084c0, 0xf8?, 0x3)
    /home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:443 +0x9f
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).runRetriable(0xc006fa5e00, {0x1ca5668, 0xc009cb1030}, 0x1912760?, 0x27bc0a0?, 0x70?, 0x7fc73be4e1d8?, {0x0, 0x0, 0x0})
    /home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:424 +0x2c9
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).executeQueryWrite(0xc0078084c0?, {0x1ca5668?, 0xc009cb1030?}, 0x410805?, {0x0?, 0x1912760?, 0xc0068a4f01?})
    /home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:383 +0x45
github.com/neo4j/neo4j-go-driver/v5/neo4j.ExecuteQuery[...]({0x1ca5668, 0xc009cb1030?}, {0x1ca9a30, 0xc0000f61c0}, {0x1a01231, 0x1ba}, 0xc009cf6030, 0x1a1f9b8, {0xc0068a52a8, 0x1, ...})
    /home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/driver_with_context.go:548 +0x4fe
...

This is how I am connecting with my neo4j instance

        logger.Info("Connecting to Neo4J", "config", neo4jConfig)
    ctx := context.Background()
    driver, err := neo4j.NewDriverWithContext(
        neo4jConfig.Host,
        neo4j.BasicAuth(
            neo4jConfig.UserName,
            neo4jConfig.Password,
            "",
        ),
        func(c *neo4jconfig.Config) {
            c = neo4jconfig.
        },
    )

and this is how I query

    result, err := neo4j.ExecuteQuery(ctx, *l.neo4jDriver,
        "neo4j query here",
        map[string]any{
            "jpins":          jpins,
            "customerGrpIds": customerGrpIds,
        }, neo4j.EagerResultTransformer,
        neo4j.ExecuteQueryWithDatabase("neo4j"),
        neo4j.ExecuteQueryWithReadersRouting(),
    )
    if err != nil {
        ...
    }
    ...

Steps to reproduce I don't know why or when this happens, or how to reproduce this.

DaChartreux commented 11 months ago

One thing I noticed is goroutine #3565146. It seems that this is stuck in chan send for 162 minutes, and is causing the blocking of the other goroutines waiting for the lock.

goroutine 3565146 [chan send, 162 minutes]: 1 times: [3565146]
github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/router.(*Router).getOrUpdateTable(0xc00027c500, {0x1ca5630, 0xc002418050}, 0x189fe40?, {0x1996373, 0x5}, 0x7fc73be4ca68?, {0x0, 0x0})
    /home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/internal/router/router.go:186 +0x608
github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/router.(*Router).GetOrUpdateWriters(0xc00027c500, {0x1ca5630, 0xc002418050}, 0x40df65?, {0x1996373, 0x5}, 0x1?, {0x0, 0x0})
    /home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/internal/router/router.go:256 +0x5e
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).getOrUpdateServers(0xc001ea5040, {0x1ca5630, 0xc002418050}, 0x5?)
    /home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:512 +0x125
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).getConnection(0xc001ea5040, {0x1ca5668?, 0xc000454150?}, 0x0, 0x40ff49?)
    /home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:543 +0x387
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).executeTransactionFunction(0xc001ea5040, {0x1ca5668, 0xc000454150}, 0x0, {0x5?, 0x0?}, 0xc00064e300, 0xc0026d1100, 0x8?, 0x3)
    /home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:443 +0x9f
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).runRetriable(0xc001ea5040, {0x1ca5668, 0xc000454150}, 0x1912760?, 0x27bc0a0?, 0xa8?, 0x7fc73be4ca68?, {0x0, 0x0, 0x0})
    /home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:424 +0x2c9
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).executeQueryWrite(0xc0026d1100?, {0x1ca5668?, 0xc000454150?}, 0x410805?, {0x0?, 0x1912760?, 0xc00200da01?})
    /home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:383 +0x45
github.com/neo4j/neo4j-go-driver/v5/neo4j.ExecuteQuery[...]({0x1ca5668, 0xc000454150?}, {0x1ca9a30, 0xc0000f61c0}, {0x1a01231, 0x1ba}, 0xc001f7b6b0, 0x1a1f9b8, {0xc00200ddb8, 0x1, ...})
    /home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/driver_with_context.go:548 +0x4fe
...
github.com/gin-gonic/gin.(*Context).Next(...)
    /home/winlist/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/context.go:174
gopkg.in/DataDog/dd-trace-go.v1/contrib/gin-gonic/gin.Middleware.func1(0xc000a09000)
    /home/winlist/go/pkg/mod/gopkg.in/!data!dog/dd-trace-go.v1@v1.55.0/contrib/gin-gonic/gin/gintrace.go:67 +0x5a3
github.com/gin-gonic/gin.(*Context).Next(...)
    /home/winlist/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/context.go:174
github.com/gin-gonic/gin.CustomRecoveryWithWriter.func1(0xc000a09000)
    /home/winlist/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/recovery.go:102 +0x7a
github.com/gin-gonic/gin.(*Context).Next(...)
    /home/winlist/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/context.go:174
github.com/gin-gonic/gin.LoggerWithConfig.func1(0xc000a09000)
    /home/winlist/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/logger.go:240 +0xde
github.com/gin-gonic/gin.(*Context).Next(...)
    /home/winlist/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/context.go:174
github.com/gin-gonic/gin.(*Engine).handleHTTPRequest(0xc000103520, 0xc000a09000)
    /home/winlist/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/gin.go:620 +0x65b
github.com/gin-gonic/gin.(*Engine).ServeHTTP(0xc000103520, {0x1ca3ef8?, 0xc001b7e0e0}, 0xc0002b4200)
    /home/winlist/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/gin.go:576 +0x1dd
net/http.serverHandler.ServeHTTP({0x1ca1588?}, {0x1ca3ef8?, 0xc001b7e0e0?}, 0x6?)
    /usr/local/go/src/net/http/server.go:2938 +0x8e
net/http.(*conn).serve(0xc002301290, {0x1ca55f8, 0xc000e050e0})
    /usr/local/go/src/net/http/server.go:2009 +0x5f4
created by net/http.(*Server).Serve in goroutine 131
    /usr/local/go/src/net/http/server.go:3086 +0x5cb
DaChartreux commented 11 months ago

I have opened a PR for this, #539