neo4j / neo4j-go-driver

Neo4j Bolt Driver for Go
Apache License 2.0
485 stars 68 forks source link

data race on bolt connection shutdown #529

Closed DavidS-ovm closed 7 months ago

DavidS-ovm commented 11 months ago

Steps to reproduce

we see regular (>50% of go test -race runs) data races in the neo4j driver connection shutdown code:

=== RUN   TestStartCleanup/with_zero_duration
==================
WARNING: DATA RACE
Read at 0x00c000602680 by goroutine 1848:
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*bolt5).Close()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/bolt/bolt5.go:928 +0xe6
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/pool.(*server).removeIdleOlderThan.func1()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/pool/server.go:192 +0x61

Previous write at 0x00c000602680 by goroutine 1847:
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*bolt5).setError()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/bolt/bolt5.go:205 +0x1ba
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*bolt5).onIoError()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/bolt/bolt5.go:1138 +0xc4
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*bolt5).onIoError-fm()
      <autogenerated>:1 +0x64
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*outgoing).send()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/bolt/outgoing.go:263 +0x99
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*messageQueue).send()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/bolt/message_queue.go:134 +0x1b4
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*bolt5).Close()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/bolt/bolt5.go:931 +0x13c
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/pool.closeAndEmptyConnections.func1()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/pool/server.go:222 +0x61

Goroutine 1848 (running) created at:
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/pool.(*server).removeIdleOlderThan()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/pool/server.go:192 +0x464
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/pool.(*Pool).removeIdleOlderThanOnServer()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/pool/pool.go:385 +0x157
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/pool.(*Pool).Return()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/pool/pool.go:411 +0x395
  github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).Run()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/session_with_context.go:624 +0xca3
  github.com/overmindtech/revlink/revlink.DeleteExpiredEdges()
      /home/runner/work/revlink/revlink/revlink/cleanup.go:152 +0x2f7
  github.com/overmindtech/revlink/revlink.cleanupAll()
      /home/runner/work/revlink/revlink/revlink/cleanup.go:247 +0x479
  github.com/overmindtech/revlink/revlink.StartCleanup.func1()
      /home/runner/work/revlink/revlink/revlink/cleanup.go:221 +0x1b2

Goroutine 1847 (finished) created at:
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/pool.closeAndEmptyConnections()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/pool/server.go:222 +0x1eb
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/pool.(*server).startClosing()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/pool/server.go:216 +0x32d
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/pool.(*Pool).deactivate()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/pool/pool.go:505 +0x298
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/pool.(*Pool).OnIoError()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/pool/pool.go:491 +0x5d
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*bolt5).onIoError()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/bolt/bolt5.go:1136 +0xa7
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*bolt5).onIoError-fm()
      <autogenerated>:1 +0x64
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*outgoing).send()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/bolt/outgoing.go:263 +0x99
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*messageQueue).send()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/bolt/message_queue.go:134 +0x573
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*bolt5).run()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/bolt/bolt5.go:568 +0x4c5
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*bolt5).Run()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/internal/bolt/bolt5.go:621 +0x2b4
  github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).Run()
      /home/runner/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.12.0/neo4j/session_with_context.go:604 +0x5b3
  github.com/overmindtech/revlink/revlink.DeleteExpiredEdges()
      /home/runner/work/revlink/revlink/revlink/cleanup.go:152 +0x2f7
  github.com/overmindtech/revlink/revlink.cleanupAll()
      /home/runner/work/revlink/revlink/revlink/cleanup.go:247 +0x479
  github.com/overmindtech/revlink/revlink.StartCleanup.func1()
      /home/runner/work/revlink/revlink/revlink/cleanup.go:221 +0x1b2
==================

The issue goes away when downgrading to v5.11.0.

StephenCathcart commented 11 months ago

Hi @DavidS-om, thank you. Just confirming that we're aware of some issues with data races (and issues running our unit tests with the -race flag enabled) and it's on our radar.

DavidS-ovm commented 11 months ago

@StephenCathcart thanks for the message. As I wrote above, for now we've pinned to v5.11.0 and are not experiencing the issue, so it's not a burning issue. It only means that we're stuck on that version until this is fixed as updating completely breaks our internal testing.