lni / dragonboat

A feature complete and high performance multi-group Raft library in Go.
Apache License 2.0

Support external node registry functions #327

Closed tylerwilliams closed 9 months ago

tylerwilliams commented 9 months ago

This PR allows clients to provide a NodeRegistryFactory function in the Expert config section which will be used to resolve nodes.

This is useful for clients who want to create and manage a node discovery service externally (so it can be used for other things) but still have the dragonboat library use it for dynamic node discovery.
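
A minimal sketch of the idea described above, not dragonboat's actual API. The `Registry` interface, the `RegistryFactory` type, and the `Discovery` service are hypothetical stand-ins introduced for illustration only. The sketch shows an application-owned discovery service being handed to a library through a factory, so one instance serves both the library and the rest of the application.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Registry is a hypothetical stand-in for the lookup interface a library
// would call. It is NOT dragonboat's real interface.
type Registry interface {
	Resolve(shardID, replicaID uint64) (string, error)
}

// RegistryFactory is a hypothetical factory signature: given the local
// host ID, it returns the registry the library should use.
type RegistryFactory func(hostID string) (Registry, error)

// ErrUnknownNode is returned when a node cannot be resolved.
var ErrUnknownNode = errors.New("unknown node")

type nodeKey struct{ shardID, replicaID uint64 }

// Discovery is an externally owned discovery service. The application can
// use it for other purposes besides Raft node resolution.
type Discovery struct {
	mu    sync.RWMutex
	addrs map[string]string  // host ID -> network address
	hosts map[nodeKey]string // (shard, replica) -> host ID
}

func NewDiscovery() *Discovery {
	return &Discovery{addrs: map[string]string{}, hosts: map[nodeKey]string{}}
}

// SetAddr records the current network address of a host.
func (d *Discovery) SetAddr(hostID, addr string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.addrs[hostID] = addr
}

// Assign records which host runs a given replica of a shard.
func (d *Discovery) Assign(shardID, replicaID uint64, hostID string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.hosts[nodeKey{shardID, replicaID}] = hostID
}

// Resolve maps (shard, replica) to a host ID, then to its address.
func (d *Discovery) Resolve(shardID, replicaID uint64) (string, error) {
	d.mu.RLock()
	defer d.mu.RUnlock()
	hostID, ok := d.hosts[nodeKey{shardID, replicaID}]
	if !ok {
		return "", ErrUnknownNode
	}
	addr, ok := d.addrs[hostID]
	if !ok {
		return "", ErrUnknownNode
	}
	return addr, nil
}

// Factory returns a factory that always hands out this shared instance.
func (d *Discovery) Factory() RegistryFactory {
	return func(hostID string) (Registry, error) { return d, nil }
}

func main() {
	d := NewDiscovery()
	d.SetAddr("host-a", "localhost:26001")
	d.Assign(1, 1, "host-a")

	// The library side would call the factory once during startup.
	reg, err := d.Factory()("host-a")
	if err != nil {
		fmt.Println("factory error:", err)
		return
	}
	addr, err := reg.Resolve(1, 1)
	fmt.Println(addr, err)
}
```

Because the factory returns the shared instance rather than a copy, address updates made elsewhere in the application are visible to the library on its next lookup.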

Also adds a test for this new functionality.

Fixes: https://github.com/lni/dragonboat/issues/326

codecov-commenter commented 9 months ago

Codecov Report

Patch coverage is 63.64% of modified lines.

Files Changed    Coverage
node.go          ø
nodehost.go      63.64%

lni commented 9 months ago

Thanks for the PR.

Could you please have a look at the review comments above? There are also some data race errors when running the new test; the log is pasted below.

=== RUN   TestExternalNodeRegistryFunction
2023-09-19 08:31:38.652923 I | dragonboat: go version: go1.19.13, linux/amd64
2023-09-19 08:31:38.652958 I | dragonboat: dragonboat version: 4.0.0 (Dev)
2023-09-19 08:31:38.653001 W | config: mutual TLS disabled, communication is insecure
2023-09-19 08:31:38.653134 I | config: using default EngineConfig
2023-09-19 08:31:38.653166 I | config: using default LogDBConfig
2023-09-19 08:31:38.653248 I | dragonboat: DeploymentID set to 1
2023-09-19 08:31:38.660302 I | dragonboat: LogDB info received, shard 0, busy false
2023-09-19 08:31:38.665674 I | dragonboat: LogDB info received, shard 1, busy false
2023-09-19 08:31:38.669765 I | dragonboat: LogDB info received, shard 2, busy false
2023-09-19 08:31:38.674620 I | dragonboat: LogDB info received, shard 3, busy false
2023-09-19 08:31:38.677094 W | gossip: memberlist: Was able to connect to 123e4567-e89b-12d3-a456-426614174000 but other probes failed, network may be misconfigured
2023-09-19 08:31:38.679718 I | dragonboat: LogDB info received, shard 4, busy false
2023-09-19 08:31:38.684090 I | dragonboat: LogDB info received, shard 5, busy false
2023-09-19 08:31:38.689280 I | dragonboat: LogDB info received, shard 6, busy false
2023-09-19 08:31:38.693791 I | dragonboat: LogDB info received, shard 7, busy false
2023-09-19 08:31:38.699416 I | dragonboat: LogDB info received, shard 8, busy false
2023-09-19 08:31:38.704158 I | dragonboat: LogDB info received, shard 9, busy false
2023-09-19 08:31:38.709267 I | dragonboat: LogDB info received, shard 10, busy false
2023-09-19 08:31:38.713627 I | dragonboat: LogDB info received, shard 11, busy false
2023-09-19 08:31:38.718071 I | dragonboat: LogDB info received, shard 12, busy false
2023-09-19 08:31:38.722903 I | dragonboat: LogDB info received, shard 13, busy false
2023-09-19 08:31:38.728906 I | dragonboat: LogDB info received, shard 14, busy false
2023-09-19 08:31:38.733055 I | dragonboat: LogDB info received, shard 15, busy false
2023-09-19 08:31:38.733422 I | logdb: using plain logdb
2023-09-19 08:31:38.734863 I | dragonboat: logdb memory limit: 8192 MBytes
2023-09-19 08:31:38.735371 I | dragonboat: NodeHost ID: 123e4567-e89b-12d3-a456-426614174000
2023-09-19 08:31:38.735401 I | dragonboat: Expert.NodeRegistryFactory was set: using custom registry
2023-09-19 08:31:38.735440 I | dragonboat: filesystem error injection mode enabled: false
2023-09-19 08:31:38.736034 I | transport: transport type: go-tcp-transport
2023-09-19 08:31:38.737214 I | dragonboat: transport type: go-tcp-transport
2023-09-19 08:31:38.737253 I | dragonboat: logdb type: sharded-pebble
2023-09-19 08:31:38.737296 I | dragonboat: nodehost address: localhost:26001
2023-09-19 08:31:38.737322 I | dragonboat: go version: go1.19.13, linux/amd64
2023-09-19 08:31:38.737372 I | dragonboat: dragonboat version: 4.0.0 (Dev)
2023-09-19 08:31:38.737395 W | config: mutual TLS disabled, communication is insecure
2023-09-19 08:31:38.737490 I | config: using default EngineConfig
2023-09-19 08:31:38.737533 I | config: using default LogDBConfig
2023-09-19 08:31:38.737617 I | dragonboat: DeploymentID set to 1
2023-09-19 08:31:38.743690 I | dragonboat: LogDB info received, shard 0, busy false
2023-09-19 08:31:38.748062 I | dragonboat: LogDB info received, shard 1, busy false
2023-09-19 08:31:38.752225 I | dragonboat: LogDB info received, shard 2, busy false
2023-09-19 08:31:38.757578 I | dragonboat: LogDB info received, shard 3, busy false
2023-09-19 08:31:38.762896 I | dragonboat: LogDB info received, shard 4, busy false
2023-09-19 08:31:38.767749 I | dragonboat: LogDB info received, shard 5, busy false
2023-09-19 08:31:38.772257 I | dragonboat: LogDB info received, shard 6, busy false
2023-09-19 08:31:38.777120 I | dragonboat: LogDB info received, shard 7, busy false
2023-09-19 08:31:38.784119 I | dragonboat: LogDB info received, shard 8, busy false
2023-09-19 08:31:38.788517 I | dragonboat: LogDB info received, shard 9, busy false
2023-09-19 08:31:38.793300 I | dragonboat: LogDB info received, shard 10, busy false
2023-09-19 08:31:38.799423 I | dragonboat: LogDB info received, shard 11, busy false
2023-09-19 08:31:38.803587 I | dragonboat: LogDB info received, shard 12, busy false
2023-09-19 08:31:38.808046 I | dragonboat: LogDB info received, shard 13, busy false
2023-09-19 08:31:38.812889 I | dragonboat: LogDB info received, shard 14, busy false
2023-09-19 08:31:38.818779 I | dragonboat: LogDB info received, shard 15, busy false
2023-09-19 08:31:38.819076 I | logdb: using plain logdb
2023-09-19 08:31:38.820205 I | dragonboat: logdb memory limit: 8192 MBytes
2023-09-19 08:31:38.820907 I | dragonboat: NodeHost ID: 123e4567-e89b-12d3-a456-426614174001
2023-09-19 08:31:38.820938 I | dragonboat: Expert.NodeRegistryFactory was set: using custom registry
2023-09-19 08:31:38.820980 I | dragonboat: filesystem error injection mode enabled: false
2023-09-19 08:31:38.821880 I | transport: transport type: go-tcp-transport
2023-09-19 08:31:38.822814 I | dragonboat: transport type: go-tcp-transport
2023-09-19 08:31:38.822883 I | dragonboat: logdb type: sharded-pebble
2023-09-19 08:31:38.822919 I | dragonboat: nodehost address: localhost:26002
2023-09-19 08:31:38.826387 I | dragonboat: [00001:00001] replaying raft logs
2023-09-19 08:31:38.826569 I | raft: [00001:00001] created, initial: true, new: true
2023-09-19 08:31:38.826615 W | config: ElectionRTT is not a magnitude larger than HeartbeatRTT
2023-09-19 08:31:38.826656 I | raft: [00001:00001] raft log rate limit enabled: false, 0
2023-09-19 08:31:38.826715 I | raft: [f:1,l:0,t:0,c:0,a:0] [00001:00001] t0 became follower
2023-09-19 08:31:38.826801 I | raft: [f:1,l:0,t:0,c:0,a:0] [00001:00001] t1 became follower
2023-09-19 08:31:38.826860 I | raft: [f:1,l:0,t:0,c:0,a:0] [00001:00001] t1 added bootstrap ConfigChangeAddNode, 1, 123e4567-e89b-12d3-a456-426614174000
2023-09-19 08:31:38.826919 I | raft: [f:1,l:0,t:0,c:0,a:0] [00001:00001] t1 added bootstrap ConfigChangeAddNode, 2, 123e4567-e89b-12d3-a456-426614174001
2023-09-19 08:31:38.827428 I | rsm: [00001:00001] no snapshot available during launch
2023-09-19 08:31:38.827563 I | dragonboat: [00001:00001] initialized using <00001:00001:0>
2023-09-19 08:31:38.827605 I | dragonboat: [00001:00001] initial index set to 0
2023-09-19 08:31:38.830797 I | dragonboat: [00001:00002] replaying raft logs
2023-09-19 08:31:38.831038 I | raft: [00001:00002] created, initial: true, new: true
2023-09-19 08:31:38.831088 W | config: ElectionRTT is not a magnitude larger than HeartbeatRTT
2023-09-19 08:31:38.831138 I | raft: [00001:00002] raft log rate limit enabled: false, 0
2023-09-19 08:31:38.831323 I | raft: [f:1,l:0,t:0,c:0,a:0] [00001:00002] t0 became follower
2023-09-19 08:31:38.831408 I | raft: [f:1,l:0,t:0,c:0,a:0] [00001:00002] t1 became follower
2023-09-19 08:31:38.831480 I | raft: [f:1,l:0,t:0,c:0,a:0] [00001:00002] t1 added bootstrap ConfigChangeAddNode, 1, 123e4567-e89b-12d3-a456-426614174000
2023-09-19 08:31:38.831546 I | raft: [f:1,l:0,t:0,c:0,a:0] [00001:00002] t1 added bootstrap ConfigChangeAddNode, 2, 123e4567-e89b-12d3-a456-426614174001
2023-09-19 08:31:38.833185 I | rsm: [00001:00002] no snapshot available during launch
2023-09-19 08:31:38.833398 I | dragonboat: [00001:00002] initialized using <00001:00002:0>
2023-09-19 08:31:38.833461 I | dragonboat: [00001:00002] initial index set to 0
2023-09-19 08:31:38.834893 I | rsm: [00001:00002] applied ADD ccid 0 (1), n00001 (123e4567-e89b-12d3-a456-426614174000)
2023-09-19 08:31:38.835034 I | rsm: [00001:00002] applied ADD ccid 0 (2), n00002 (123e4567-e89b-12d3-a456-426614174001)
2023-09-19 08:31:38.837618 W | dragonboat: [00001:00001] had 2 LocalTick msgs in one batch
2023-09-19 08:31:38.838604 I | rsm: [00001:00001] applied ADD ccid 0 (1), n00001 (123e4567-e89b-12d3-a456-426614174000)
2023-09-19 08:31:38.838684 I | rsm: [00001:00001] applied ADD ccid 0 (2), n00002 (123e4567-e89b-12d3-a456-426614174001)
2023-09-19 08:31:38.853533 W | raft: [f:1,l:2,t:1,c:2,a:2] [00001:00002] t2 became candidate
2023-09-19 08:31:38.853619 W | raft: [f:1,l:2,t:1,c:2,a:2] [00001:00002] t2 received RequestVoteResp from n00002
2023-09-19 08:31:38.853673 W | raft: [f:1,l:2,t:1,c:2,a:2] [00001:00002] t2 sent RequestVote to n00001
2023-09-19 08:31:38.857429 W | raft: [f:1,l:2,t:1,c:2,a:2] [00001:00001] t1 received RequestVote with higher term (2) from n00002
2023-09-19 08:31:38.857485 W | raft: [f:1,l:2,t:1,c:2,a:2] [00001:00001] t1 become followerKE after receiving higher term from n00002
2023-09-19 08:31:38.857671 I | raft: [f:1,l:2,t:1,c:2,a:2] [00001:00001] t2 became follower
2023-09-19 08:31:38.857779 W | raft: [f:1,l:2,t:1,c:2,a:2] [00001:00001] t2 cast vote from n00002 index 2 term 2, log term: 1
2023-09-19 08:31:38.860333 W | raft: [f:1,l:2,t:1,c:2,a:2] [00001:00002] t2 received RequestVoteResp from n00001
2023-09-19 08:31:38.860407 W | raft: [f:1,l:2,t:1,c:2,a:2] [00001:00002] t2 received 2 votes and 0 rejections, quorum is 2
2023-09-19 08:31:38.860478 I | raft: [f:1,l:2,t:1,c:2,a:2] [00001:00002] t2 became leader
2023-09-19 08:31:38.935478 E | transport: send batch failed, target localhost:26002 (write tcp 127.0.0.1:37262->127.0.0.1:26002: write: connection reset by peer), 2
2023-09-19 08:31:38.935607 W | transport: breaker 123e4567-e89b-12d3-a456-426614174000 to localhost:26002 failed, connect and process failed: write tcp 127.0.0.1:37262->127.0.0.1:26002: write: connection reset by peer
2023-09-19 08:31:38.935682 W | transport: localhost:26002 became unreachable, affected 1 nodes
==================
WARNING: DATA RACE
Write at 0x00c0000eb110 by goroutine 6651:
  runtime.mapassign_faststr()
      /opt/hostedtoolcache/go/1.19.13/x64/src/runtime/map_faststr.go:203 +0x0
  github.com/lni/dragonboat/v4.TestExternalNodeRegistryFunction()
      /home/runner/work/dragonboat/dragonboat/nodehost_test.go:1320 +0xef7
  testing.tRunner()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:1446 +0x216
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:1493 +0x47

Previous read at 0x00c0000eb110 by goroutine 6961:
  runtime.mapaccess1_faststr()
      /opt/hostedtoolcache/go/1.19.13/x64/src/runtime/map_faststr.go:13 +0x0
  github.com/lni/dragonboat/v4.(*testRegistry).Resolve()
      /home/runner/work/dragonboat/dragonboat/nodehost_test.go:1208 +0xe4
  github.com/lni/dragonboat/v4/internal/transport.(*Transport).send()
      /home/runner/work/dragonboat/dragonboat/internal/transport/transport.go:361 +0xba
  github.com/lni/dragonboat/v4/internal/transport.(*Transport).Send()
      /home/runner/work/dragonboat/dragonboat/internal/transport/transport.go:347 +0x68
  github.com/lni/dragonboat/v4.(*NodeHost).sendMessage()
      /home/runner/work/dragonboat/dragonboat/nodehost.go:1881 +0xf4
  github.com/lni/dragonboat/v4.(*NodeHost).sendMessage-fm()
      <autogenerated>:1 +0x84
  github.com/lni/dragonboat/v4.(*node).sendMessages()
      /home/runner/work/dragonboat/dragonboat/node.go:1011 +0x1b6
  github.com/lni/dragonboat/v4.(*node).processRaftUpdate()
      /home/runner/work/dragonboat/dragonboat/node.go:1108 +0xb3
  github.com/lni/dragonboat/v4.(*engine).processSteps()
      /home/runner/work/dragonboat/dragonboat/engine.go:1353 +0x804
  github.com/lni/dragonboat/v4.(*engine).stepWorkerMain()
      /home/runner/work/dragonboat/dragonboat/engine.go:1254 +0x5e6
  github.com/lni/dragonboat/v4.newExecEngine.func1()
      /home/runner/work/dragonboat/dragonboat/engine.go:1047 +0x98
  github.com/lni/goutils/syncutil.(*Stopper).runWorker.func1()
      /home/runner/go/pkg/mod/github.com/lni/goutils@v1.3.1-0.20220604063047-388d67b4dbc4/syncutil/stopper.go:79 +0x12e

Goroutine 6651 (running) created at:
  testing.(*T).Run()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:1493 +0x75d
  testing.runTests.func1()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:1846 +0x99
  testing.tRunner()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:1446 +0x216
  testing.runTests()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:1844 +0x7ec
  testing.(*M).Run()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:1726 +0xa84
  main.main()
      _testmain.go:675 +0x2e9

Goroutine 6961 (running) created at:
  github.com/lni/goutils/syncutil.(*Stopper).runWorker()
      /home/runner/go/pkg/mod/github.com/lni/goutils@v1.3.1-0.20220604063047-388d67b4dbc4/syncutil/stopper.go:74 +0x19a
  github.com/lni/goutils/syncutil.(*Stopper).RunWorker()
      /home/runner/go/pkg/mod/github.com/lni/goutils@v1.3.1-0.20220604063047-388d67b4dbc4/syncutil/stopper.go:68 +0xef
  github.com/lni/dragonboat/v4.newExecEngine()
      /home/runner/work/dragonboat/dragonboat/engine.go:1037 +0xa19
  github.com/lni/dragonboat/v4.NewNodeHost()
      /home/runner/work/dragonboat/dragonboat/nodehost.go:366 +0x1486
  github.com/lni/dragonboat/v4.TestExternalNodeRegistryFunction()
      /home/runner/work/dragonboat/dragonboat/nodehost_test.go:1266 +0x8f7
  testing.tRunner()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:1446 +0x216
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:1493 +0x47
==================
==================
WARNING: DATA RACE
Write at 0x00c00048d178 by goroutine 6651:
  github.com/lni/dragonboat/v4.TestExternalNodeRegistryFunction()
      /home/runner/work/dragonboat/dragonboat/nodehost_test.go:1320 +0xf38
  testing.tRunner()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:1446 +0x216
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:1493 +0x47

Previous read at 0x00c00048d178 by goroutine 6961:
  github.com/lni/dragonboat/v4.(*testRegistry).Resolve()
      /home/runner/work/dragonboat/dragonboat/nodehost_test.go:1208 +0xee
  github.com/lni/dragonboat/v4/internal/transport.(*Transport).send()
      /home/runner/work/dragonboat/dragonboat/internal/transport/transport.go:361 +0xba
  github.com/lni/dragonboat/v4/internal/transport.(*Transport).Send()
      /home/runner/work/dragonboat/dragonboat/internal/transport/transport.go:347 +0x68
  github.com/lni/dragonboat/v4.(*NodeHost).sendMessage()
      /home/runner/work/dragonboat/dragonboat/nodehost.go:1881 +0xf4
  github.com/lni/dragonboat/v4.(*NodeHost).sendMessage-fm()
      <autogenerated>:1 +0x84
  github.com/lni/dragonboat/v4.(*node).sendMessages()
      /home/runner/work/dragonboat/dragonboat/node.go:1011 +0x1b6
  github.com/lni/dragonboat/v4.(*node).processRaftUpdate()
      /home/runner/work/dragonboat/dragonboat/node.go:1108 +0xb3
  github.com/lni/dragonboat/v4.(*engine).processSteps()
      /home/runner/work/dragonboat/dragonboat/engine.go:1353 +0x804
  github.com/lni/dragonboat/v4.(*engine).stepWorkerMain()
      /home/runner/work/dragonboat/dragonboat/engine.go:1254 +0x5e6
  github.com/lni/dragonboat/v4.newExecEngine.func1()
      /home/runner/work/dragonboat/dragonboat/engine.go:1047 +0x98
  github.com/lni/goutils/syncutil.(*Stopper).runWorker.func1()
      /home/runner/go/pkg/mod/github.com/lni/goutils@v1.3.1-0.20220604063047-388d67b4dbc4/syncutil/stopper.go:79 +0x12e

Goroutine 6651 (running) created at:
  testing.(*T).Run()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:1493 +0x75d
  testing.runTests.func1()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:1846 +0x99
  testing.tRunner()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:1446 +0x216
  testing.runTests()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:1844 +0x7ec
  testing.(*M).Run()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:1726 +0xa84
  main.main()
      _testmain.go:675 +0x2e9

Goroutine 6961 (running) created at:
  github.com/lni/goutils/syncutil.(*Stopper).runWorker()
      /home/runner/go/pkg/mod/github.com/lni/goutils@v1.3.1-0.20220604063047-388d67b4dbc4/syncutil/stopper.go:74 +0x19a
  github.com/lni/goutils/syncutil.(*Stopper).RunWorker()
      /home/runner/go/pkg/mod/github.com/lni/goutils@v1.3.1-0.20220604063047-388d67b4dbc4/syncutil/stopper.go:68 +0xef
  github.com/lni/dragonboat/v4.newExecEngine()
      /home/runner/work/dragonboat/dragonboat/engine.go:1037 +0xa19
  github.com/lni/dragonboat/v4.NewNodeHost()
      /home/runner/work/dragonboat/dragonboat/nodehost.go:366 +0x1486
  github.com/lni/dragonboat/v4.TestExternalNodeRegistryFunction()
      /home/runner/work/dragonboat/dragonboat/nodehost_test.go:1266 +0x8f7
  testing.tRunner()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:1446 +0x216
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:1493 +0x47
==================
tylerwilliams commented 9 months ago

Oh missed the data race one -- taking a look at that now.

tylerwilliams commented 9 months ago

OK, fixed the data race too.
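
The thread does not show how the race was fixed. The reports above point at a map that the test goroutine writes while an engine worker goroutine reads it through `Resolve`. One common remedy for that pattern, sketched here under that assumption, is to guard the map with a `sync.RWMutex`. The `safeRegistry` type and its methods are hypothetical names for illustration, not code from the PR.

```go
package main

import (
	"fmt"
	"sync"
)

type key struct{ shardID, replicaID uint64 }

// safeRegistry is a hypothetical test registry whose map is guarded by a
// read-write mutex so concurrent Add and Resolve calls do not race.
type safeRegistry struct {
	mu    sync.RWMutex
	nodes map[key]string
}

func newSafeRegistry() *safeRegistry {
	return &safeRegistry{nodes: make(map[key]string)}
}

// Add takes the write lock because it mutates the map.
func (r *safeRegistry) Add(shardID, replicaID uint64, target string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.nodes[key{shardID, replicaID}] = target
}

// Resolve takes only the read lock, so many readers can run in parallel.
func (r *safeRegistry) Resolve(shardID, replicaID uint64) (string, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	target, ok := r.nodes[key{shardID, replicaID}]
	return target, ok
}

func main() {
	r := newSafeRegistry()

	// A background reader stands in for the transport goroutine that
	// resolves targets while the test is still registering nodes.
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			r.Resolve(1, 2)
		}
	}()

	r.Add(1, 2, "localhost:26002")
	wg.Wait()

	target, ok := r.Resolve(1, 2)
	fmt.Println(target, ok)
}
```

Running a program like this with `go run -race` is a quick way to check that the detector no longer flags the map accesses.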

lni commented 9 months ago

Cool, thanks.

tylerwilliams commented 9 months ago

Let me know if there's anything else I missed on this PR, otherwise I believe it should be ready to go. Thanks!