apple / swift-distributed-actors

Peer-to-peer cluster implementation for Swift Distributed Actors
https://apple.github.io/swift-distributed-actors/
Apache License 2.0
597 stars 56 forks source link

FAILED: ClusterLeaderActionsClusteredTests.test_singleLeader #945

Open ktoso opened 2 years ago

ktoso commented 2 years ago

Maybe just flaky, but happened on updating NIO so maybe timings changed and the test was too sensitive to them?

https://ci.swiftserver.group/job/swift-distributed-actors-nightly-prb/357/consoleFull

18:22:46 Test Case 'ClusterLeaderActionsClusteredTests.test_singleLeader' started at 2022-06-05 09:22:46.980
18:22:52 [pprint][9:22:46.9840] [/code/Sources/DistributedActors/Cluster/ClusterShell.swift:271][thread:140641564575488]: Started ClusterShell
18:22:52 <EXPR>:0: error: ClusterLeaderActionsClusteredTests.test_singleLeader : threw error "
18:22:52         switch try p.expectMessage() {
18:22:52                     ^~~~~~~~~~~~~~
18:22:52 error: Did not receive message of type [Event] within [5s], error: noMessagesInQueue"
18:22:52 ------------------------------------- ClusterSystem(first, sact://first@127.0.0.1:7111) ------------------------------------------------
18:22:52 Captured log [first][2022-06-05 9:22:46.9850] [ClusterSystem.swift:894] [warning] Assign identity
18:22:52 // metadata:
18:22:52 // "actor/id": /system/receptionist
18:22:52 // "actor/id/uniqueNode": sact://first@127.0.0.1:7111
18:22:52 // "actor/type": OpLogDistributedReceptionist
18:22:52 Captured log [first][2022-06-05 9:22:46.9850] [ClusterShell.swift:409][/system/cluster] [info] Binding to: [sact://first@127.0.0.1:7111]
18:22:52 Captured log [first][2022-06-05 9:22:46.9850] [ClusterSystem.swift:907] [trace] Actor ready
18:22:52 // metadata:
18:22:52 // "actor/id": /system/receptionist
18:22:52 // "actor/type": OpLogDistributedReceptionist
18:22:52 Captured log [first][2022-06-05 9:22:46.9860] [Leadership.swift:114][/system/cluster/leadership] [trace] Configured with LowestReachableMember(minimumNumberOfMembersToDecide: 1, loseLeadershipIfBelowMinNrOfMembers: false)
18:22:52 Captured log [first][2022-06-05 9:22:46.9860] [ClusterSystem.swift:894] [warning] Assign identity
18:22:52 // metadata:
18:22:52 // "actor/id": /system/downingStrategy
18:22:52 // "actor/id/uniqueNode": sact://first@127.0.0.1:7111
18:22:52 // "actor/type": DowningStrategyShell
18:22:52 Captured log [first][2022-06-05 9:22:46.9860] [ClusterSystem.swift:907] [trace] Actor ready
18:22:52 // metadata:
18:22:52 // "actor/id": /system/downingStrategy
18:22:52 // "actor/type": DowningStrategyShell
18:22:52 Captured log [first][2022-06-05 9:22:46.9860] [ClusterEventStream.swift:44][/system/clusterEvents] [trace] Successfully subscribed [_ActorRef<Cluster.Event>(/system/cluster/leadership)], offering membership snapshot
18:22:52 Captured log [first][2022-06-05 9:22:46.9860] [Leadership.swift:299][/system/cluster/leadership] [debug] Selected new leader: [nil -> Member(sact://first@127.0.0.1:7111, status: joining, reachability: reachable)]
18:22:52 // metadata:
18:22:52 // "leadership/election": DistributedActors.Leadership.LowestReachableMember
18:22:52 // "membership": Membership(count: 1, leader: Member(sact://first@127.0.0.1:7111, status: joining, reachability: reachable), members: [Member(sact://first:15367798082298357311@127.0.0.1:7111, status: joining, reachability: reachable)])
18:22:52 Captured log [first][2022-06-05 9:22:46.9860] [ClusterEventStream.swift:80][/system/clusterEvents] [trace] Successfully added async subscriber [ObjectIdentifier(0x00007fea2004f840)]
18:22:52 Captured log [first][2022-06-05 9:22:46.9860] [ClusterSystem.swift:391] [info] ClusterSystem [first] initialized, listening on: sact://first@127.0.0.1:7111
18:22:52 Captured log [first][2022-06-05 9:22:46.9860] [ClusterSystem.swift:393] [info] Setting in effect: .autoLeaderElection: lowestReachable(minNumberOfMembers: 1)
18:22:52 Captured log [first][2022-06-05 9:22:46.9860] [ClusterSystem.swift:394] [info] Setting in effect: .downingStrategy: timeout(DistributedActors.TimeoutBasedDowningStrategySettings(downUnreachableMembersAfter: TimeAmount(1s, nanoseconds: 1000000000)))
18:22:52 Captured log [first][2022-06-05 9:22:46.9860] [ClusterSystem.swift:395] [info] Setting in effect: .onDownAction: gracefulShutdown(delay: TimeAmount(3s, nanoseconds: 3000000000))
18:22:52 Captured log [first][2022-06-05 9:22:46.9860] [_OperationLogClusterReceptionistBehavior.swift:101][/system/receptionist-ref] [debug] Initialized receptionist
18:22:52 Captured log [first][2022-06-05 9:22:46.9860] [ClusterEventStream.swift:80][/system/clusterEvents] [trace] Successfully added async subscriber [ObjectIdentifier(0x000055f4a0f8dff0)]
18:22:52 Captured log [first][2022-06-05 9:22:46.9870] [ClusterShell.swift:420][/system/cluster] [info] Bound to [IPv4]127.0.0.1/127.0.0.1:7111
18:22:52 Captured log [first][2022-06-05 9:22:46.9870] [ClusterEventStream.swift:44][/system/clusterEvents] [trace] Successfully subscribed [_ActorRef<Cluster.Event>(/system/nodeDeathWatcher/$sub-DistributedActors.Cluster.Event-y)], offering membership snapshot
18:22:52 Captured log [first][2022-06-05 9:22:46.9870] [ClusterEventStream.swift:44][/system/clusterEvents] [trace] Successfully subscribed [_ActorRef<Cluster.Event>(/system/receptionist-ref/$sub-DistributedActors.Cluster.Event-y)], offering membership snapshot
18:22:52 Captured log [first][2022-06-05 9:22:46.9870] [ClusterShellState.swift:428][/system
18:22:52 /cluster] [debug] Leader change: LeadershipChange(oldLeader: nil, newLeader: Optional(Member(sact://first:15367798082298357311@127.0.0.1:7111, status: joining, reachability: reachable)), file: "/code/Sources/DistributedActors/Cluster/Cluster+Membership.swift", line: 355)
18:22:52 // metadata:
18:22:52 // "membership/count": 1
18:22:52 Captured log [first][2022-06-05 9:22:46.9870] [ClusterShellState.swift:465][/system/cluster] [trace] Membership updated on [sact://first@127.0.0.1:7111] by leadershipChange(DistributedActors.Cluster.LeadershipChange(oldLeader: nil, newLeader: Optional(Member(sact://first:15367798082298357311@127.0.0.1:7111, status: joining, reachability: reachable)), file: "/code/Sources/DistributedActors/Cluster/Cluster+Membership.swift", line: 355)): leader: Member(sact://first@127.0.0.1:7111, status: joining, reachability: reachable)
18:22:52   sact://first:15367798082298357311@127.0.0.1:7111 status [joining]
18:22:52 Captured log [first][2022-06-05 9:22:46.9870] [Gossiper+Shell.swift:149][/system/cluster/gossip] [trace] Update (locally) gossip payload [membership]
18:22:52 // metadata:
18:22:52 // "gossip/identifier": membership
18:22:52 // "gossip/payload": MembershipGossip(
18:22:52 //   owner: sact://first:15367798082298357311@127.0.0.1:7111,
18:22:52 //   seen: Cluster.Gossip.SeenTable(
18:22:52 //     sact://first@127.0.0.1:7111 observed versions:
18:22:52 //         uniqueNode:sact://first@127.0.0.1:7111 @ 1
18:22:52 // ),
18:22:52 //   membership: Membership(
18:22:52 //     _members: [
18:22:52 //       sact://first@127.0.0.1:7111: Member(sact://first@127.0.0.1:7111, status: joining, reachability: reachable),
18:22:52 //     ],
18:22:52 //     _leaderNode: nil,
18:22:52 //   ),
18:22:52 // )
18:22:52 Captured log [first][2022-06-05 9:22:46.9870] [ClusterEventStream.swift:65][/system/clusterEvents] [trace] Published event membershipChange(sact://first@127.0.0.1:7111 :: [unknown] -> [joining]) to 3 subscribers and 2 async subscribers
18:22:52 // metadata:
18:22:52 // "eventStream/asyncSubscribers": 
18:22:52 //   ObjectIdentifier(0x000055f4a0f8dff0)
18:22:52 //   ObjectIdentifier(0x00007fea2004f840)
18:22:52 // "eventStream/event": DistributedActors.Cluster.Event.membershipChange(sact://first@127.0.0.1:7111 :: [unknown] -> [joining])
18:22:52 // "eventStream/subscribers": 
18:22:52 //   /system/nodeDeathWatcher/$sub-DistributedActors.Cluster.Event-y
18:22:52 //   /system/receptionist-ref/$sub-DistributedActors.Cluster.Event-y
18:22:52 //   /system/cluster/leadership
18:22:52 Captured log [first][2022-06-05 9:22:46.9870] [Leadership.swift:299][/system/cluster/leadership] [debug] Selected new leader: [nil -> Member(sact://first@127.0.0.1:7111, status: joining, reachability: reachable)]
18:22:52 // metadata:
18:22:52 // "leadership/election": DistributedActors.Leadership.LowestReachableMember
18:22:52 // "membership": Membership(count: 1, leader: Member(sact://first@127.0.0.1:7111, status: joining, reachability: reachable), members: [Member(sact://first:15367798082298357311@127.0.0.1:7111, status: joining, reachability: reachable)])
18:22:52 Captured log [first][2022-06-05 9:22:46.9870] [ClusterEventStream.swift:44][/system/clusterEvents] [trace] Successfully subscribed [_ActorRef<Cluster.Event>(/system/cluster/gossip/$sub-DistributedActors.Cluster.Event-y)], offering membership snapshot
18:22:52 Captured log [first][2022-06-05 9:22:46.9870] [Gossiper+Shell.swift:149][/system/cluster/gossip] [trace] Update (locally) gossip payload [membership]
18:22:52 // metadata:
18:22:52 // "gossip/identifier": membership
18:22:52 // "gossip/payload": MembershipGossip(
18:22:52 //   owner: sact://first:15367798082298357311@127.0.0.1:7111,
18:22:52 //   seen: Cluster.Gossip.SeenTable(
18:22:52 //     sact://first@127.0.0.1:7111 observed versions:
18:22:52 //         uniqueNode:sact://first@127.0.0.1:7111 @ 2
18:22:52 // ),
18:22:52 //   membership: Membership(
18:22:52 //     _members: [
18:22:52 //       sact://first@127.0.0.1:7111: Member(sact://first@127.0.0.1:7111, status: joining, reachability: reachable),
18:22:52 //     ],
18:22:52 //     _leaderNode: sact://first:15367798082298357311@127.0.0.1:7111,
18:22:52 //   ),
18:22:52 // )
18:22:52 Captured log [first][2022-06-05 9:22:46.9870] [ClusterShell+LeaderActions.swift:77][/system/cluster] [trace] Performing leader actions: [DistributedActors.ClusterShellState.LeaderAction.moveMember(sact://first@127.0.0.1:7111 :: [join
18:22:52 ing] -> [     up])]
18:22:52 // metadata:
18:22:52 // "gossip/converged": true
18:22:52 // "gossip/current": MembershipGossip(owner: sact://first:15367798082298357311@127.0.0.1:7111, seen: Cluster.MembershipGossip.SeenTable([sact://first:15367798082298357311@127.0.0.1:7111: [uniqueNode:sact://first@127.0.0.1:7111: 3]]), membership: Membership(count: 1, leader: Member(sact://first@127.0.0.1:7111, status: joining, reachability: reachable), members: [Member(sact://first:15367798082298357311@127.0.0.1:7111, status: joining, reachability: reachable)]))
18:22:52 // "leader/actions": [DistributedActors.ClusterShellState.LeaderAction.moveMember(sact://first@127.0.0.1:7111 :: [joining] -> [     up])]
18:22:52 // "tag": leader-action
18:22:52 Captured log [first][2022-06-05 9:22:46.9870] [ClusterShell+LeaderActions.swift:127][/system/cluster] [debug] Leader moved member: sact://first@127.0.0.1:7111 :: [joining] -> [     up]
18:22:52 // metadata:
18:22:52 // "tag": leader-action
18:22:52 Captured log [first][2022-06-05 9:22:46.9880] [ClusterShell+LeaderActions.swift:104][/system/cluster] [trace] Membership state after leader actions: Membership(count: 1, leader: Member(sact://first@127.0.0.1:7111, status: up, reachability: reachable), members: [Member(sact://first:15367798082298357311@127.0.0.1:7111, status: up, reachability: reachable, upNumber: 1)])
18:22:52 // metadata:
18:22:52 // "gossip/before": MembershipGossip(owner: sact://first:15367798082298357311@127.0.0.1:7111, seen: Cluster.MembershipGossip.SeenTable([sact://first:15367798082298357311@127.0.0.1:7111: [uniqueNode:sact://first@127.0.0.1:7111: 3]]), membership: Membership(count: 1, leader: Member(sact://first@127.0.0.1:7111, status: joining, reachability: reachable), members: [Member(sact://first:15367798082298357311@127.0.0.1:7111, status: joining, reachability: reachable)]))
18:22:52 // "gossip/current": MembershipGossip(owner: sact://first:15367798082298357311@127.0.0.1:7111, seen: Cluster.MembershipGossip.SeenTable([sact://first:15367798082298357311@127.0.0.1:7111: [uniqueNode:sact://first@127.0.0.1:7111: 4]]), membership: Membership(count: 1, leader: Member(sact://first@127.0.0.1:7111, status: up, reachability: reachable), members: [Member(sact://first:15367798082298357311@127.0.0.1:7111, status: up, reachability: reachable, upNumber: 1)]))
18:22:52 // "tag": leader-action
18:22:52 Captured log [first][2022-06-05 9:22:46.9880] [ClusterEventStream.swift:65][/system/clusterEvents] [trace] Published event leadershipChange(DistributedActors.Cluster.LeadershipChange(oldLeader: nil, newLeader: Optional(Member(sact://first:15367798082298357311@127.0.0.1:7111, status: joining, reachability: reachable)), file: "/code/Sources/DistributedActors/Cluster/Cluster+Membership.swift", line: 355)) to 4 subscribers and 2 async subscribers
18:22:52 // metadata:
18:22:52 // "eventStream/asyncSubscribers": 
18:22:52 //   ObjectIdentifier(0x000055f4a0f8dff0)
18:22:52 //   ObjectIdentifier(0x00007fea2004f840)
18:22:52 // "eventStream/event": DistributedActors.Cluster.Event.leadershipChange(DistributedActors.Cluster.LeadershipChange(oldLeader: nil, newLeader: Optional(Member(sact://first:15367798082298357311@127.0.0.1:7111, status: joining, reachability: reachable)), file: "/code/Sources/DistributedActors/Cluster/Cluster+Membership.swift", line: 355))
18:22:52 // "eventStream/subscribers": 
18:22:52 //   /system/cluster/leadership
18:22:52 //   /system/cluster/gossip/$sub-DistributedActors.Cluster.Event-y
18:22:52 //   /system/receptionist-ref/$sub-DistributedActors.Cluster.Event-y
18:22:52 //   /system/nodeDeathWatcher/$sub-DistributedActors.Cluster.Event-y
18:22:52 Captured log [first][2022-06-05 9:22:46.9880] [Gossiper+Shell.swift:149][/system/cluster/gossip] [trace] Update (locally) gossip payload [membership]
18:22:52 // metadata:
18:22:52 // "gossip/identifier": membership
18:22:52 // "gossip/payload": MembershipGossip(
18:22:52 //   owner: sact://first:15367798082298357311@127.0.0.1:7111,
18:22:54 //   seen: Cluster.Gossip.SeenTable(
18:22:54 //     sact://first@127.0.0.1:7111 observed versions:
18:22:54 //         uniqueNode:sact://first@127.0.0.1:7111 @ 3
18:22:54 // ),
18:22:54 //   membership: Membership(
18:22:54 //     _members: [
18:22:54 //       sact://first@127.0.0.1:7111: Member(sact://first@127.0.0.1:7111, status: joining, reachability: reachable),
18:22:54 /
18:22:54 /     ],
18:22:54 //     _leaderNode: sact://first:15367798082298357311@127.0.0.1:7111,
18:22:54 //   ),
18:22:54 // )
18:22:54 Captured log [first][2022-06-05 9:22:46.9880] [Gossiper+Shell.swift:149][/system/cluster/gossip] [trace] Update (locally) gossip payload [membership]
18:22:54 // metadata:
18:22:54 // "gossip/identifier": membership
18:22:54 // "gossip/payload": MembershipGossip(
18:22:54 //   owner: sact://first:15367798082298357311@127.0.0.1:7111,
18:22:54 //   seen: Cluster.Gossip.SeenTable(
18:22:54 //     sact://first@127.0.0.1:7111 observed versions:
18:22:54 //         uniqueNode:sact://first@127.0.0.1:7111 @ 4
18:22:54 // ),
18:22:54 //   membership: Membership(
18:22:54 //     _members: [
18:22:54 //       sact://first@127.0.0.1:7111: Member(sact://first@127.0.0.1:7111, status: up, reachability: reachable),
18:22:54 //     ],
18:22:54 //     _leaderNode: sact://first:15367798082298357311@127.0.0.1:7111,
18:22:54 //   ),
18:22:54 // )
18:22:54 Captured log [first][2022-06-05 9:22:46.9880] [ClusterEventStream.swift:65][/system/clusterEvents] [trace] Published event membershipChange(sact://first@127.0.0.1:7111 :: [joining] -> [     up]) to 4 subscribers and 2 async subscribers
18:22:54 // metadata:
18:22:54 // "eventStream/asyncSubscribers": 
18:22:54 //   ObjectIdentifier(0x000055f4a0f8dff0)
18:22:54 //   ObjectIdentifier(0x00007fea2004f840)
18:22:54 // "eventStream/event": DistributedActors.Cluster.Event.membershipChange(sact://first@127.0.0.1:7111 :: [joining] -> [     up])
18:22:54 // "eventStream/subscribers": 
18:22:54 //   /system/cluster/leadership
18:22:54 //   /system/cluster/gossip/$sub-DistributedActors.Cluster.Event-y
18:22:54 //   /system/receptionist-ref/$sub-DistributedActors.Cluster.Event-y
18:22:54 //   /system/nodeDeathWatcher/$sub-DistributedActors.Cluster.Event-y
18:22:54 Captured log [first][2022-06-05 9:22:46.9880] [Gossiper+Shell.swift:149][/system/cluster/gossip] [trace] Update (locally) gossip payload [membership]
18:22:54 // metadata:
18:22:54 // "gossip/identifier": membership
18:22:54 // "gossip/payload": MembershipGossip(
18:22:54 //   owner: sact://first:15367798082298357311@127.0.0.1:7111,
18:22:54 //   seen: Cluster.Gossip.SeenTable(
18:22:54 //     sact://first@127.0.0.1:7111 observed versions:
18:22:54 //         uniqueNode:sact://first@127.0.0.1:7111 @ 4
18:22:54 // ),
18:22:54 //   membership: Membership(
18:22:54 //     _members: [
18:22:54 //       sact://first@127.0.0.1:7111: Member(sact://first@127.0.0.1:7111, status: up, reachability: reachable),
18:22:54 //     ],
18:22:54 //     _leaderNode: sact://first:15367798082298357311@127.0.0.1:7111,
18:22:54 //   ),
18:22:54 // )
18:22:54 Captured log [first][2022-06-05 9:22:46.9890] [ClusterEventStream.swift:44][/system/clusterEvents] [trace] Successfully subscribed [_ActorRef<Cluster.Event>(/user/selfishSingleLeader)], offering membership snapshot
18:22:54 ========================================================================================================================
ktoso commented 2 years ago

See also; https://github.com/apple/swift-distributed-actors/issues/918

Happening a lot recently