With the Dynamic Protocol state, we now have the ability to eject nodes at any block within the epoch. However, the networking layer is still oblivious about any potential ejections until the new epoch starts.
Proposed Solution
I would suggest adding a notification that is emitted by the protocol state and distributed via the events.Distributor. The ProtocolStateIDCache can listen to this event and update its internally-maintained notion of "latest authorized network participants" (for brevity, I will call this the latest identity table).
add the view that defined this 'latest identity table' to the struct
Per convention, an instance of this 'latest identity table' struct is immutable. The ProtocolStateIDCache maintains an atomic pointer to the latest instance of this struct. The getter methods just get the pointer to the 'latest identity table', making reads entirely lock-free.
When updating the 'latest identity table' (method update), we would instantiate a new 'latest identity table' struct and atomically update the pointer. Thereby, also the writes are completely lock free.
We desire that the update runs in its own thread. We could introduce a queue for the inbound notifications, but that is needlessly complex in my opinion for the following reason:
We only care about the latest identity table. Hence, a StrictMonotonousCounter is sufficient that tracks the latest height that changed the identity table.
All events (EpochTransition, EpochSetupPhaseStarted, EpochCommittedPhaseStarted, and the new NodeEjected) all provide the block header, whose finalization resulted in the latest identity table changing. Since all of these events are emitted on finalization, we can query the identity table by height, which the block header provides.
Whenever a new event comes in, we update the StrictMonotonousCounter in the ProtocolStateIDCache and trigger a notifier.
a worker thread listens to the notifier and updates the latest identity table when it sees a notification
The TimeoutAggregator is a great example of this pattern (though it uses a view, while I am suggesting to use the height here). Inbound notifications execute this logic, where only the newest value is retained. The worker thread executed that logic which guarantees that eventually the latest value will be processed.
repetitions and outdated events are no concern, because the StrictMonotonousCounter will only ever let newer updates pass
Current functional overlap with IdentityDeltas
There is overlap between the ProtocolStateIDCache and the IdentityDeltas. We should take this opportunity to consolidate them (remove one). Alternatively we would need to update both to support ejections.
Definition of Done
Verifying that an ejected node is no longer exposed through the ProtocolStateIDCache's methods GetFlowID, GetPeerID,ByPeerID,ByNodeIDorIdentitiesshould be sufficient. I think we have an integration test that verifies that connections from nodes that aren't exposed by theProtocolStateIDCache` are dropped/refused.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Problem Definition
With the Dynamic Protocol state, we now have the ability to eject nodes at any block within the epoch. However, the networking layer is still oblivious about any potential ejections until the new epoch starts.
Proposed Solution
I would suggest adding a notification that is emitted by the protocol state and distributed via the
events.Distributor
. TheProtocolStateIDCache
can listen to this event and update its internally-maintained notion of "latest authorized network participants" (for brevity, I will call this thelatest identity table
).Details
while we are at it, I would really like to get the following TODO cleaned up: https://github.com/onflow/flow-go/blob/c54e15a6e3754bd6bc255bf115dcc0bcc6d06857/network/p2p/cache/protocol_state_provider.go#L69-L71 this violation of API conventions (potentially blocking the hotPath of consensus) is inherent to all notifications the
ProtocolStateIDCache
ingests. Note that theProtocolStateIDCache
only needs to be eventually consistent (👉 documentation). This allows a subtle but very compact implementation:introduce auxiliary struct encapsulating the 'latest identity table'
Per convention, an instance of this 'latest identity table' struct is immutable. The
ProtocolStateIDCache
maintains an atomic pointer to the latest instance of this struct. The getter methods just get the pointer to the 'latest identity table', making reads entirely lock-free.When updating the 'latest identity table' (method
update
), we would instantiate a new 'latest identity table' struct and atomically update the pointer. Thereby, also the writes are completely lock free.We desire that the update runs in its own thread. We could introduce a queue for the inbound notifications, but that is needlessly complex in my opinion for the following reason:
StrictMonotonousCounter
is sufficient that tracks the latest height that changed the identity table.EpochTransition
,EpochSetupPhaseStarted
,EpochCommittedPhaseStarted
, and the new NodeEjected) all provide the block header, whose finalization resulted in the latest identity table changing. Since all of these events are emitted on finalization, we can query the identity table by height, which the block header provides.StrictMonotonousCounter
in theProtocolStateIDCache
and trigger a notifier.The
TimeoutAggregator
is a great example of this pattern (though it uses a view, while I am suggesting to use the height here). Inbound notifications execute this logic, where only the newest value is retained. The worker thread executed that logic which guarantees that eventually the latest value will be processed.repetitions and outdated events are no concern, because the
StrictMonotonousCounter
will only ever let newer updates passCurrent functional overlap with
IdentityDeltas
There is overlap between the
ProtocolStateIDCache
and theIdentityDeltas
. We should take this opportunity to consolidate them (remove one). Alternatively we would need to update both to support ejections.Definition of Done
Verifying that an ejected node is no longer exposed through the
ProtocolStateIDCache
's methodsGetFlowID
,GetPeerID,
ByPeerID,
ByNodeIDor
Identitiesshould be sufficient. I think we have an integration test that verifies that connections from nodes that aren't exposed by the
ProtocolStateIDCache` are dropped/refused.