iotaledger / iota

Apache License 2.0
17 stars 8 forks source link

[Task (SC-Platform)] Simplify shutdown logic in iota-node crate #1190

Open sergiupopescu199 opened 4 months ago

sergiupopescu199 commented 4 months ago

In iota-node the current implementation which handles the shutdown of the node uses a broadcast channel, it may be overkill as when calling subscribe() from the Sender it creates a new Receiver which could be a bit expensive and besides that, we still need to manage two halves of a channel.

As a simpler solution we could use CancellationToken which has this purpose in mind and it’s easier to work with, when cloning the CancellationToken it increments the reference counter so it is cheaper than the subscribe() counterpart. The refactor should not take much time as it’s self-contained in the same crate.

We already use tokio-util as a workspace dependency and the CancellationToken is also used in different crates such as:

sergiupopescu199 commented 4 months ago

The problem

While there was an attempt to resolve this issues with PR #1304, the refactor seems more complex than expected and the changes leaks to other crates making it more error prone. The conclusion is that it needs more investigation, these are the observations we made during the first attempt when working on the PR.

Our primary goal is to find a more elegant solution for the iota-node to notify everyone the it was shutdown, currently it uses a single capacity broadcast channel, the potential issue with this is that for simple notification implementation is too overkill and it can be simplified i different ways, in this case why a watch channel was not used?

The initial proposal was to use a CanellationToken but the problem is that the IotaNode's shutdown_channel_tx besides notifying the shutdown, it also carries with it the Option<RunWithRange> info which in normal situations is never used but only in some tests.

Some observations

When the iota-node is started/created we can observe that the Option<RunWithRange> is obtained from node's configuration so when the node is created that info is always present, when sending the shutdown signal the Option<RunWithRange> is also sent. In the main the iota-node runtime spawns the node and waits until receives a shutdown signal but in this case the RunWithRange is not used.

The only crate which checks if the shutdown channel has the RunWithRange is the test-cluster, then the iota-e2e-tests crate uses them in two tests

Considerations & Questions

The observation we made is that we cannot use the CancellationToken without dismissing the RunWithRange node configuration feature. So this opens up questions: What does this feature do? Is it used? Is it therefore worthwhile refactoring to remove the channel and the feature? May a watch channel be a good fit?