Open calvinrzachman opened 5 days ago
[!IMPORTANT]
Review skipped
Auto reviews are limited to specific labels.
:label: Labels to auto review (1)
* llm-reviewPlease check the settings in the CodeRabbit UI or the
.coderabbit.yaml
file in this repository. To trigger a single review, invoke the@coderabbitai review
command.You can disable this status message by setting the
reviews.review_status
tofalse
in the CodeRabbit configuration file.
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?
Change Description
The (lnd) ChannelRouter cleans up the results store maintained by the HTLCSwitch when it restarts via the CleanStore method. Any network results stored for attempts that are no longer
IN-FLIGHT
are removed from disk. This presents some difficulty when attempting to deploy the ChannelRouter in a separate process from the HTLCSwitch and allowing communication via RPC. Specifically, we may encounter the following scenario:The issue is that we're relying on state kept by an external entity (lnd) to drive state transitions about payment status within the ChannelRouter and that external entity has automated clean up of that state (at least with the way SendOnion RPC works now). There is a slippery scenario in which the lnd instance hits the logic to automate the cleanup of state needed by ChannelRouter to track the payment to completion before the ChannelRouter has actually read it.
In the same way that lnd does not run CleanStore while payments are being processed, a remote ChannelRouter cannot allow lnd to run CleanStore while payments are being processed. There must be synchronization between ChannelRouter and lnd with respect to the maintenance of state in this store.
Solution
For this we can allow lnd to disable automatic cleanup of the Switch result store and add a switchrpc sub-server which supports remote deletion from this store. Then the remote ChannelRouter can clean the result stores of its switches when it restarts by providing an implementation of CleanStore which leverages these RPCs.
NOTE: This has an advantage over adding a "tracked remotely" flag to network results and skipping the deletion of any result with such a flag in that I think that requires a DB migration, where-as suspending automated local deletion in favor of remote deletion via RPC does not.
Will probably relocate SendOnion/TrackOnion rpcs to switchrpc sub-server.
Steps to Test
make itest icase=switch_store_rpc
Questions
toKeep
style map which specifies the list of results to keep since the list of in-flight payments is likely to be shorter than the list of completed payments. This would mean that, instead of multiple calls to DeleteAttemptResult(s), a single call would be delete multiple results and could probably be named CleanResultStore.