talaia-labs / rust-teos

The Eye of Satoshi - Lightning Watchtower
https://talaia.watch
MIT License
128 stars 62 forks source link

Implements auto-register for the watchtower-client #164

Closed sr-gi closed 1 year ago

sr-gi commented 1 year ago

Description

The current version of the watchtower-plugin does not auto-renew subscriptions once they expire or get out of appointment slots. This may be annoying and some users may even be unaware that the data is not being back-up anymore due to the subscription needing a renewal.

This PR fixes this. The approach for the fix is to use the Retirer for it given the logic for unreachable towers and towers with a subscription issue was pretty similar. The way this works from now on is the following:

If an appointment is sent to the tower and there is a subscription error, the data is added to pending, the tower is flagged as subscription_error and the Retrier is notified (so far nothing changed). The Retirer will now try to renew the subscription if the state of the tower is SubscriptionError before sending any further data though. On failure, the behavior will be the same as before, just giving up, however, on success the pending data will be sent to the tower.

Notice that, from now on, if the Retrier gets a subscription error from the tower it won't give up until a re-register has been attempted, therefore this also covers subscription issues faced while retrying pending data.

Notice there is one case in which the Retrier will need an additional cycle to fix a subscription issue and that is if the tower is stopped in such a state. This is due to the tower state not being stored in the database but deduced based on what data is loaded from it on bootstrap. In the case of some pending appointments being found, the tower is flagged as TemporaryUnreachable and retried straightaway. This means that, under these conditions, the Retrier will:

Bug fixes

This PR is also fixing a client-side bug related to re-registering. The client was checking that, on re-register, the available slot count assigned to the new subscription was higher than the total count from the last subscription instead of the remaining count.

sr-gi commented 1 year ago

The necessity of an additional backoff cycle to address subscription issues when restarting can be addressed by implementing cln rpc logic so the plugin can be aware of the current chain tip. Once that data is available the client will be able to do client-side subscription expiry validation, so we'll be able to properly flag towers without probing them.

I've already left a TODO comment on the code regarding this.

sr-gi commented 1 year ago

Also, something that came to mind when implementing this: maybe we should put this feature under a conf flag so it can be disabled (specially for testing). If so, I think it should be enabled by default.

mariocynicys commented 1 year ago

The necessity of an additional backoff cycle to address subscription issues when restarting can be addressed by implementing cln rpc logic so the plugin can be aware of the current chain tip. Once that data is available the client will be able to do client-side subscription expiry validation, so we'll be able to properly flag towers without probing them.

@sr-gi The additional backoff cycle need not to be handled, right? it's done by the backoff strategy already?

sr-gi commented 1 year ago

The necessity of an additional backoff cycle to address subscription issues when restarting can be addressed by implementing cln rpc logic so the plugin can be aware of the current chain tip. Once that data is available the client will be able to do client-side subscription expiry validation, so we'll be able to properly flag towers without probing them.

@sr-gi The additional backoff cycle need not to be handled, right? it's done by the backoff strategy already?

Currently it is handled by the backoff strategy, but it properly implemented it won't require a cycle that will always fail. We need client-side subscription validation for that, which requires querying bitcoind via lightningd's rpc.

sr-gi commented 1 year ago

@mariocynicys issues should have been fixed