Open m-schmoock opened 4 years ago
Would it be useful to provide an `xpub` or `ypub` or `zpub` to generate addresses to? Or maybe an array of addresses? Because some of the funds will take a while to recover (unilateral closes) and it would be preferable to transfer onchain funds now, then transfer in-channel funds as soon as they become available. Even mutual closes are not atomic, as it involves waiting for HTLCs to resolve or fail, then negotiating, then doing the actual close.
The "decommissioned" state should probably persist, should it not? So on restart of the daemon it is still decommissioned and it will still reject new incoming `fundchannel`s and outgoing `fundchannel`s and so on. So maybe a variable in the database as well.
As a further step the plugin could also reject any incoming connections that do not match a channel that is to be closed, that way we truly take it offline, while still giving us a chance to close collaboratively.
> As a further step the plugin could also reject any incoming connections that do not match a channel that is to be closed, that way we truly take it offline, while still giving us a chance to close collaboratively.
Hm, did I get this right? If we already closed unilaterally any offline or uncooperative channels in the first place, are channels that come online afterwards (while being force closed) still able to do a collaborative close? Or do you mean we should set the default timeout for force-close to a higher value, e.g. 24hr, so we give them a sufficient timeframe to come online before we force close?
Yes, I was thinking about trying our best to close collaboratively, and only after a long timeout actually force-close. So in my mind the sequence should be:
- `withdraw all` to the specified address
- Move the `hsm_secret` file to a backup location, so the node can't accidentally be started after the decommission completes

This minimizes our onchain footprint and fees :wink:
This suggests a third API as well: `getcommissionstatus`, giving a `commisionstatus` field in the result:
- `"commissioned"` - Currently allowing all incoming connections and channel fundings.
- `"decommissioning"` - Still has channels open and not completely removed (closing transactions not yet deeply confirmed), some onchain funds still held, but rejecting (most) incoming connections and channel fundings.
- `"decommissioned"` - No channels remaining, no funds remaining, decommissioned.

I think a "fast" decommission would want to push as much of the funds as possible immediately to the target address(es); you might want this in case you suspect key material replication by an unauthorized party. A "slow" decommission instead wants low fees by waiting for everything to come onchain and then sending all the funds in a single tx to the target address, for example if you decide to burn the server you were using as a natural part of replacing servers. But implementing two forms of decommission is... twice as complicated as implementing just one. I would think the "fast" decommission, where we send funds to the target address(es) ASAP, is better, since upgrading your server is not as much of an emergency as recovering from a suspected key material replication.
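The three states proposed above could be derived from a small snapshot of the node. This is only a sketch: the `NodeSnapshot` type and its field names are hypothetical and do not correspond to any existing lightningd structure.

```python
from dataclasses import dataclass


@dataclass
class NodeSnapshot:
    """Hypothetical summary of node state; all field names are illustrative."""
    decommission_requested: bool
    open_channels: int
    unconfirmed_closes: int  # closing transactions not yet deeply confirmed
    onchain_funds_sat: int


def commission_status(snapshot: NodeSnapshot) -> str:
    """Map a snapshot onto the three proposed status strings."""
    if not snapshot.decommission_requested:
        return "commissioned"
    still_busy = (
        snapshot.open_channels > 0
        or snapshot.unconfirmed_closes > 0
        or snapshot.onchain_funds_sat > 0
    )
    return "decommissioning" if still_busy else "decommissioned"
```

A node only reports `"decommissioned"` once nothing is left to close, confirm, or withdraw.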
@ZmnSCPxj It's all about what to use as the default timeout value. I thought about using a sufficiently high value to do a slow decommission by default, BUT telling the user via the JSON-RPC response that the slow (default) timeout was chosen; in case they are in a hurry, they should repeat the command with a low timeout value.
Additionally we can offer a `fastdecommission` command (bad name, suggestions welcome) that is the same command with a quite low timeout value.
I don't quite see the complexity of switching between the two modes: we can just have a `withdrawable_amount` function that returns 0 if we can't withdraw yet. It just needs to return 0 almost always in slow mode, and it returns any on-chain amount larger than dust in fast mode. By subscribing to all notifications triggered by any fund change we can withdraw as we go or defer until everything is settled.
Fwiw I also prefer shorter RPC names in favor of (optional) command parameters.
> I don't quite see the complexity of switching between the two modes: we can just have a `withdrawable_amount` function that returns 0 if we can't withdraw yet. It just needs to return 0 almost always in slow mode, and it returns any on-chain amount larger than dust in fast mode. By subscribing to all notifications triggered by any fund change we can withdraw as we go or defer until everything is settled.
@cdecker can you try to explain this in other words?
Also, for fast decommission, I think we might want to have an option to send LN balances to a safe node first if possible, as waiting for closures and settlement introduces risks we want to avoid in fast mode. In fact, if you think about it, fast mode and slow mode are two different operations: one with speed and security in mind, the other with cost, footprint and operational continuity in mind.
Note: I updated and rewrote parts of the mission statement in the description of this issue
> > I don't quite see the complexity of switching between the two modes: we can just have a `withdrawable_amount` function that returns 0 if we can't withdraw yet. It just needs to return 0 almost always in slow mode, and it returns any on-chain amount larger than dust in fast mode. By subscribing to all notifications triggered by any fund change we can withdraw as we go or defer until everything is settled.
>
> @cdecker can you try to explain this in other words?
Just a bit of rambling about where we could differentiate the two modes. TL;DR: I don't think we need two completely distinct entrypoints into the decommission flow, we just need to differentiate when it comes to actually executing the `withdraw`. In fast mode we'd trigger a `withdraw` every time we have sufficient funds (>> dust), while in economic mode we'd wait for everything to be settled. That's the only difference imho.
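A minimal sketch of that single decision point, assuming a hypothetical `MIN_WITHDRAW_SAT` threshold standing in for ">> dust" and a caller-supplied `withdraw` callback. None of these names exist in lightningd; they only illustrate the idea.

```python
# Assumption: an arbitrary threshold well above typical dust limits.
MIN_WITHDRAW_SAT = 10_000


def withdrawable_amount(mode: str, confirmed_onchain_sat: int, pending_sat: int) -> int:
    """Return the amount to withdraw right now, or 0 to defer.

    mode                  -- "fast" or "slow" (illustrative names)
    confirmed_onchain_sat -- funds that are spendable on-chain now
    pending_sat           -- funds still in channels or unresolved closes
    """
    if mode == "slow" and pending_sat > 0:
        # Economic mode: wait until everything has settled.
        return 0
    if confirmed_onchain_sat < MIN_WITHDRAW_SAT:
        # Not worth an on-chain transaction yet.
        return 0
    return confirmed_onchain_sat


def on_funds_changed(mode, confirmed_onchain_sat, pending_sat, withdraw):
    """Call this from any notification that signals a change in funds."""
    amount = withdrawable_amount(mode, confirmed_onchain_sat, pending_sat)
    if amount > 0:
        withdraw(amount)
    return amount
```

Both modes share the same notification handler; only the return value of `withdrawable_amount` differs.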
> Also, for fast decommission, I think we might want to have an option to send LN balances to a safe node first if possible, as waiting for closures and settlement introduces risks we want to avoid in fast mode. In fact, if you think about it, fast mode and slow mode are two different operations: one with speed and security in mind, the other with cost, footprint and operational continuity in mind.
I think those are two distinct operations: draining the channels, followed by a decommission. The drain should be done regardless of whether we are fast or economical, since it minimizes the number of outputs we need to `withdraw` and allows us to spend more on on-chain fees (pushing our transaction through faster in fast mode, and making things more economical in eco mode).
See also: https://arxiv.org/pdf/2007.00764.pdf especially section 4.2, for stuff having to do with closing channels and redirecting the funds back to the owner cold wallet.
`openchannel` is currently not a chained hook. So a builtin plugin hooking into `openchannel` would prevent user plugins from using the `openchannel` hook as well.
Our options are:
- Make `openchannel` a chained hook.
- What if multiple plugins return `{ 'result': 'continue', 'close_to': 'random' }` with different `close_to` addresses? We know decommission will not do that, but multiple user plugins might.
- [...] `openchannel` as well.

> In fast mode we'd trigger a `withdraw` every time we have sufficient funds (>> dust), while in economic mode we'd wait for everything to be settled. That's the only difference imho.
We should probably also consider using `feerate=urgent` for fast mode, and `feerate=normal` or even `feerate=slow` for economical mode.
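As a rough illustration, the mode could select the feerate when building the parameters for a `withdraw` call. The mode names and the `build_withdraw_request` helper are hypothetical, and the parameter names reflect the `withdraw` RPC as I understand it, so check them against the version you run.

```python
# Assumption: "fast"/"slow" are illustrative mode names, not an existing API.
FEERATE_BY_MODE = {
    "fast": "urgent",  # get confirmed quickly, accept higher fees
    "slow": "normal",  # could also be "slow" to save even more
}


def build_withdraw_request(mode: str, destination: str) -> dict:
    """Build the parameters for a `withdraw all` call in the given mode."""
    if mode not in FEERATE_BY_MODE:
        raise ValueError(f"unknown decommission mode: {mode!r}")
    return {
        "destination": destination,
        "satoshi": "all",
        "feerate": FEERATE_BY_MODE[mode],
    }
```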
Rather than put decommissioning status in the database (which either requires the entire commissioning flow to be in lightningd rather than a plugin, or expose some commands that allow most of the commissioning flow to be done in a plugin except for the database access commands which would be nasty footguns), how about an optional file `decommissioned`? `decommission` would create this file, `recommission` would delete this file, and a user could create/delete the file themselves and the decommissioning flow would react to that because of magic `inotify`. The file could contain the string "slow" for economical decommissions, and everything else would be fast decommission.
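A sketch of the marker-file idea, assuming the file lives in the node's data directory. The helper names are hypothetical, and the `inotify` watching is left out: here the file is simply read on demand.

```python
from pathlib import Path

MARKER_NAME = "decommissioned"  # file name taken from the proposal above


def read_decommission_mode(lightning_dir):
    """Return None if commissioned, otherwise "slow" or "fast"."""
    marker = Path(lightning_dir) / MARKER_NAME
    if not marker.exists():
        return None
    # Only the exact string "slow" selects the economical flow;
    # any other content (including an empty file) means fast.
    return "slow" if marker.read_text().strip() == "slow" else "fast"


def decommission(lightning_dir, mode="fast"):
    """Create (or overwrite) the marker file."""
    (Path(lightning_dir) / MARKER_NAME).write_text(mode + "\n")


def recommission(lightning_dir):
    """Delete the marker file; a no-op if it does not exist."""
    (Path(lightning_dir) / MARKER_NAME).unlink(missing_ok=True)
```

Because the state is just a file, it survives daemon restarts and a user can create or delete it by hand.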
@ZmnSCPxj I address making openchannel_hook chainable in https://github.com/ElementsProject/lightning/pull/3960/commits/aecba6fae87b22e4f07bc4bd467c74531dbb19af. I will open a dedicated PR for this once I have added tests.
Description
Add a new pair of `lightning-cli` RPC commands that allow the permanent shutdown of a node. The commands may be called `decommission` and `recommission`. This was discussed in #3499.

These commands can be used to close all channels and prevent acceptance of new funded channels by remote peers. Since this is a basic operation (any node goes offline eventually) and also security related (the operator may decide the host may be compromised), we should not have this as an optional Python plugin but maybe as an integrated C plugin.
Decommissioning a node for operational reasons should focus on closing channels cooperatively to keep a low onchain footprint. Decommissioning a node in a hurry for security reasons requires a shorter force-close timeout, an external address and likely higher onchain fees.
`lightning-cli decommission [address_or_xpub] [timeout] [destination]`

Immediately
- If `[destination]` is given, try to send out available channel liquidity to the `[destination]` pubkey via `keysend` for a given (short) timeout, e.g. 5 minutes.
- [...] `[address_or_xpub]`, if specified.

In the meantime
- [...] `timeout`, defaulting to a slow decommission `timeout` of 24hr to give remote counterparts a chance to get online in the meantime for a cooperative close. If the user is in a hurry, they can repeat the command with a shorter `timeout` value.
- [...] `[address_or_xpub]`, if specified.
- [...] `[address_or_xpub]`, if specified.

When done
- Move the `hsm_secret` file to a backup location, so the node can't accidentally be started after the decommission completes.

`lightning-cli recommission`

Cancels an ongoing `decommission` process by:
- [...] `hsm_secret` backup, if the prior `decommission` was finished.

`lightning-cli commissionstate`

Show the user the state and progress of an ongoing decommissioning operation:
Technical Requirements and Stepstones
- [...] `commission_state` and `[address_or_xpub]` (if we go for xpub we also need to count the key index)
- [...] `rpc_command('fundchannel')`
- Make the `openchannel` hook chainable. If multiple plugins return `close_to`, take the first one and warn about it.
- [...] `openchannel` for remote fundings; if `decommission` is running, reject them
- [...] `commissioned`/`decommissioning`/`decommissioned` `state` to the RPC `getinfo` output.

Open Questions and Points of Discussion
- [...] `address_or_xpub` information etc. Also the user could simply/safely delete the file in order to stop `decommission`.
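The two `openchannel` requirements listed under the technical requirements (reject remote fundings while decommissioning, and take the first `close_to` when several plugins return one) might look roughly like this. It is a sketch with hypothetical function names, not lightningd's actual hook-chaining logic; the `error_message` field is how I understand a rejection reason is passed, so treat it as an assumption.

```python
import logging

log = logging.getLogger("decommission-sketch")


def openchannel_response(commission_state: str) -> dict:
    """Response a decommission plugin could give to an `openchannel` hook call."""
    if commission_state in ("decommissioning", "decommissioned"):
        return {"result": "reject",
                "error_message": "node is being decommissioned"}
    return {"result": "continue"}


def merge_chained_results(results: list) -> dict:
    """Combine the responses of several chained hook plugins.

    The first "reject" wins outright. Otherwise the first `close_to`
    is kept, and a warning is logged for every later one.
    """
    merged = {"result": "continue"}
    for response in results:
        if response.get("result") == "reject":
            return response
        if "close_to" in response:
            if "close_to" not in merged:
                merged["close_to"] = response["close_to"]
            else:
                log.warning("ignoring extra close_to %r, keeping %r",
                            response["close_to"], merged["close_to"])
    return merged
```

Stopping at the first rejection means a decommissioning node never accepts a channel just because another plugin returned `continue`.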