ElementsProject / lightning

Core Lightning — Lightning Network implementation focusing on spec compliance and performance

Planning for SPV backend #3858

Open ZmnSCPxj opened 4 years ago

ZmnSCPxj commented 4 years ago

First off: SPV is worse security than fullnodes. Please use fullnodes if you can.

On the other hand, if you trust a particular cloud service provider, you can run a fullnode on a cloud service, and thus not spend a lot of bandwidth on your own local connection. You might then think to run lightningd locally and point the --bitcoin parameters at the cloud service fullnode, but this actually downloads every block in full, using as much local bandwidth as a locally-run bitcoind with the blocksonly option. You could instead run an SPV server of some kind on your remote cloud service fullnode; as long as the cloud service does not secretly replace your bitcoind with minercoin2xd-unlimitedvision, you should be fine. We would then want SPV support in a locally-run lightningd, pointed only at your trusted cloud service fullnode, to limit the bandwidth needed.

Here is my current plan:

When the lightningd/bitcoind object is initialized, it checks for the following set of APIs:

gettxesbyheight is given a height to fetch, plus receive_scriptpubkeys, an array of hex-string scriptPubKeys to check, and spend_utxos, an array of objects with txid, vout, and scriptPubKey fields. It returns found as false if the block is not yet found; otherwise it returns found as true, plus blockid (the hash of the block header), header (the hex-string dump of the header), and txes (an array of hex-string transactions in the block that match the given receive_scriptpubkeys or spend_utxos).

getutxobyscid is given an scid of the BOLT-standard form BBBxTTxOO. It returns a found field, which is false if the given SCID does not match an unspent SegWit v0 32-byte witness (P2WSH) or SegWit v1 32-byte witness (Taproot) output. If true, it also returns the scriptPubKey of the output as well as the amount.

sendpsbttomempool is given a Base64 PSBT containing a completely signed transaction, plus an allowhighfees parameter, and broadcasts it. We have to switch bitcoind_sendrawtx to use struct bitcoin_tx, which conveniently contains a PSBT.

checkspent is given an array of objects. Each object represents a txo with at least txid, vout, and scriptPubKey fields, and an optional blockheight, the known confirmation height of the txo. For each txo, it checks whether the txo is spent; if so, it adds it to the spent result, an array of objects with txid, vout, and spendheight fields, with spendheight 0 for TXOs spent by unconfirmed transactions.
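As a concrete illustration, here is a minimal sketch of how a backend plugin might register these proposed methods using pyln-client; the method names and fields are the ones proposed above, while lookup_block, matches, lookup_scid, spend_height, and is_spent are hypothetical helpers:

```python
from pyln.client import Plugin

plugin = Plugin()

@plugin.method("gettxesbyheight")
def gettxesbyheight(plugin, height, receive_scriptpubkeys, spend_utxos):
    """Return the block at `height`, filtered to matching transactions."""
    block = lookup_block(height)  # hypothetical helper
    if block is None:
        return {"found": False}
    txes = [tx for tx in block.txes
            if matches(tx, receive_scriptpubkeys, spend_utxos)]  # hypothetical
    return {"found": True, "blockid": block.hash,
            "header": block.header_hex, "txes": [tx.hex for tx in txes]}

@plugin.method("getutxobyscid")
def getutxobyscid(plugin, scid):
    """Resolve a BBBxTTxOO SCID; P2WSH and Taproot outputs only."""
    utxo = lookup_scid(scid)  # hypothetical helper
    if utxo is None:
        return {"found": False}
    return {"found": True, "scriptPubKey": utxo.script_hex,
            "amount": utxo.amount_sat}

@plugin.method("sendpsbttomempool")
def sendpsbttomempool(plugin, psbt, allowhighfees):
    """Extract the finalized transaction from the PSBT and broadcast it."""
    raise NotImplementedError  # backend-specific

@plugin.method("checkspent")
def checkspent(plugin, txos):
    """Report which txos are spent; spendheight 0 means spent-in-mempool."""
    return {"spent": [{"txid": t["txid"], "vout": t["vout"],
                       "spendheight": spend_height(t)}  # hypothetical
                      for t in txos if is_spent(t)]}    # hypothetical

plugin.run()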

Internally, we also have these changes to lightningd/bitcoind functions:

Finally, we also add a new lightningd command, checknewblock, which triggers chaintopology into checking for a new block. The backend plugin should call this when the block tip height changes. Normal bitcoin-cli and BIP158 plugins will poll, though a BIP158 plugin could have a peer push a new block at it (not likely unless the peer is a miner). An Electrum plugin would subscribe to blockchain.headers.subscribe. For now we also do polling ourselves in lightningd to support older plugins that do not perform this call.
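For the push case, the plugin side could be as small as this sketch: blockchain.headers.subscribe is the real Electrum Stratum method, checknewblock is the lightningd command proposed above, and electrum_subscribe is a hypothetical helper:

```python
# Push new tips at lightningd instead of being polled.
def on_new_header(header):
    # The tip moved: ask lightningd to re-run its chaintopology check.
    plugin.rpc.call("checknewblock")

# blockchain.headers.subscribe is the real Electrum method; the wrapper
# around it is hypothetical.
electrum_subscribe("blockchain.headers.subscribe", on_new_header)
```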

After we change lightningd, we can then update bcli to use the new interface. To exercise the new interface, gettxesbyheight will also apply filtering to transactions in the block, and will not provide txes that do not match the given filters. getutxobyscid could have a simple on-disk cache: a mapping from SCID to either a (txid, vout) pair or Nothing. If an SCID exists in the on-disk cache and maps to Nothing, it is a known-invalid SCID; if it maps to a (txid, vout), bcli queries bitcoin-cli gettxout to see whether it is still unspent. If the SCID does not exist in the on-disk cache, we scan the block and update the cache, and we also iterate over all possible SCIDs in that block (even absurd ones like BBBx16777215x65535) and map those that do not match a P2WSH or Taproot output to Nothing.
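A sketch of that cache logic, assuming a cache dict persisted to disk and a hypothetical bitcoin_cli wrapper (gettxout is the real bitcoind RPC):

```python
def query_scid(scid):
    if scid in cache:
        entry = cache[scid]
        if entry is None:
            return {"found": False}  # known-invalid SCID
        txid, vout = entry
        utxo = bitcoin_cli("gettxout", txid, vout)  # wrapper returns None once spent
        if utxo is None:
            return {"found": False}
        return {"found": True,
                "scriptPubKey": utxo["scriptPubKey"]["hex"],
                "amount": utxo["value"]}
    # Cache miss: scan the block, record its P2WSH/Taproot outputs, and
    # map every other (txindex, outnum) position in it to None.
    scan_block_and_update_cache(scid_height(scid))  # hypothetical helper
    return query_scid(scid)
```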


Original text:

First off: SPV is worse security than fullnodes. Please use fullnodes if you can.

Currently, the interface to bcli inherently assumes the backend is a fullnode of some kind. This fullnode is always fully trusted to not lie, and we always download every block from the backend.

When doing block synchronization, what we can do is to have a txfilter containing:

Now, it is important that we keep track of UTXOs so we can track the spendedness of channels. However, each such UTXO has its own address as well.

Most SPV protocols (e.g. BIP158, Electrum) are address-centric. So we should provide just addresses to the backend. The backend should inform us

So I propose for the bcli getrawblockbyheight:

Then:


Ideally, our fullnode-trusting bcli should also apply the txfilter to blocks it returns. This is to ensure that any bugs in maintaining the lightningd-side txfilter can be caught by our test suite without having to add SPV plugins of our own.

Unfortunately, to determine if an address is spent from, we would require a txindex, or have the plugin maintain its own transaction index. Both are obviously undesirable.

However, in practice:

So let me propose instead passing in two arguments to getrawblockbyheight of bcli:

Then:

This allows us to check in our test suite that lightningd is maintaining its txfilter correctly, using only the bcli plugin, while allowing address-based SPV interfaces to be used.

ZmnSCPxj commented 3 years ago

For now, a minimal scan of the chaintopology component suggests we keep track of the below:

Unfortunately, BIP158 does not include entire transactions: it only covers spends from, and payments to, specific scriptPubKeys.

However, transactions-being-tracked is only used in two places:

So I think we can modify the watch_txid function and the struct txwatch structure to include an address that the transaction is funding.

That way, the only interface absolutely necessary is a scriptPubKey that is spent to.

Of note: watch_txo is only used in the same places. We also need to modify watch_txo to add the scriptPubKey the UTXO has, which is known in both places.

ZmnSCPxj commented 3 years ago

Okay, I think getrawblockbyheight is just a bit too low-level.

Instead, I propose a new bcli method.


getfilteredblockbyheight height receive_scriptpubkeys spent_utxos

DESCRIPTION

getfilteredblockbyheight returns the block header, and any appropriate transactions, of the block at the given height.

receive_scriptpubkeys is an array of Bitcoin scriptPubKeys; each is a hexadecimal string.

spent_utxos is an array of objects. Each object contains the fields txid and outnum which indicate the UTXO being monitored, as well as a field scriptpubkey which is the scriptPubKey of that UTXO.

RETURN VALUE

This returns a result object with the following fields:

This command returns null for the above fields if the backend has not reached that block height yet.

The txes array reliably contains every transaction that spends from one of the given UTXOs in spent_utxos, or that pays to one of the given scriptPubKeys in receive_scriptpubkeys. However, it may report more transactions than that (i.e. it may add transactions that false-positive match the given constraints). Clients of this method should rely only on the matching transactions being returned, and should be prepared to handle transactions that do not match the constraints.
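To illustrate the shape of a call (the return fields are not enumerated above, so this borrows blockid/header/txes from the gettxesbyheight description in the updated plan at the top; treat the response fields as an assumption):

```python
request = {
    "height": 650000,
    "receive_scriptpubkeys": ["0020" + "11" * 32],  # example P2WSH spk
    "spent_utxos": [{"txid": "ab" * 32, "outnum": 0,
                     "scriptpubkey": "0020" + "22" * 32}],
}
response = {
    "blockid": "00" * 32,  # block header hash (assumed field)
    "header": "...",       # 80-byte header, hex (assumed field)
    "txes": ["..."],       # matching raw txes, possibly with false positives
}
```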


By using the above, we can use just an Electrum Stratum connection. Electrum Stratum does not look like it can support downloading entire blocks, which is what getrawblockbyheight would need. But it can provide the history of a scriptPubKey, i.e. all transactions that spend from and pay to that address, which is why we should switch to the above interface instead.
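For reference, the Stratum protocol keys histories on a scripthash rather than an address: the single SHA256 of the raw scriptPubKey, hex-encoded in reversed byte order. A sketch (electrum_call is a hypothetical JSON-RPC helper; the method names are real):

```python
import hashlib

def electrum_scripthash(scriptpubkey_hex: str) -> str:
    spk = bytes.fromhex(scriptpubkey_hex)
    return hashlib.sha256(spk).digest()[::-1].hex()

# Every tx paying to or spending from this scriptPubKey:
history = electrum_call("blockchain.scripthash.get_history",
                        electrum_scripthash("0020" + "11" * 32))
# -> [{"tx_hash": "...", "height": ...}, ...]
```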

Thoughts? @darosior in particular?

ZmnSCPxj commented 3 years ago

Hmmm, so I was looking at our code, and it seems we have an existing bitcoind_getfilteredblock API as well.

As best as I can determine, bitcoind_getfilteredblock reports the TXOs created in the block which are P2WSH outputs and are still unspent now. It is used when querying the information from an SCID.

Electrum has a specific blockchain.transaction.id_from_pos which is perfectly designed for SCID queries. However, plain Bitcoin does not have a simple way of getting a transaction from a specific block at a specific index in that block. Neither does BIP158 Neutrino --- how the heck does Neutrino validate SCIDs??
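A sketch of an SCID lookup over Electrum (blockchain.transaction.id_from_pos and blockchain.transaction.get are real Stratum methods; electrum_call and parse_tx are hypothetical helpers):

```python
def txo_from_scid(height, txindex, outnum):
    txid = electrum_call("blockchain.transaction.id_from_pos",
                         height, txindex)
    raw = electrum_call("blockchain.transaction.get", txid)
    tx = parse_tx(raw)
    if outnum >= len(tx.outputs):
        return None
    # amount + scriptPubKey; spentness must be checked separately.
    return tx.outputs[outnum]
```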

Hmmm.

Now, in order to prevent too much activity from interacting with bitcoind, lightningd caches blocks and the P2WSH TXOs created in them. This caching is done in the database. So if some SCID claims to have been created in a block lightningd has not downloaded, we go download it, checking only the P2WSH unspent outpoints. Then we store the block, plus all P2WSH unspent outpoints, in the db, and then we check whether the SCID is actually valid.

However, the above will not work well with Electrum, since we cannot simply download a block.

Further, the spentness of P2WSH outpoints depends on us processing every transaction in a block. We cannot just depend on data that involves only our own UTXOs and addresses!

Hmmmm. I suppose this needs more design thinking.

ZmnSCPxj commented 3 years ago

All right, let us add a new getutxobyscid. It returns the exact txid, amount, scriptPubKey, and the height it was spent at (or 0 if it is not spent), plus a found field that is true. If the given SCID does not match any transaction outpoint, it just returns a found field that is false.

A backend plugin may choose not to implement getutxobyscid. It signals this by providing a getutxobyscid that always fails using a specific error code. If so, it must commit to implementing getrawblockbyheight.

The backend plugin getfilteredblockbyheight should also return a flag, all. If this flag is present and set, then the backend gave all the transactions in the block. If absent, or false, then the backend did not give all transactions in the block.

We also need to create a new db table, utxoset_blocks, with the same schema as blocks. When upgrading the db, it just copies the blocks table.

The utxoset_blocks table records those blocks which we know we have seen in full, i.e. not filtered as above.

Then, in gossip_control, if a gossip_get_txout message is received:

Then, on normal synching at the block tip:

Thoughts? @cdecker in particular since this is a massive change in chaintopology and wallet db. Sigh. The complication is in short-channel-ids.

darosior commented 3 years ago

Thoughts? @darosior in particular?

Supporting protocols that expose a less natural (scriptPubKey-based instead of txid/txout-based) interface would be great, as it would allow using more backends than just bitcoind or esplora, such as the low-storage-cost electrs.
That said, I'm not sure it makes sense for us to support Stratum, as we fetch a lot of outputs and such backends would be quite inefficient or not work at all (EPS for example), since they are designed for a single onchain wallet (so only the Python Electrum server would be supported, in addition to the Blockstream fork of electrs).

This comes down to the reason I did not go for this initially: efficiency. We moved from getoutput-based polling (3 calls to bitcoind per output per channel announcement) to get_filteredblock (1 call to bitcoind each reasonable amount of time) for more efficiency with gossip.
So moving back to one call per channel announcement would be kind of a regression, unless the plugin does some sort of caching, in which case it needs to cache the state to instantly answer lightningd's getoutputbyscid.
This means more complexity, and what would have been small logic changes to lightningd (e.g. supporting a new Elements hotness by changing the parsing of the raw blocks received) would need to be replicated in all plugins.

There is still BIP158 which would be nice to have though. So this is trading (a lot of) complexity for BIP158 support: I have no strong opinion on this.

ZmnSCPxj commented 3 years ago

BIP158 does not have any specific support we can get for the short-channel-id-lookup case.

What we can do with BIP158 is continue with the current system: lightningd requests the raw block, and the BIP158 backend connects to a random fullnode and downloads the entire block. We store the transaction outputs that are P2WSH, and determine their spendedness. That means adding an advisory scriptPubKey argument to the getutxout command, which is ignored by the bitcoin-cli plugin, but informs a BIP158 backend to scan all saved filters for the scriptPubKey being spent.

But caching at the plugin level is probably needed indeed, especially for Electrum and BIP158 backends. Sigh.

Then a backend plugin could support getutxobyscid. If this method is found, we use it exclusively for gossip_get_txout requests and ignore the lightningd db, on the assumption that the backend has implemented all the needed optimizations; in particular, the Electrum backend has a specific command for such queries.

This means more complexity, and what would have been small logic changes to lightningd (e.g. supporting a new Elements hotness by changing the parsing of the raw blocks received) would need to be replicated in all plugins.

Maybe the block-synching command can be allowed to return multiple options. It gets a set of UTXOs and addresses the lightningd wants to monitor. Then it could return ONE of the following:

Thus, a plugin need not parse block data. However, the array-of-transactions return would be used by an Electrum backend, and presumably the Electrum server is the one that parsed the block for us, so the new Elements parsing thing need not be understood by the plugin either. BIP158 backends would just match the addresses against the BIP158 filters, and download the entire block.
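The option list itself is elided above, but going by the surrounding discussion, the two answer shapes might look something like this (purely illustrative):

```python
# A backend that can serve whole blocks answers with the raw block and
# all=True, so lightningd can derive every SCID event itself:
full_block_answer = {"all": True, "block": "..."}   # raw block, hex
# An Electrum-style backend answers with just the matching transactions:
filtered_answer = {"all": False, "txes": ["...", "..."]}
```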

If the block-synch command returns:

Then lightningd knows it can acquire accurate SCID-create and SCID-delete information for that block.

But if the block-synch command returns:

Then lightningd knows that it cannot get SCID-create and SCID-delete information for that block.

So I think a new table of blocks for which we have full knowledge of SCID events should be maintained by us (i.e. the utxoset_blocks I proposed above).

If lightningd gets a gossip_get_txout and a plugin implements getutxobyscid, then we just depend on that. Else:

The upshot of this is that with a BIP158 backend, we can "spontaneously catch" some of the short-channel-id updates if it happens that a block matches the filter we request.

Aargh. The hard problem is the SCID lookup!

darosior commented 3 years ago

I think it's cleaner for us to make a (backward compatible) breaking change at this point.

Bitcoin backend, revived

AmkG commented 3 years ago

Use getblockcount for the topology update

We need to be aware of reorgs, especially those that take, say, 3 or 4 blocks and might include conflicting transactions affecting our wallet or channels, so I think it is mildly undesirable to just depend on a count rather than a header chain.

darosior commented 3 years ago

Yes, sure (I meant getrawblockbyheight but without the getrawblock part). Edited to be explicit.

Also, an even simpler upgrade mechanism: if a plugin registers getoutputwhatever then lightningd switches to the new behaviour, otherwise it falls back to parsing the raw blocks.

ZmnSCPxj commented 3 years ago

We also expose a getutxout method. This is used in the utxoset / scid tracking, but also in the dev-forget-channel implementation, to check that the UTXO of the channel being forgotten is still unspent.

Now, for scid tracking, I believe what we are now planning is to let the plugin handle this by implementing a getutxobyscid and doing all the block parsing in the plugin.

However, for dev-forget-channel this is not appropriate. One use-case for dev-forget-channel is a funding transaction that never confirms for weird blockchain reasons, so there is no scid. We can implement a getutxoutv2 that accepts an additional scriptpubkey argument.

Then getutxoutv2 is invoked only in dev-forget-channel (the only other current use of getutxout is in scid tracking, which we want to hand over to the plugin side completely). This requires matching the scriptpubkey against downloaded Neutrino filters, and (re-)downloading the appropriate blocks that match it to see creation and spend events. This is potentially an expensive operation, but if it is triggered only by dev-forget-channel then it should be acceptable.

Alternately, we can just relax the dev-forget-channel checks: if the channel has an scid, it uses getutxobyscid; if it does not have an scid, dev-forget-channel does not do any further checks. dev-forget-channel is a "big boy developer" command, so maybe that is acceptable?

ZmnSCPxj commented 3 years ago

@darosior @cdecker @niftynei @rustyrussell I have updated the original post with the current plan I have.

ZmnSCPxj commented 3 years ago

Hmmmm..

So I went digging about in gossipd, and I found that all gossip_get_txout messages are triggered by channel announcements. And channel announcements include both Bitcoin pubkeys. From the two Bitcoin pubkeys we can derive the funding script, from there the P2WSH hash, and from there the scriptPubKey. This would greatly help BIP158 backends! So I propose that getutxobyscid should also be given the scriptPubKey expected at the given SCID position. This means we should also include the scriptPubKey in the gossip_get_txout message.
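A sketch of that derivation, per BOLT 3 (the funding output is a P2WSH over a 2-of-2 CHECKMULTISIG of the lexicographically sorted funding pubkeys):

```python
import hashlib

def funding_scriptpubkey(pubkey1: bytes, pubkey2: bytes) -> bytes:
    k1, k2 = sorted([pubkey1, pubkey2])   # lexicographic order per BOLT 3
    witness_script = (b"\x52"             # OP_2
                      + bytes([len(k1)]) + k1
                      + bytes([len(k2)]) + k2
                      + b"\x52\xae")      # OP_2 OP_CHECKMULTISIG
    # P2WSH scriptPubKey: OP_0 <sha256(witness_script)>
    return b"\x00\x20" + hashlib.sha256(witness_script).digest()
```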

Another is the gossip_outpoint_spent message. Currently we just blast this message at gossipd at each new block for every spent P2WSH outpoint. But if we are monitoring these in the new backend API, we should add the outpoint to a filter set that we pass to the gettxesbyheight command. So I think we should make it explicit that gossip_get_txout also adds the outpoint to the set of outpoints being monitored on behalf of gossipd by gossip_control, which will then be informed via gossip_outpoint_spent.

(But the above risks failing to notice that a channel has been closed, due to reorg-related races between chaintopology and gossip_control. A brute-force fix would be to remove gossip_outpoint_spent and have gossipd periodically re-query, via gossip_get_txout, every SCID it thinks is a channel. Or we can go with both, and keep the rate of brute-force re-querying low: e.g. on getroute, we can query the short-channel-ids of all the returned channels after returning the route, so that if one of them turns out to have been closed under us, then when the payment attempt fails and getroute is re-attempted, we will not re-propose that channel in future payment attempts.)

ZmnSCPxj commented 3 years ago

Okay, so I started digging into actual code, and it looks like we cannot remove getutxout easily.

My plan was to remove this in favor of getutxobyscid. getutxobyscid is only guaranteed to work for P2WSH and Taproot outputs, because it is intended for channel gossip lookup.

The reason why we want to remove getutxout is because all SPV interfaces are address-based, but getutxout accepts a TXID-outnum.

However, there are a few places where we use getutxout to track the spendedness of our own onchain coins:

The first, we could fudge: if it has an scid then the channel funding outpoint exists, and we can check with getutxobyscid.

The second and third, not so much, sigh.

Again, we cannot just keep the getutxout method, since SPV interfaces do not have a good way to implement this.

Perhaps we can add a checkspent command, which accepts an array of {'txid': 'xxx', 'outnum': 42, 'script': 'xxxx'} objects. It then returns an array of the same size, with the same objects, except each gains an additional 'spent' field that is either true or false. This command is what wallet_clean_utxos and dev-rescan-outputs would use.

For bitcoin-cli backend, we just do the getutxout loop in the plugin instead of in lightningd.

For Electrum backend, we group the UTXOs by address and query the server for each address, then figure out spendedness.

For BIP158 backends, this is a lot more uncomfortable! Starting at the chain tip, we check whether any of the UTXOs match the Neutrino filter for the block. If one does, oh no, we download the block and see whether any UTXO in the parameter input set is spent in that block. If one is spent, we remove it from the parameter input set, mark it as "spent", and put it in our results set. We keep going backwards through the blockchain until every UTXO is known to be spent (i.e. the parameter input set is empty) OR we reach when_lightning_became_cool, at which point we mark the entire remaining parameter input set as "unspent", put them into the results set, and return it. This makes it a bandwidth-intensive operation. That may be acceptable for dev-rescan-outputs, which is used in "recovery" situations, but may not be acceptable for wallet_clean_utxos.
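A sketch of that backward scan, with hypothetical get_filter, filter_matches_any, download_block, and block_spends helpers (BIP158 basic filters match scriptPubKeys, so each candidate txo carries its scriptPubKey):

```python
def checkspent_bip158(txos, tip_height, floor_height):
    pending = list(txos)  # txos whose spentness is still unknown
    spent = []
    for height in range(tip_height, floor_height - 1, -1):
        if not pending:
            break
        filt = get_filter(height)
        spks = [bytes.fromhex(t["scriptPubKey"]) for t in pending]
        if not filter_matches_any(filt, spks):
            continue
        block = download_block(height)  # the expensive part
        for t in list(pending):
            if block_spends(block, t["txid"], t["vout"]):
                pending.remove(t)
                spent.append({"txid": t["txid"], "vout": t["vout"],
                              "spendheight": height})
    # Anything still pending at floor_height is reported as unspent.
    return spent
```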

Sigh.


EDIT

I realized that struct utxo, which is what wallet_clean_utxos and dev-rescan-outputs use, also includes a blockheight, the height at which the UTXO is known to be confirmed. We could use this to limit how deep the BIP158 backend has to scan the blockchain. But in any case, an unspent UTXO will still trigger the filter at its creation blockheight and force a block download in the BIP158 backend; we cannot omit this, since a UTXO can be spent in the same block it is created in.

In addition, at startup, wallet_clean_utxos can probably pre-filter the UTXOs it passes into checkspent: for example, it can consider only available and reserved UTXOs, or even just reserved UTXOs.


EDIT2:

The comment on wallet_clean_utxos says it only cleans up reserved UTXOs. In that case, we should take that comment to heart and only pass reserved UTXOs to checkspent.

ZmnSCPxj commented 3 years ago

Another complication is that getutxout calls into gettxout. gettxout has an optional third argument, include_mempool, which defaults to true if not specified, and bcli does not specify it. This means that getutxout also considers the mempool.

This is actually a further complication for our current reservation system with RBF (#3867 #3845 et al.)! In particular, suppose a transaction is made completely valid via signpsbt, then broadcast directly (or another node broadcasts it via its sendpsbt, not our own). The outputs reserved by the original fundpsbt/utxopsbt remain reserved, and you can utxopsbt again with allow_reserved to RBF it. However, if the node is restarted and the transaction is still unconfirmed, the outputs will now be marked as spent and you cannot utxopsbt again with allow_reserved.

In particular, that is how it would behave if the bitcoind we are talking to is not set to blocksonly. In blocksonly mode the bitcoind does not maintain a mempool and on restart we would still see the coins as unspent.

(Nowhere do we explicitly say that we do not support blocksonly!)

So if we are going to go RBF, we need to start being a lot more aware of the mempool as well, and aware that the mempool is not synchronized with miner mempools.

Maintaining a mempool greatly complicates a BIP158 backend (Electrum is fine since you can query the mempool of the Stratum server). A BIP158 backend cannot maintain a complete mempool since the point of an SPV wallet is to not maintain a complete UTXO set (by not downloading every block).

This means that a BIP158 backend has to know the UTXO set of at least our onchain wallet, so that it can filter out txes from the mempool that do not involve our UTXOs. So it looks to me that the backend plugin has to do UTXO set management itself, instead of lightningd... aargh.

Okay, so let us just have the BIP158 backend maintain all transactions broadcast by this lightningd as its "mempool". This probably means replacing sendrawtransaction with a PSBT-based sendpsbttomempool, so that the backend plugin can access the scriptPubKeys of the spent coins (a BIP158 "light mempool" can then contain all the transactions broadcast by this lightningd, and at least remove from the "light mempool" any transaction that spends coins also spent by an incoming block). This still prevents a BIP158 backend plugin from reporting that a PSBT-based transaction spending our own funds was broadcast by somebody else, though, at least until confirmation. Sigh.
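A sketch of such a light mempool (parse_psbt, broadcast, and block_spends are hypothetical helpers):

```python
light_mempool = {}  # txid -> decoded tx, only txes *we* broadcast

def sendpsbttomempool(psbt, allowhighfees):
    tx = parse_psbt(psbt)          # PSBT gives us the spent scriptPubKeys
    light_mempool[tx.txid] = tx
    broadcast(tx)

def on_block(block):
    # Evict any remembered tx whose inputs were spent by this block,
    # whether by the tx itself confirming or by a conflicting spend.
    for txid, tx in list(light_mempool.items()):
        if any(block_spends(block, i.txid, i.vout) for i in tx.inputs):
            del light_mempool[txid]
```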

darosior commented 3 years ago

Fwiw we do not support blocksonly as we require fee estimates.

Bitcoind has a quirk, though, where you could have outdated estimates if started in blocksonly... (I've a PR for that)


ZmnSCPxj commented 3 years ago

Hmmm, but you can replace plugins, right? You could have a plugin which runs bcli in the background and forwards all methods except estimatefees, which instead downloads from johoe.

Naaaa that is dumb. We should explicitly document that we do not support blocksonly.

darosior commented 3 years ago

Hmm even better: we could make a getnetworkinfo call at startup to check if we relay txs, and display a nice error if not. I'm going to PR that.


ZmnSCPxj commented 3 years ago

Now that you mention mempools and feerates... how do BIP158 SPV nodes guess reasonable feerates?? Sigh. If the BIP158 SPV node receives arbitrary transactions into its mempool, that uses much the same bandwidth as a fullnode (after IBD, at least), since modern fullnodes use compact blocks... whut. How would a BIP158 SPV backend get fee estimates? (I am tempted to just say johoe...)

Saibato commented 3 years ago

BIP158 SPV backend get fee estimates? (I am tempted to just say johoe...

Now it gets interesting. Coming from the quantum aspect of reality, I am always in favor of pure randomness. I do not see the point of regulating or computing everything down to the last bit.

To show a decent will in the absence of determined knowledge is often enough to push things forward in the desired direction.

Just dice a bit. And RBF the rest?

ZmnSCPxj commented 3 years ago

RBF is difficult for us to handle in the fundchannel case: we need a stable txid. We could go with CPFP+RBF, so that the fundchannel txid is stable, but that means we cannot do a fundchannel "all" to put all available funds into a channel, since there would be no change output with which to CPFP+RBF the resulting transaction.

And until we fully implement anchor commitments, we need decent feerate estimation for the update_fee message, since we cannot CPFP the unilateral close transaction (we can only CPFP+RBF it once we implement anchor commitments fully). I guess that is an argument to go anchor commitments... sigh.

Saibato commented 3 years ago

Yup: with CPFP, RBF, or good estimation we are in the hands of the miners, and a signed tx only ever has a probability, which we would like to influence, of finding its way out of the mempool and then not getting reorged out.

It's a bit like programming Grover-like oracles.

That's my overall reasoning for dicing at some point and letting s** happen, instead of relying on the central point of johoe.

Anchor commitments, hmm: "I process mempool message, please wait.., please wait.., please wait.."? At some point you have to dice again to stop the wait loop, or I guess you have thought about that, but I do not see how you mitigate this. And we also have to calculate the closing fee, which the spec makes a MUST, not a SHOULD, so sigh sigh ...

ZmnSCPxj commented 3 years ago

Okay, here is a horrible idea for fee estimation with a BIP158 backend:

darosior commented 3 years ago

Bitcoin peers that maintain mempools should be able to give inv for txids in their mempools, if my understanding of the P2P protocol is correct.

FWIW you could also use the mempool p2p message (if I remember its name correctly), but in this case you... basically trust your peers with your security model (timely confirmation of transactions)?.. I think it's a terrible idea, and in any case an SPV lightning node needs to trust some "one" to give it reliable fee estimates, so I'm not sure it should be shipped by default?

ZmnSCPxj commented 3 years ago

I think it's a terrible idea, and in any case an SPV lightning node needs to trust some "one" to give it reliable fee estimates, so I'm not sure it should be shipped by default?

SPV is certainly not something I would consider as "safe to ship by default", even if we had a perfectly trustless method of fee estimation.

FWIW even a tx-broadcasting fullnode asks all its peers for all the txes in their mempools anyway, and thus trusts them not to lie about their mempools; we are simply reducing the bandwidth use by doing a random sampling that we assume to be representative of the entire data set, which is good enough for scientific papers anyway, so it should be good enough for us here in engineering-land (yeah, right).

And if we can fully move to anchor commitments, then we can get away with a lousy lowball initial estimate and just CPFP+RBF later.

For example, a withdraw or fundchannel transaction might have an anchor output (but only a single one, controlled by us) if there is no change output; we can then later CPFP+RBF via either a change output or an anchor output if the initial fee estimate turns out to be too low.

ZmnSCPxj commented 3 years ago

lightningd is schizophrenic. We have struct txwatch, which watches specific transactions, and struct txowatch, which watches txos; we seem to use these primarily for onchaind and peer_control, and those components do not have their own sub-object but instead use struct lightningd. Then we have struct txfilter, which we use to look for receives into the onchain wallet, and which is stored in struct lightningd. We have struct outpointfilter (distinct from struct txowatch!), which we use to look for spends from our wallet, but which is stored in struct wallet. Finally, gossip_control needs to inform gossipd about channel spends so that gossipd can remove channels from the routemap, but it just implicitly trusts that every block will be downloaded and passed through it.

We currently get away with this because we always get every transaction in every block, so even if our filters are schizophrenic it does not matter.

Anyway, the changes needed are:


Another thing to consider is adding a waitblockheightbackend. For Electrum backends we can subscribe to header chain updates to complete calls to waitblockheightbackend; this is more bandwidth-efficient, since Electrum clients do not spam the server with repeated queries about the blockheight; the Electrum server does the polling against its local bitcoind (which is presumably faster, since it is connecting locally). bitcoin-cli backends can poll getblockcount. BIP158 backends would periodically poll peers, or if somebody (a miner?) force-pushes a block at us, then no polling is needed at all.

The issue is backward compatibility with the v1 backend interface. Ideally the backend plugin should implement waitblockheight directly, but if we are using a plugin with the v1 backend, we need to implement the polling locally and build waitblockheight on top of that. To emulate waitblockheightbackend for v1 we would need to poll getrawblockbyheight, but since the chaintopology loop will do a large waitblockheight-gettxesbyheight loop, we should cache the getrawblockbyheight result, else alternative v1 backend plugins might get repeatedly spammed with getrawblockbyheight. Hmm.
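A sketch of that emulation (call_backend and POLL_INTERVAL are illustrative; getrawblockbyheight returning null fields below the tip matches the existing v1 behaviour):

```python
import time

POLL_INTERVAL = 30  # seconds; illustrative
cache = {"height": None, "block": None}

def wait_blockheight_v1(target):
    """Emulate waitblockheightbackend over a v1 plugin: poll
    getrawblockbyheight, caching the answer so the following
    gettxesbyheight pass reuses it instead of re-fetching."""
    while True:
        res = call_backend("getrawblockbyheight", target)  # hypothetical
        if res["block"] is not None:
            cache["height"], cache["block"] = target, res["block"]
            return
        time.sleep(POLL_INTERVAL)
```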

ZmnSCPxj commented 3 years ago

Add pruning label.

The reason we cannot work well with pruning is that we need historical information on channel opens, especially on a fresh install which knows nothing about older channels. Without pruning we just look up the old blocks and check that the channel exists. With pruning we cannot validate historically long-lived channels, so we cannot send them as gossip and cannot rely on them for routing.

A possible solution for pruning would be to process blocks at the tip as usual from the fullnode, but for historical channels below the pruning cutoff, ask a server (e.g. Electrum servers have a decent interface for querying historical channels by SCID).

This is still less desirable than running a non-pruning fullnode, since you are trusting the server not to lie to you about map data; if you are e.g. evaluating map data for where best to make channels (i.e. autopilot), the server can lie to you and bias your selection towards surveilling nodes.

cdecker commented 3 years ago

The worst a server can do is refuse to return a block, if we verify the Merkle root and the blockhash against the local pruned node, which still maintains the full list of past block headers. Refusing to return a block is a pretty obvious failure, so I feel pretty good about anchoring our trust in a pruned node, and using the server to get the details in a tamper-proof manner 🙂

ZmnSCPxj commented 3 years ago

I was thinking in terms of Electrum's blockchain.transaction.id_from_pos, which would be fairly cheap compared to downloading an entire block (though admittedly a downloaded block is cached by lightningd, so we can afterwards validate every scid in it). But it seems the same validation you mention can be done with that too: blockchain.transaction.id_from_pos returns a txid and optionally a merkle proof, and we can validate the merkle proof against our local pruned node's block header.

However, with the Electrum blockchain.transaction.id_from_pos interface, the server can claim that the given transaction index is out of range (i.e. that the block is empty), and cannot provide a proof of this assertion. What a node can do is binary-search the transaction index until it locates the last transaction index that the server will admit exists in that block, validating the merkle proof of every "found" transaction, and checking that the last transaction is indeed at the "rightmost" position of the merkle tree (I think this is possible; my naive reading of BIP180 suggests it is always possible to prove how many txes are in a block, it is only that the Electrum interface does not return that proof, though the details of the merkle tree might prove me wrong). Or maybe just download the entire block at that point; that seems like a lot of work and round trips. See the sketch below.
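A sketch of that binary search, assuming a hypothetical electrum_call wrapper and ServerError exception; blockchain.transaction.id_from_pos with merkle=true is the real Stratum call, and the proof check follows the usual double-SHA256 Bitcoin merkle tree with hashes in reversed-hex display order:

```python
import hashlib

def dsha(b: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def verify_merkle(txid_hex, pos, branch, merkle_root_hex):
    # Displayed hashes are byte-reversed relative to the internal order.
    h = bytes.fromhex(txid_hex)[::-1]
    for sibling in branch:
        s = bytes.fromhex(sibling)[::-1]
        h = dsha(s + h) if pos & 1 else dsha(h + s)
        pos >>= 1
    return h == bytes.fromhex(merkle_root_hex)[::-1]

def last_admitted_index(height, merkle_root_hex, hi=1 << 15):
    """Binary-search the largest tx index the server admits exists at
    `height`, verifying every "found" answer against our local header.
    1 << 15 is assumed as a generous upper bound on txes per block."""
    lo = 0
    while lo < hi:
        mid = (lo + hi) // 2
        try:
            r = electrum_call("blockchain.transaction.id_from_pos",
                              height, mid, True)  # merkle=true
            if not verify_merkle(r["tx_hash"], mid, r["merkle"],
                                 merkle_root_hex):
                raise ValueError("server gave a bogus proof")
            lo = mid + 1
        except ServerError:  # server claims the index is out of range
            hi = mid
    return lo - 1  # -1 would mean the server admits no txes at all
```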