Mh at some point I think someone will propose something, but still looking for a well-tested plugin
I am writing one, https://github.com/coffee-tools/folgore, with failure recovery too, but it is not yet mature enough to run as a default plugin in CLN. It can also fetch blocks from the network using nakamoto.
@nepet is working on integrating the recently added `getblockfrompeer` (Bitcoin Core) into the `bcli` plugin. While doing so he found a couple of rough edges around that feature, and so he's discussing with the Bitcoin Core devs how to improve it. So this is being worked on, but the timeline is not quite clear yet.
I'll let @nepet add color to this, and he will likely have more information on the progress and dependencies.
uh this is interesting! Nice!
Yeah, it's a new JSON-RPC call in `bitcoind` that fetches the block, but it doesn't return it outright; instead it adds it to the block store on disk, and then we fetch it with a normal `getblock`. However, this is racy: if pruning kicks in in between, we won't get the block despite just having fetched it.
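To make the two-step flow concrete, here is a minimal sketch. It assumes an injected `rpc(method, *params)` callable that talks to bitcoind and raises a hypothetical `RpcError` on failure; neither name comes from bcli or Bitcoin Core. Only the RPC method names `getblockfrompeer` and `getblock` are real.

```python
class RpcError(Exception):
    """Hypothetical error raised by the injected rpc callable (an assumption)."""


def fetch_pruned_block(rpc, block_hash, peer_id):
    """Ask bitcoind to download a block from a peer, then read it from disk.

    Returns the raw block hex, or None if the block is not on disk when we
    read it (not yet downloaded, or already pruned again: the race above).
    """
    # Step 1: request the download. The block itself is NOT returned here;
    # bitcoind stores it in its block store once the peer delivers it.
    rpc("getblockfrompeer", block_hash, peer_id)
    try:
        # Step 2: a normal getblock (verbosity 0 = serialized hex).
        return rpc("getblock", block_hash, 0)
    except RpcError:
        # The block is not available on disk at this moment.
        return None
```

A caller would treat `None` as "try again later" rather than as a hard failure.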
Oh, I see that @luke-jr is working on it in https://github.com/bitcoin/bitcoin/pull/19463, maybe I will help review it to move this forward
Thanks for describing it btw
LND's solution is to bypass the bitcoin backend by directly querying peers for a missing block. This involves peer management, serialisation, block validation and sanity checks, which is a bit of reinventing the wheel and introduces possible attack vectors if done incorrectly.
The solution I am working on uses the `getblockfrompeer` API that was recently added to Bitcoin Core. The beauty of this is that all the sanity and validity checks are done for us by Bitcoin Core. However, this comes with some caveats (which I am discussing with the Bitcoin Core developers). One of the caveats is the reliability of the call, as @cdecker mentioned above. Another is that the call is asynchronous and only returns an empty JSON object on "maybe success", while the block may appear on our local node at some point in the future.
Fixing these may well take some time, but in the meantime using `getblockfrompeer` does not hurt, and the probability is still high that we will eventually get the block if we just retry.
In conclusion, this will be a multi-step improvement, where the first step that uses `getblockfrompeer` in an "unreliable" way may end up in the next release, while we try to fix the reliability in a second step later on.
I have not seen a "maybe success" empty JSON yet; however, calling `getblockfrompeer` and then retrying `getblock` works perfectly fine for me.
I have three VPS nodes running my script on mainnet that trigger a `getblockfrompeer` on a random peer. I played around with an optional sleep after triggering, then call `getblock` again to see if the block has been downloaded in the meantime. With a sleep of ~7s I could make all UNUSUAL entries for `getblock` in `cl.log` go away. On the lightningd side, the `getblock` call has a 10s timeout before it retries, afaik.
I have had them running for multiple days and they have downloaded >10000 blocks; so far lightningd seems happy and works fine. bitcoind handles multiple parallel downloads of the same block from different peers very well and just returns the block from whichever peer finishes delivering it first.
So, the probability is 100% with the retry mechanism, at least on my side.
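The retry mechanism described above can be sketched roughly like this. It is not the actual script; `rpc`, `RpcError`, `attempts`, `sleep` and `choose` are illustrative assumptions, and the 7 second default only mirrors the sleep value mentioned above. The RPC methods `getblock`, `getpeerinfo` and `getblockfrompeer` are the real Bitcoin Core calls.

```python
import random
import time


class RpcError(Exception):
    """Hypothetical error raised by the injected rpc callable (an assumption)."""


def get_block_with_retry(rpc, block_hash, attempts=5, delay=7.0,
                         sleep=time.sleep, choose=random.choice):
    """Try getblock; on failure trigger getblockfrompeer on a random peer,
    wait, and try again. Returns the block hex, or None after `attempts`."""
    for _ in range(attempts):
        try:
            return rpc("getblock", block_hash, 0)
        except RpcError:
            pass  # block not on disk (pruned or not yet downloaded)

        peers = rpc("getpeerinfo")  # list of dicts, each with an "id" field
        if peers:
            peer = choose(peers)
            try:
                # Only schedules the download; the result arrives later.
                rpc("getblockfrompeer", block_hash, peer["id"])
            except RpcError:
                pass  # this peer could not be asked; next round picks another

        sleep(delay)  # give the peer time to deliver the block
    return None
```

Passing `sleep` and `choose` as parameters is only there so the loop can be exercised without real waiting or randomness.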
lnd detects pruned nodes and downloads blocks itself if bitcoind has already pruned them:
https://github.com/btcsuite/btcwallet/blob/5df09dd4335865dde2a9a6d94a26a7f5779af825/chain/bitcoind_conn.go#L474
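For illustration only (this is not lnd's or btcwallet's code), one way a client could guess whether a block has been pruned is to look at the `pruned` and `pruneheight` fields that bitcoind's `getblockchaininfo` returns. The helper name and the idea of passing in the already-decoded result are assumptions of this sketch.

```python
def is_probably_pruned(chain_info, height):
    """Guess from a decoded getblockchaininfo result whether the block at
    `height` has been pruned. Hypothetical helper, a heuristic only."""
    if not chain_info.get("pruned", False):
        return False  # node is not running in pruned mode
    prune_height = chain_info.get("pruneheight")
    if prune_height is None:
        return False  # field missing, cannot tell
    # Blocks below the lowest stored height are assumed to be gone.
    return height < prune_height
```

The answer is only a hint; a `getblock` call can still fail or succeed regardless, so the caller should handle both outcomes.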
Theoretically CLN supports pruned nodes as well, but when I tried it with ~60 GB block data, it usually doesn't take long until CLN asks bitcoind for blocks that have already been pruned.
In this case, CLN just retries to get the block every second without success:
I wrote a Python script that downloads the blocks for CLN (https://github.com/martinneustein/pyBTCProxy), but I wonder if this might be useful functionality to include in CLN itself.