Bitcoin wallet tracker (bwt) client

notmandatory commented 4 years ago

bwt is a lightweight wallet xpub tracker and query engine for Bitcoin, implemented in Rust.

Implement block data source trait for bwt server REST API.

depends on bitcoindevkit/bdk#84

justinmoon commented 3 years ago

We'd probably want to directly connect to Bitcoin RPC. Just implement whatever bwt functionality that we need. I have a WIP PR that implements some of this: https://github.com/bitcoindevkit/bdk/pull/22. So I think this could be merged with https://github.com/bitcoindevkit/bdk/issues/79.

notmandatory commented 3 years ago

I agree, it looks like bwt wouldn't be a good fit since you need to start it up with the xpub(s) you want to track and then it only offers a REST API with similar functionality to the RPC api.

I'm closing this issue and if BWT evolves into something less statefull we can reconsider reopening it.

shesek commented 3 years ago

you need to start it up with the xpub(s) you want to track

Does having to configure this when starting up complicate things? I could make it possible to track additional ones in runtime if that helps.

offers a REST API

In addition to the HTTP REST API, bwt also offers an Electrum RPC server, which bdk could connect to using the existing Electrum backend implementation.

Alternatively, it can also be initialized and accessed directly from Rust. You can see an example of that here. The Query API is documented here. One thing that's still not easily doable with the Rust API is subscribing to real-time events, but implementing that would be pretty straightforward. (real-time updates are currently only available via the Electrum RPC subscription mechanism, via HTTP Server-Sent-Events, or using WebHooks.)

if BWT evolves into something less statefull

I'm not sure what you mean; bwt is entirely stateless and has no persistent storage of its own. All the state is managed within Bitcoin Core.

--

One thing that's missing in bwt is support for tracking script descriptors. There's an open PR for this by @justinmoon that I've been meaning to push forward but still haven't (sorry that its been taking so long justin!). Knowing that bwt might find use in bdk if it supported descriptors would definitely give me some motivation though. :-)

notmandatory commented 3 years ago

hi @shesek, thanks for the feed back, bwt is certainly a cool project so I'm happy to reopen this issue to continue the discussion on how we can work together.

What I meant by bwt being stateful is that, as I understand it, when you startup a bwt server you need to tell it which xpubs to track, and then it keeps indexes to track utxos for addresses generated from only those xpubs, and returns corresponding data stored in your full node. But with an Electrum server, or calling core RPC directly, there's no server setup, the wallet client just needs the server's URL, and you can ask it for whatever info those APIs can provide. But I see the advantages with the bwt approach, that your indexing can be much smaller and faster since your node is only tracking what your wallet cares about. It's just a different model then how existing bdk blockchain modules work.

Probably the best first step is, as you mentioned, with https://github.com/shesek/bwt/pull/37. With this PR it sounds like a basic bdk + bwt wallet would work something like this:

user generates a new bdk based wallet xprv descriptor
bdk wallet also generates and displays the corresponding xpub descriptor
user configures the bdk wallet with their bwt server URL and protocol (electrum RPC or [future] bwt REST)
user configures their bwt server with the xpub descriptor displayed by bdk wallet

Having a runtime bwt API to add new descriptors to track would make the wallet/server setup experience easier, but it probably more of a nice to have feature right now.

We can use this bdk issue to track testing a bdk wallet with bwt via electrum RPC and possibly also implementing the bwt REST client api if there's interest in that. How does that sound?

shesek commented 3 years ago

as I understand it, when you startup a bwt server you need to tell it which xpubs to track, and then it keeps indexes to track utxos for addresses generated from only those xpubs

Yes, you understand correctly (though the indexes also track transactions, not just utxos). But this is inherent if you're not keeping a full history index for all addresses, not something unique to bwt.

or calling core RPC directly, there's no server setup, the wallet client just needs the server's URL, and you can ask it for whatever info those APIs can provide

The core RPC can't give you the transaction history of arbitrary addresses, you'll have to first import the addresses and do a rescan, which is what bwt does internally. The only way that there'll be no server setup is with a full address index (which core doesn't have).

We can use this bdk issue to track testing a bdk wallet with bwt via electrum RPC and possibly also implementing the bwt REST client api if there's interest in that. How does that sound?

Sounds good! Though I probably wouldn't bother with the REST API for now if you can already get everything you need via the Electrum RPC.

shesek commented 3 years ago

I tried running bdk against bwt, and found an issue that is now fixed (bwt was returning an error from blockchain.scripthash.listunspent if the scripthash has no history, instead of returning an empty set).

With this fix in place, I was able to sync bdk as follows:

$ bitcoind -regtest -datadir=/tmp/bd
$ bwt --network regtest --bitcoind-dir /tmp/bd --xpub vpub5SLqN2bLY4Wea2jFkKNCZUzwRAsAGixRfD4PFq5NJ4zUaQBejH13Km2wQynyL7DXyuVU37cj2qb6XpBtSGsQ9tGPv6z34MZf4MXULEqbPRM
$ repl --descriptor "wpkh(tpubD6NzVbkrYhZ4YRJLsDvYBaXK6EFi9MSV34h8BAvHPzW8RtUpJpBFiL23hRnvrRUcb6Fz9eKiVG8EzZudGXYfdo5tiP8BuhrsBmFAsREPZG4/0/*)" --network regtest --server tcp://127.0.0.1:60401 sync

Note that I had to convert the xpub into its zpub equivalent to get bwt to use p2wpkh. This won't be necessary once descriptor support lands in bwt (it'll accept a --descriptor param similar to bdk's).

--

As an aside, why does bdk use blockchain.scripthash.listunspent? The electrum wallet itself never calls it (which is why I didn't catch that bug earlier), instead it determines the wallet's utxo set based on the transaction history retrieved with get_history, which seems like a more reliable way to do this. (otherwise, the wallet might get into a state where the utxo set and the tx history are out-of-sync and disagree with each other -- so for exmaple you could have a utxo without having the tx funding it in the history, or have a tx in your history that creates a utxo which isn't in your utxo set)

notmandatory commented 3 years ago

@shesek we discussed your question today at our team meeting and didn't find a particular reason the bdk electrum client is using listunspent instead of get_history to get the wallet's utxo set. I created an issue #140 to investigate it further and possibly change it to use get_history.

shesek commented 3 years ago

So I played a little more with bdk/bwt and ran into another incompatibility^*. bdk fetches the previous transactions of all the inputs used in wallet transactions (to calculate fees), while bwt provided wallet transactions only (transactions that fund or spend from wallet spks directly). I've made the necessary changes so bwt could provide non-wallet transactions too.

But this requires txindex and is incompatible with pruning. Bitcoin Core keeps around wallet transactions in its wallet db even if the block containing them was pruned, but not their parents. Leveraging the wallet db allows bwt to be pruning-compatible (as long as the addresses were imported into core before the relevant transactions got pruned), but this breaks that property.

This also makes bdk's database quite a bit larger and requires more bandwidth – at least 2x for every incoming transaction, more if they have >1 input. A batched withdrawal tx from an exchange could easily be 200x.

For bwt, I reached the conclusion that fees are only really interesting for mempool transactions, which allowed me to sidestep the complications of figuring out fees with pruning entirely (for mempool transactions where all the inputs are in the utxo set, its easily available via bitcoind's getmempoolentry).

Could this be reasonable for bdk too? The electrum rpc protocol, bwt's api and esplora's api all provide fee info (in the case of the first two, only for mempool transactions. esplora provides it for confirmed transactions too) which bdk could use.

The downsides is that the fee information is less reliable (requires trusting the server, which may be acceptable, especially if its a bitcoind/bwt server operated by the user himself), and that the parent transactions might be necessary later for PSBT (though maybe it could fetch that in real time when needed, instead of keeping everything in persistent storage?).

* The reason I didn't run into this in my previous test was that I was using the default bitcoind wallet, which was also used to send funds to the tpubD6NzVb xpub, and therefore had the parent transactions (the mining reward coinbase in this case) in its wallet db from the sender's side.

afilini commented 3 years ago

There's actually a flag in bdk which I think is called AccurateFees that a Blockchain backend can advertise if it can compute the fees of all the incoming/outgoing transactions. One option could be to try and fetch the inputs, and if the electrum server (in this case bwt) throws an error we could just skip trying to compute the fees and not advertise that flag.

AccurateFees is not really a great name, but as always I suck at naming things, so if you have better ideas feel free to propose them.

shesek commented 3 years ago

One option could be to try and fetch the inputs, and if the electrum server (in this case bwt) throws an error we could just skip trying to compute the fees and not advertise that flag.

Wouldn't that be better served by not having the GetAnyTx capability when the backend is a pruned or txindex-less bitcoind node, instead of trying and failing?

There's actually a flag in bdk which I think is called AccurateFees that a Blockchain backend can advertise if it can compute the fees of all the incoming/outgoing transactions.

I would consider separating AccurateFees into something like MempoolFees and HistoricalFees.

A BitcoindBlockchain backend would only support MempoolFees[0], but might also have GetAnyTx which allows to determine fees by explicitly fetching parent transactions.

Same goes for the ElectrumBlockchain backend, which is typically MempoolFees+GetAnyTx[1], but might not have GetAnyTx if its a personal eps/bwt server backed by a pruned/txindexless node.

The EsploraBlockchain backend would be HistoricalFees+GetAnyTx.

[0] technically bitcoind does provide the fee amount for outgoing wallet transactions in its wallet db, but not for incoming ones. And its not entirely reliable, because transactions with a mixture of wallet and non-wallet inputs result in incorrect fee amounts.

[1] the electrum rpc protocol doesn't include fee for confirmed transactions.

afilini commented 3 years ago

Wouldn't that be better served by not having the GetAnyTx capability when the backend is a pruned or txindex-less bitcoind node, instead of trying and failing?

Yes, my point is that it's not always trivial to detect whether a backend supports some capabilities. For instance, if I connect to an Electrum server I don't know if it can fetch any tx i want and the only way to find out is probably to try and see if it fails. Maybe we could try with some specific historical txs (like the coinbase of the genesis block), that we know are present in the blockchain and if we get a "not found" we know that the server is a "personal server", like bwt or EPS.

This could also be part of the "constructor" of the blockchain backend, so that we don't have to wait for a first sync() before we actually find out what our backend can do.

With bitcoind it's easier because if you have access to the RPC you can make a call to determine whether the node is pruned or not.

I would consider separating AccurateFees into something like MempoolFees and HistoricalFees.

Makes sense, I like this idea.

shesek commented 3 years ago

Yes, my point is that it's not always trivial to detect whether a backend supports some capabilities.

In my mind this was a user configuration passed in when starting bdk up, maybe with something like --server electrum+tcp://127.0.0.1:60401/?txindex=0.

But yes, detecting this automatically would make things easier for users, so this might be a better option.

With bitcoind it's easier because if you have access to the RPC you can make a call to determine whether the node is pruned or not.

It'll need to check for txindex, which I think isn't easily available via the rpc and also requires trying something that needs it and seeing if that fails.

shesek commented 3 years ago

I have a branch for bwt with descriptor-based tracking. It now accepts a --descriptor argument, and uses descriptors internally when the user provides an xpub/ypub/zpub via --xpub.

This is how the descriptor-powered api looks like: https://gist.github.com/shesek/bf357a55c968635d34adfd171d0c65b1

Implementing a blockchain backend for bdk based on bwt's http/rust api would be a breeze compared to the electrum backend, but since the electrum backend implementation already exists its probably easier to just reuse that.

--

As an aside, this commit might be of interest to you - a 60x optimization for deriving child addresses from simple p2*pkh ranged descriptors, by special-casing them and using the underlying ExtendedPubKey directly instead. -- there's an easier fix for this

shesek commented 3 years ago

There are a few dependencies of bwt that I could quite easily make optional. chrono, dotenv, pretty_env_logger and structopt are all only really needed for CLI use, I could put them behind a cli flag. hex I can just implement locally. and dirs is used for auto-detecting the .bitcoin directory location, which isn't strictly necessary.

Edit: Created a tracking issue for this at https://github.com/shesek/bwt/issues/61

Update: structopt, dotenv, dirs, pretty_env_logger and signal-hook can now be avoided when using bwt as a library. hex was removed entirely.

The remaining non-optional ones that bdk doesn't also depend on are bitcoincore-rpc, anyhow, thiserror, lazy_static and chrono. Making the last two optional too is TODO.

RCasatta commented 3 years ago

As an aside, this commit might be of interest to you - a 60x optimization for deriving child addresses from simple p2*pkh ranged descriptors, by special-casing them and using the underlying ExtendedPubKey directly instead.

I think soon rust-miniscript will be faster, by passing borrowers secp context around https://github.com/rust-bitcoin/rust-miniscript/issues/163

shesek commented 3 years ago

@RCasatta Yes, you're entirely correct. Sanket just pointed this out to me yesterday. With a fix for this in place, the address generation times goes down to be barely noticeable compared to plain xpub derivation, maybe 0.1 extra seconds or so for 1500 addresses. Which makes my special-cased p2*pkh optimization basically useless :)

bitcoindevkit / bdk

Bitcoin wallet tracker (bwt) client #88