bcoin-org / bcoin

Javascript bitcoin library for node.js and browsers
https://bcoin.io
Other
3k stars 813 forks source link

watch-only wallet selects locked coins when creating TX #906

Open christsim opened 4 years ago

christsim commented 4 years ago

It appears that bcoin sometimes selects inputs that have already been spent.

So far this has only happened twice on our production environment. The "steps" to reproduce seem to be the following:

  1. Create an unsigned transaction
  2. Lock the inputs of the transaction
  3. Sign and then broadcast the transaction
  4. Create the 2nd transaction (this seems to be happening at the exactly the same time as the broadcast of the 1st transaction
  5. The inputs the wallet has selected are the same as the 1st transaction

Here are are log statements:

I 2019-11-07T14:52:40.722603410Z info: Creating unsigned tx with options: {"account":"btc-2000","rate":45383,"smart":false,"subtractFee":false,"subtractIndex":-1,"depth":1,"outputs":[{"address":"1JZJ...","subtractFee":false,"value":56097}],"sign":false}

I 2019-11-07T14:52:41.205263495Z info: LOCKING_INPUT|coin[btc]|walletId[btc-wallet-01]|hash[e985b0....]|index[7]

I 2019-11-07T14:56:00.727048937Z info: Creating unsigned tx with options: {"account":"btc-2000","rate":45383,"smart":false,"subtractFee":false,"subtractIndex":-1,"depth":1,"outputs":[{"address":"1DH...","subtractFee":false,"value":20460000}],"sign":false}

I 2019-11-07T14:56:00.825578544Z info: BROADCAST_TX|user[vwapikey|btc-signer-1]|accountId[btc-2000]|sendId[a773fa5b-fe1a-470d-88d7-be504aa6a7f2]

I 2019-11-07T14:56:00.858349561Z info: pending_send: btc, send, [1JZJ...](null) 56097, tag[undefined]

I 2019-11-07T14:56:01.220103939Z info: LOCKING_INPUT|coin[btc]|walletId[btc-wallet-01]|hash[e985b0...]|index[7]

As you can see with the final log statement, the same hash and index has been selected for the second transaction which eventually results in the transaction being rejected - and btw I don't see any errors from bcoin for this invalid transaction.

Also take note on the timestamps, the first transaction was created around 3 minutes before the second transaction, and the broadcast happened at the same time as the second transaction was created.

The pending_send log statement happens when the wallet receive a tx event

   wallet.on('tx', async (tx) => { ...  });

We are creating unsigned transaction using:

const {
    NodeClient,
    WalletClient
} = require('bclient');

const wallet = this.getWallet(walletId);
const tx = await wallet.createTX(options);

Where the options are something like this:

{"account":"btc-2000","rate":45383,"smart":false,"subtractFee":false,"subtractIndex":-1,"depth":1,"outputs":[{"address":"1JZ....","subtractFee":false,"value":56097}],"sign":false}

We are then locking the all the inputs (tx.inputs) of that transaction using:

await wallet.lockCoin(input.prevout.hash, input.prevout.index);
christsim commented 4 years ago

The bcoin documentation says that smart: true will select unconfirmed transactions which I don’t want.

On Sat, 09 Nov 2019 at 14:20, Matthew Zipkin notifications@github.com wrote:

Why are you setting smart: false? That affects coin selection.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/bcoin-org/bcoin/issues/906?email_source=notifications&email_token=AGX5OM4HSXPD2LBT7NR6SELQS2TJBA5CNFSM4JLEG222YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEDUEWXI#issuecomment-552094557, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGX5OMZBNILCCVMYNVMBN4TQS2TJBANCNFSM4JLEG22Q .

pinheadmz commented 4 years ago

Yeah thats correct sorry. And more specifically, smart: false excludes locked coins:

https://github.com/bcoin-org/bcoin/blob/7bc32515467d6a48d1b6473d1224d689efe907fa/lib/wallet/wallet.js#L1124-L1128

pinheadmz commented 4 years ago

Just to confirm, the signing wallet is completely separate from the watch-only wallet and the locked coins? Meaning, even in your test case, you didn't create the hot and cold wallets in the same walletDB?

pinheadmz commented 4 years ago

I'm having trouble recreating this behavior, but as you said, it's not a reliable error which makes me think it's got to be a race condition in the application layer.

I wrote this test script: https://gist.github.com/pinheadmz/aab2799420e1e0b1a59293285b6e9231

By commenting out some of the steps you can observe different outcomes. There are two cases in which the second createTX fails with "not enough funds":

  1. The only spendable coin is locked

  2. The change address from the first tx is not yet confirmed (because of smart: false)

So far I haven't been able to observe a locked coin being reused in the second create TX.

pinheadmz commented 4 years ago

I did just notice something about your logs though: you seem to have two wallets:

walletId[btc-01]|hash[e985b0....]|index[7]

walletId[btc-wallet-01]|hash[e985b0...]|index[7]

The set of locked coins is saved in memory in a txdb object, of which there is only one per wallet. Is it possible you are locking the coins only in the first wallet and not the second?

christsim commented 4 years ago

Just to confirm, the signing wallet is completely separate from the watch-only wallet and the locked coins? Meaning, even in your test case, you didn't create the hot and cold wallets in the same walletDB?

Correct. We have one wallet and one account. We create the transaction and then lock the inputs on the watch only wallet. Then we sign the transaction on another wallet which contains the private keys before bringing the signed transaction back to the watch only wallet for broadcasting.

christsim commented 4 years ago

I did just notice something about your logs though: you seem to have two wallets:

walletId[btc-01]|hash[e985b0....]|index[7]

walletId[btc-wallet-01]|hash[e985b0...]|index[7]

The set of locked coins is saved in memory in a txdb object, of which there is only one per wallet. Is it possible you are locking the coins only in the first wallet and not the second?

Sorry, my bad. Must have deleted it by mistake when i was scrubbing the transaction hash. They should both be btc-wallet-01

christsim commented 4 years ago

FYI. Just happened again. The locking of the inputs happened at ~exactly the same time as the broadcasting of the transaction previous transaction:

2019-11-22T19:03:28.787737108Z info: LOCKING_INPUT|coin[btc]|walletId[btc-wallet-01]|hash[df78f97c...]|index[0]
2019-11-22T19:03:28.710113772Z info: pending_send: btc, send, [3EuW....](null) 81000000, tag[undefined]
2019-11-22T19:03:28.660009582Z info: BROADCAST_TX|user[vwapikey|btc-signer-1]|accountId[btc-warm-2000]|sendId[035aa89b-6147-45b7-b92f-30de073c9b52]
...
2019-11-22T19:01:48.733944234Z info: LOCKING_INPUT|coin[btc]|walletId[btc-wallet-01]|hash[df78f97c...]|index[0]

Unfortunately I still can't reproduce in our test environment.

pinheadmz commented 4 years ago

So I think I see how a race condition could exist here, and actually it looks like it could happen with createTX accidentally using spent coins, the same way in this case createTX is accidentally using locked coins. The reason I think is because there is no shared lock in the wallet around the two functions that read and write coins in txdb.

When you create a TX (with smart: false) , wallet.fund() calls txdb.getCoins() and then filters them with txdb.filterUnlocked() -- with a "fund" lock:

https://github.com/bcoin-org/bcoin/blob/3623d04a9fc0ea1ad223597b2be941c07d16e0e9/lib/wallet/wallet.js#L1093-L1100

If at that same moment, a transaction is broadcast or received by the wallet, the walletDB will call txdb.insert() -- with a "write" lock:

https://github.com/bcoin-org/bcoin/blob/3623d04a9fc0ea1ad223597b2be941c07d16e0e9/lib/wallet/wallet.js#L1889-L1896

Inside txdb.insert() a bunch of stuff happens but especially these two things:

  1. coins are marked as spent and removed from the credits bucket in the DB

  2. coins are removed from the "locked coins" list

So...

Just looking at the above, I actually kinda wonder why we haven't seen a bug where wallet.fund() tries to use coins that, at that same moment, are being marked as spent by txdb.insert(). Maybe LevelDB is preventing that from happening concurrently. In which case it might make sense that we see this behavior around locked coins instead (which is just a BufferSet in memory, no special handling or wrappers).

I wonder if @nodar-chkuaselidze has any insight here... could be the solution is to add an extra lock to fund():

diff --git a/lib/wallet/wallet.js b/lib/wallet/wallet.js
index c4a4c197..bfbaeed9 100644
--- a/lib/wallet/wallet.js
+++ b/lib/wallet/wallet.js
@@ -1091,11 +1091,13 @@ class Wallet extends EventEmitter {
    */

   async fund(mtx, options, force) {
-    const unlock = await this.fundLock.lock(force);
+    const unlock1 = await this.fundLock.lock(force);
+    const unlock2 = await this.writeLock.lock();
     try {
       return await this._fund(mtx, options);
     } finally {
-      unlock();
+      unlock2();
+      unlock1();
     }
   }
christsim commented 4 years ago

I'd be happy to test this in our environment. Would it be possible to create a patch for me so I can apply it to the latest master branch, that is if you are not willing to merge this change to the master branch? We've seen around 2-3 instances of this issue a week lately, so if this works, we should know after a couple of weeks.

pinheadmz commented 4 years ago

You should be able to apply that diff directly with git apply but please, testnet only, regtest more ideally, the patch is 0% tested or reviewed :-)

christsim commented 4 years ago

Ya, will do it on testnet for a while first.

pinheadmz commented 4 years ago

There is another possibility here, related if not your exact case. I encountered this sequence in a Bitcoin Core test (paraphrasing)

utxos = self.nodes[0].listunspent()
for i in range(100):
    # create a rawtx that spends a random utxo in `utxos`
    self.nodes[0].sendrawtransaction(rawtx)
    utxos = self.nodes[0].listunspent()

Because it is not sending through the wallet, the rawtx doesn't get added to the wallet (and therefore all its coins marked as spent) until a few ticks after listunspent is called again. The result is that listunspent is unaffected by the sendrawtransaction, where the expected behavior (and that apparently of Bitcoin Core) is that the listunspent would already be updated to exclude the coins that were just spent...

christsim commented 4 years ago

So we are calling sendrawtransaction on the node, as there isn't an equivalent call on the wallet to send already signed transactions. This issue is only happening when we call broadcast and then in the next few milliseconds call createTransaction.

Are there any logs in the wallet we can look at to confirm this?

pinheadmz commented 4 years ago

If you're running the wallet as a plugin, its log output will be inline with the full node log (default ~/.bcoin/debug.log) and labeled with the wallet or wallet-http context.

Ok so your issue is with createTransaction which has the advantage that it can be fixed with the extra lock we added in #914

The extra case I found is when you aren't even using the wallet to create the transaction, but still use the wallet to get available UTXOs. If you were to run the above code snippet against bcoin, you'd need a pause or wait-for-event between sendrawtransaction and listunspent, otherwise you could end up with a double spend

christsim commented 4 years ago

Ok, cool. Hopefully #914 will sort it out. We are going to deploy it this weekend. Will report back.