bitcoinj / bitcoinj

A library for working with Bitcoin
https://bitcoinj.org
Apache License 2.0
4.95k stars 2.47k forks source link

P2WPKH bug with bloom filter update #2070

Closed oscarguindzberg closed 3 years ago

oscarguindzberg commented 3 years ago

P2WPKH wallets seem to have a problem updating bitcoin core bloom filter: filter is not updated when it should. P2PKH wallets work fine.

See https://github.com/oscarguindzberg/bitcoinj/commit/192571be6136fc6175575318836ea5fa838f8d7f The P2PKH test passes but the P2WPKH test fails. The test creates a wallet, sets lookahead to 2, receives a tx and sends 9 txs from the wallet. There seems to be a race condition on the test, so it is a bit shaky. Try adjusting Thread.sleep() parameter if you don't get the result expressed above.

Let's focus on the P2WPKH case...

When a tx is added to the wallet in the test, a filter recalculation is triggered. During debugging I noted PeerGroup's inFlightRecalculations had 2 elements, one for FilterRecalculateMode.SEND_IF_CHANGED and the other for FilterRecalculateMode.DONT_SEND. I think the problem is FilterRecalculateMode.DONT_SEND's bloomFilterMerger.calculate() finishes first, updates the bloom filter but does not sends the new filter to bitcoin core. When FilterRecalculateMode.DONT_SEND's bloomFilterMerger.calculate() is invoked the filter is already updated so result.changed is set to false and no filter update is sent to bitcoin core either.

I think this line of code might cause the bug: https://github.com/bitcoinj/bitcoinj/blob/master/core/src/main/java/org/bitcoinj/core/PeerGroup.java#L207

Context: I am chasing a bug in a system that uses a forked bitcoinj based on v0.15.8. "Sometimes" the wallet is not notified that a tx spending funds from the wallet was included in a block. I could not isolate/reproduce the bug yet. In the process, I could isolate the bug described above. I am still not sure whether both bugs are related.

oscarguindzberg commented 3 years ago

I think the dropPeersAfterBroadcast feature made the test unstable. I just disabled the feature with this commit: https://github.com/oscarguindzberg/bitcoinj/commit/fde1895c54c394b5ffc4c44982888d81771e3fdb

schildbach commented 3 years ago

Thanks for investigating into this! Discovering that trace of a race condition is "good news" to me.

I wonder if we really need the DONT_SEND update mode. The JavaDoc says: "In case (2), we don't and shouldn't, we should just recalculate and cache the new filter for next time." but fails to explain why we shouldn't. I assume this is for privacy reasons – update the filter as rarely as possible? Are there other reasons? Could you try your test with DONT_SEND changed to SEND_IF_CHANGED? (We could perhaps also try to serialize filter calculations, but this seems like an intrusive change.)

Regarding the dropPeersAfterBroadcast, have you not been able to switch the boolean?

oscarguindzberg commented 3 years ago

If I modify https://github.com/bitcoinj/bitcoinj/blob/master/core/src/main/java/org/bitcoinj/core/PeerGroup.java#L207 to use SEND_IF_CHANGED my test passes. That change breaks PeerGroupTest.testBloomOnP2PK(). The test mentions bug 513 which in fact is https://github.com/bitcoinj/bitcoinj/issues/879

To disable dropPeersAfterBroadcast I have to comment out the code because Wallet.sendCoins() uses broadcaster.broadcastTransaction(tx) which sets dropPeersAfterBroadcast to true.

oscarguindzberg commented 3 years ago

Posting here links to related commits and issues: https://github.com/bitcoinj/bitcoinj/commit/2288c2150f0a8d15ea9be072c6da3265677a99d2 https://github.com/bitcoinj/bitcoinj/issues/1690 https://github.com/bitcoinj/bitcoinj/commit/b8a197642238af41ef853ef35e2744b36e33f6e2 https://github.com/bitcoinj/bitcoinj/commit/fff5af29ff8666e939e72ed68ce7b23afca2c22f

oscarguindzberg commented 3 years ago

@chimp1984 Even if you use legacy addresses to receive funds in Bisq version 1.4.2 you still use segwit addresses for change. I guess you can reproduce the problem because you are spending those change segwit outputs.

chimp1984 commented 3 years ago

@chimp1984 Even if you use legacy addresses to receive funds in Bisq version 1.4.2 you still use segwit addresses for change. I guess you can reproduce the problem because you are spending those change segwit outputs.

Yes, as you pointed that out to me in keybase I deleted my post to not pollute that thread here... Now after a reply I think I should not have deleted it as context to your reply is missing... ;-)

schildbach commented 3 years ago

I went through this issue again and all the related links. Still I could not find a reason for why we can't use SEND_IF_CHANGED all the time. The failing testBloomOnP2PK() can surely be adapted.

My thinking is: An updated filter can only contain more items than before. Worst thing that can happen is a false positive rate greater than what is desirable.

What do you think?

oscarguindzberg commented 3 years ago

I am not really an expert on bloom filters. I would trust your criteria here.

oscarguindzberg commented 3 years ago

@schildbach I found something.

Given this scenario:

A wallet contains 2 txs:

Both txs are already included in blocks.

bitcoinj is started using recover from seed words. When the block containing tx1 is received, Wallet.queueOnCoinsReceived() will be invoked https://github.com/bitcoinj/bitcoinj/blob/master/core/src/main/java/org/bitcoinj/wallet/Wallet.java#L2220 . Then this line will be invoked https://github.com/bitcoinj/bitcoinj/blob/master/core/src/main/java/org/bitcoinj/core/PeerGroup.java#L205 When the block containing tx2 is received, Wallet.queueOnCoinsReceived() won't be invoked because the tx is spending from the wallet. See https://github.com/bitcoinj/bitcoinj/blob/master/core/src/main/java/org/bitcoinj/wallet/Wallet.java#L2222 . So https://github.com/bitcoinj/bitcoinj/blob/master/core/src/main/java/org/bitcoinj/core/PeerGroup.java#L205 won't be invoked.

Summary: the bloom filter won't be updated to include tx2 change outpoint.

So...

why https://github.com/bitcoinj/bitcoinj/blob/master/core/src/main/java/org/bitcoinj/wallet/Wallet.java#L2220-L2222 calls either queueOnCoinsReceived() or queueOnCoinsSent() while https://github.com/bitcoinj/bitcoinj/blob/master/core/src/main/java/org/bitcoinj/wallet/Wallet.java#L2684-L2689 may call both of them?

Bugfix option 1:

Bugfix option 2:

schildbach commented 3 years ago

Great catch! The mismatch in sent/received logic seems like a bug to me.

I think the way to fix this depends on how we want WalletCoinsReceivedEventListener to behave. Its documentation seems to suggest that change shouldn't trigger a "received event" – it's meant for "really received coins". However the lines you discovered (https://github.com/bitcoinj/bitcoinj/blob/master/core/src/main/java/org/bitcoinj/wallet/Wallet.java#L2684-L2689) seem to violate this. So your option 2 should imho include fixing the violation in the first place.

And consequently, your option 1 should include adapting the listener documentation to the new behaviour, and for the clients to deal with the changed behaviour. Due to the logic at https://github.com/bitcoinj/bitcoinj/blob/master/core/src/main/java/org/bitcoinj/wallet/Wallet.java#L2684-L2689, I suspect clients have implemented a guard anyway (Bitcoin Wallet already checks for amount.isPositive() before notifying to the user, so probably no change needed).

schildbach commented 3 years ago

Regarding my preference, I'd say option 2 is less risk, option 1 (if it fixes the logic mismatch) would be more correct.

oscarguindzberg commented 3 years ago

@schildbach

I think I found a related bug:

Given this scenario:

A wallet contains 3 P2WPKH txs:

tx1 sends funds to the wallet tx2 spends tx1 and sends change back to the wallet. tx3 spends tx2 change ouput and sends NO change back to the wallet.

tx1 and tx2 are confirmed tx3 is pending.

I assume at this point Wallet.myUnspents set is empty.

If the bloom filter is regenerated Wallet.isTxOutputBloomFilterable() will return false for every output because myUnspents.contains(out) will always return false.

So, tx2 change output won't be included in the filter.

When tx3 is included in a block, bitcoin core won't include it in the filtered block it sends to bitcoinj, because it does not match the filter.

Then, bitcoinj will consider tx3 pending when it is confirmed.

A solution for this would be to fix Wallet.isTxOutputBloomFilterable(): return (isScriptTypeSupported && out.isMine(this)) || watchedScripts.contains(script);

I see a problem with this solution for wallets having lots of txs: Very old outpoints that were spent a long time ago are going to be part of the filter forever. I don't know if this a big problem since the same already happens with very old keys.

oscarguindzberg commented 3 years ago

btw, do you know why bitcoin-core does not check the witness field to see if it matches the filter?

schildbach commented 3 years ago

btw, do you know why bitcoin-core does not check the witness field to see if it matches the filter?

The short answer: Nobody cared for bloom filters when SegWit was developed.

The long answer: I suspect since witnesses are meant to be pruned from the blockchain at some point, it would not be a good idea to include them. Pruned nodes could then not match against a filter any more. I think just including the script to be spent (for outputs) and the script that was spent (for inputs), and nothing else (no pubkeys, no outpoints) is a much cleaner way to make bloom filters fit for future (not just SegWit).

schildbach commented 3 years ago

tx1 sends funds to the wallet tx2 spends tx1 and sends change back to the wallet. tx3 spends tx2 change ouput and sends NO change back to the wallet. […]

Does this happen in reality?

I think keeping old outpoints in the filter is not that much of a problem (it will get dirty sooner though). However, I fear TransactionOutput.isMine() isn't very performant.

schildbach commented 3 years ago

…then again, bloom filter recalculations should be kind of rare.

oscarguindzberg commented 3 years ago

tx1 sends funds to the wallet tx2 spends tx1 and sends change back to the wallet. tx3 spends tx2 change ouput and sends NO change back to the wallet. […]

Does this happen in reality?

Bisq has a use case where it picks an output from the wallet and spends it to an address not in the wallet. There were lots of "confirmed tx shown as pending" in bisq since it was migrated to segwit. I think the root cause is this bug. SendRequest.emptyWallet() would be another use case.

I think keeping old outpoints in the filter is not that much of a problem (it will get dirty sooner though). However, I fear TransactionOutput.isMine() isn't very performant.

Is there a better alternative?

schildbach commented 3 years ago

Perhaps more caching, like the myUnspents is kind of a cache? But we can leave that for later, I just wanted to point out that there is a risk of performance regression.

oscarguindzberg commented 3 years ago

…then again, bloom filter recalculations should be kind of rare.

Given WalletCoinsReceivedEventListener in PeerGroup it is going to be recalculated every time a tx is received. If restore from seed is used for a wallet with lots of txs, then TransactionOutput.isMine() will be called a lot in a short period of time. I don't know if the impact on performance is going to be huge or insignificant.

schildbach commented 3 years ago

Ok, let's see. The current myUnspent.contains() is a simple hash lookup. isMine() descends into all of the keychains, and also performs a hash lookup on them (LinkedHashMap.get()). At the end of the day, perhaps not that much different.

The replay case should be rare, you need it only if you've lost your phone, or to recover from a bug, like the one discussed here. How many transactions do your user's wallets typically contain?

oscarguindzberg commented 3 years ago

How many transactions do your user's wallets typically contain? @chimp1984 ?

schildbach commented 3 years ago

I'm not sure, but theoretically it should be possible to not always re-create the bloom filter from scratch, but instead slowly append to it. Then we could probably also get rid of the concurrency – just add to the filter on the same thread that adds data to the wallet.

Maybe it was not done this way because then the filter becomes dirty much sooner? I'm no expert on bloom filters either, which is why I never tried to tackle this.

oscarguindzberg commented 3 years ago

I will apply the bugfixes discussed above on bisq's bitcoinj fork. Once it is tested in production we can resume this conversation.

chimp1984 commented 3 years ago

How many transactions do your user's wallets typically contain? @chimp1984 ?

Very hard to say....but power users who have used the app for more than a year might have quite a lot. At least old heavy wallets have impact on overall performance as far observed.

schildbach commented 3 years ago

@oscarguindzberg I'm curious if you applied the fixes and if there's already a visible change?

oscarguindzberg commented 3 years ago

I applied these changes on a bitcoinj fork: https://github.com/bisq-network/bitcoinj/commit/1a786a2d1aaf18a19f11032bde9a6b6382b1a18c https://github.com/bisq-network/bitcoinj/commit/bc8d2e791aa073856d83cbac16d4e8f3f7b7f71a There were lots of reports of "tx is confirmed but I see it as not confirmed in the app". After the fix, there were no new reports of that bug. I will create specific issues/PRs when I find some spare time.

schildbach commented 3 years ago

If you don't mind I will start merging your changes? Or do you want to send a PR?

oscarguindzberg commented 3 years ago

sorry for the delay. yes, please, go ahead and merge the changes.

schildbach commented 3 years ago

Ok, merged as 78551cbad2cc11b0c1f84248f5cffe71e50a3fe6 and d32dbf8fb0e111322a09d7f49072f7ba0aed69cb. I'll backport them to 0.15 soon.

schildbach commented 3 years ago

Backported to 0.15.10, just released.

I also did quite a lot of manual testing, while I was still able to reproduce this issue on 0.15.9 it didn't occur on 0.15.10 any more.

Thanks again Oscar! I'm very happy that this bug finally seems to be at rest.