JoinMarket-Org / joinmarket-clientserver

Bitcoin CoinJoin implementation with incentive structure to convince people to take part
GNU General Public License v3.0
725 stars 177 forks source link

Reused cj-outs and missing change outs for unconfirmed transactions #1465

Open freddiemacd opened 1 year ago

freddiemacd commented 1 year ago

I run a maker and noticed that my wallet balance was not correct. At first I thought it might be that I messed up my bitcoin core. I ran -reindex and when I checked the wallet balance it was correct. About an hour later I reran wallet-tool.py and the balance was off again. Because they were run sequentially, I was quickly able to find that two unconfirmed change-outs were the missing coins and found them on a block explorer with the correct balance and still unconfirmed.

I currently have 6 reused addresses in my wallet, most of them but not all currently have the last transaction as unconfirmed.

This leads me to suspect that my node is dropping these unconfirmed transactions. I have restarted my yield generator a number of times in the last few days. Perhaps when I do that, it forgets what transactions are pending when I restart the yield generator and if my mempool has dropped them, then it is not aware and it tries to use the same address for the next conjoin, hence causing address reuse.

Does this sound right?

If so, I expect all my coins to show up properly when they eventually get confirmed but that might be a while since the taker set the fee at 1 sat/vB.

freddiemacd commented 1 year ago

I increased the mempool to 500 and now all the coins show up. I guess these 1 sat/vB transactions got kicked out of my mempool. But, we have had mempool congestion before and I never had reused addresses then.

What are the thoughts on what I should do about these reused addresses? Should I unfreeze them and let the yg use them as usual?

kristapsk commented 1 year ago

I noticed recently different issue caused by low feerate transaction being purged from the mempool, probably it's related.

1) you participate in coinjoin as a maker; 2) check wallet UTXOs with wallet-tool.py, outputs of that transaction are shown as "PENDING", old TXOs are not shown (which is correct); 3) transaction gets purged from local mempool; 4) check wallet with wallet-tool.py again, now neither old TXOs nor new UTXOs are shown and total wallet balance is incorrect, user might think he has lost bitcoins.

freddiemacd commented 1 year ago

I noticed recently different issue caused by low feerate transaction being purged from the mempool, probably it's related.

  1. you participate in coinjoin as a maker;
  2. check wallet UTXOs with wallet-tool.py, outputs of that transaction are shown as "PENDING", old TXOs are not shown (which is correct);
  3. transaction gets purged from local mempool;
  4. check wallet with wallet-tool.py again, now neither old TXOs nor new UTXOs are shown and total wallet balance is incorrect, user might think he has lost bitcoins.

I think that is exactly what is happening here and because they are dropped and JM apparently does not know, it reuses the receive address and if the new join has now paid enough of a fee for it to be confirmed then it ends up finding the unconfirmed previous UTXO and correctly labels it as a reused address.

If that is indeed what is happening, perhaps JM needs to store unconfirmed transactions. I feel like the reason this has happened to me and maybe not others is that I have restarted my maker bot multiple times in the last week. Maybe when the bot is up and running it knows what the unconfirmed transactions are but when the bot is stopped, it does not know. Also to be clear in case it is important, my bot stopped and needed to be restarted multiple times because my bitcoind was stopping which forced the close in JM.

AdamISZ commented 1 year ago

@freddiemacd

I ran -reindex

Are you sure you don't mean rescan or much better rescanblockchain? I don't think reindex is relevant here.

when I restart the yield generator and if my mempool has dropped them, then it is not aware and it tries to use the same address for the next conjoin, hence causing address reuse.

it reuses the receive address

This is the part that is striking to me: the Joinmarket wallet always persists the indices of the addresses that are used, at the exact point when they are first sourced. This is so that even if we have a process crash, we don't reuse addresses.

One way I can see it happening is if the wallet file is deleted, and you recover from seedphrase, and then you don't have a proper rescan of the backend blockchain, to find your in-wallet txes. In this case, the wallet will simply have no information telling it that certain addresses are used, and it will reuse them - but this is unlikely, as it would require the user to ignore that the wallet they've just recovered, is incorrectly showing zero balance.

Can you expand on this at all? I think the bug of 'forgetting' the old txos that actually still exist, when a spending tx gets purged from the mempool, is separate from this.

freddiemacd commented 1 year ago

I'll give some more details about the event even if they may not be relevant in case you think they are relevant.

About 2 weeks ago, I was looking into evaluating utxo data from the chain state. So as not to corrupt it, I was copying the chainstate to another file and running my code on that copied file. However, I was copying without stoping bitcoind. Because I noticed this address reuse problem and the wrong balance, I wondered if I somehow screwed up my chain state since bitcoind was running when I did the copying. I know it did not make sense but I decided to run bitcoind -reindex to be sure there was not a problem with the chainstate. When it completed, I ran wallet-tool.py and all my balances were correct.

Also over this 2 week time frame, I used bitcoin-cli stop a few times while my maker bot was running which of course caused it to crash and stop. That made me wonder if it was involved in the address reuse problem.

About an hour after reindexing and seeing the correct balance, I rechecked wallet-tool.py and it was off again and then I noticed that a couple change out addresses were dropped that explained the discrepancy. I figured that part of the problem could be that my mempool was filled up so I changed my bitcoin.conf mempool up to 500. Once I did that, the wallet balance was now correct.

I never recovered my wallet. I've been using it a long time.

AdamISZ commented 1 year ago

I think it's fair to say that the main thing we're observing is this behaviour:

I myself have recently experienced confusion with a transaction where it's showing up in my wallet in the correct PENDING state (i.e. unconf), but some blockchain explorers don't show the tx at all, while others do. (This was not a coinjoin btw, a direct payment) Until I see it ejected from my local mempool I won't be able to confirm the bug behaviour myself.

@freddiemacd thanks for the detailed explanation. At this point I can only speculate, but if somehow bitcoin core's internal wallet, that we reference in Joinmarket, somehow 'blanked out' address usages (in particular as reported by listaddressgroupings), then maybe we could somehow get an address reuse, but to emphasise again, we really shouldn't, because we increment the index of the HD branch at the point of request of the address, so if, as you say, the joinmarket wallet file is not deleted/changed, that index is persisted (iirc this should still be true even if you do use the --recoversync flag). It seems, there must be some error in that last sentence, but I don't know what.

kristapsk commented 1 year ago

Until I see it ejected from my local mempool I won't be able to confirm the bug behaviour myself.

You could lower the default 300 MB mempool limit and restart bitcoind maybe?

AdamISZ commented 1 year ago

Until I see it ejected from my local mempool I won't be able to confirm the bug behaviour myself.

You could lower the default 300 MB mempool limit and restart bitcoind maybe?

Oh; it never occurred to me to use mainnet! But weirdly it might be better, since generating other txs on regtest would be annoying and probably also on signet. I think you have a point.

AdamISZ commented 1 year ago

Further to that, there is a maxmempool bitcoind setting but I found that it is read as an integer, so the minimum is 1MB, otherwise I would use e.g. regtest and set it to like 1kB and just have the tx kicked out with 1 or 2 others present, or something like that.

The main reason I really need to test like this and not just mock up the scenario, is I don't know how the RPC commands like listtransactions results change in the scenario of 'tx got kicked out of mempool'. That's the main thing I need to know.

kristapsk commented 1 year ago

I noticed recently different issue caused by low feerate transaction being purged from the mempool, probably it's related.

  1. you participate in coinjoin as a maker;
  2. check wallet UTXOs with wallet-tool.py, outputs of that transaction are shown as "PENDING", old TXOs are not shown (which is correct);
  3. transaction gets purged from local mempool;
  4. check wallet with wallet-tool.py again, now neither old TXOs nor new UTXOs are shown and total wallet balance is incorrect, user might think he has lost bitcoins.

Manual bitcoin-cli rescanblockchain solves this issue for me, at least temporary. Also tx that got purged from mempool before is now again in a mempool, but JM didn't detect that automatically, but rescanblockchain somehow solved that.

AdamISZ commented 1 year ago

Taking a step back from trying to construct worthwhile test scenarios (which I'm finding extremely difficult):

See:

https://github.com/bitcoin/bitcoin/blob/369d4c03b7084de967576759545ba36a17fc18bb/src/wallet/rpc/coins.cpp#L641

which calls:

https://github.com/bitcoin/bitcoin/blob/381593c906fb54340dbc2377e2ebc6fb37582d08/src/wallet/spend.cpp#L233-L235

... so listunspent will not return (unconfirmed) utxos that don't exist in-mempool.

So we have a scenario that: listtransactions will not show the spent utxo in the creating transaction as unspent (we have a transaction in-wallet which spends it), while the Joinmarket method sync_unspent (which calls RPC listunspent) will not show the created utxo since it's not in-mempool.

AdamISZ commented 1 year ago

Writing a few more notes to summarize if someone else is trying to understand this:

Joinmarket builds its list of utxos when sync_wallet is called (which calls sync_unspent). This can be called repeatedly in certain cases (depending on script/app running), but it's only called at the start.

After that, the in-memory utxo database (see jmclient.wallet.UTXOManager) is updated with deltas/diffs, only: the core loop of calls to listtransactions inside transaction_monitor in jmclient.wallet_service.WalletService. These notify our BaseWallet object via calls to jmclient.wallet.BaseWallet.process_new_tx which either adds or removes a utxo as appropriate.

So the problem we have here is that, as described in the preceding comment, the sync_unspent part will not see an unconfirmed utxo, if it is not in the mempool. That means that even though listunspent_args has [0] (usually), i.e. it defaults to showing funds as present even if unconfirmed, they will not be shown, if that utxo has been ejected from the mempool. Meanwhile the tx that created that utxo is processed, so the old utxo is (correctly) removed. So you get a missing balance.

(AFAIK, and please correct me if I'm wrong, the bitcoind wallet will still keep, periodically, trying to broadcast the tx, including placing in our own mempool(?) ... I am very unclear about that part).

A simple hacky test confirms what's probably obvious, i.e. if you filter out a utxo (using its amount as a watermark) and prevent listunspent from returning it, then you get the expected behaviour: balance does not show that utxo, and will not until the process is restarted (note the above: sync_unspent is only called at the start, not repeatedly).

So assuming that mental model is correct, what can be done? I think that for any long running script or application, we could at least address it by: having a recurring call to listunspent, so that we re-see utxos that we earlier lost. I don't think it's trivial, but it may be the best choice. Meanwhile, even if that is done correctly, it doesn't entirely solve the issue, as missing balances will still be shown during periods where listunspent doesn't return that utxo.

Finally, I should note that the title of this thread talks about reuse and I must reiterate that that is a far more serious problem but I cannot see how it would occur in relation to this mempool ejection problem, and the original report involves doing other stuff so I cannot say anything other than "that definitely shouldn't happen". It's unfortunate that this thread conflates the two; the "ejection from mempool" issue is serious and I am spending a lot of time trying to find a solution, but "reuse" is an entirely different question.

kristapsk commented 1 year ago

I have different idea, that could also help in context of #1466. How about all peers involved in coinjoin save raw tx in their wallets (.jmdat) until it is either included in block or conflicting transaction is included in block and treat it as pending (unconfirmed) transaction no matter what local Bitcoin Core mempool says?

AdamISZ commented 1 year ago

I have different idea, that could also help in context of https://github.com/JoinMarket-Org/joinmarket-clientserver/issues/1466. How about all peers involved in coinjoin save raw tx in their wallets (.jmdat) until it is either included in block or conflicting transaction is included in block and treat it as pending (unconfirmed) transaction no matter what local Bitcoin Core mempool says?

Yes, it's interesting, I had a think about your suggestion a couple of days ago but I really wasn't too sure.

My feeling is that this isn't quite clear enough to be worth doing. The basic idea is: txid A is known to us, has not yet been confirmed, we persist it. If txid A does not show up after calling sync_unspent we know that the problem exists and can do something.

But now we arrive at a subtlety: is the reason for txid being in this state, temporary? What about edge cases of 1 sat/vbyte; does our node keep trying or does it essentially just abandon that after some time? What about conflicts, i.e. the transaction goes from "dormant" to "conflicted, can never be broadcast" during this time? Can Bitcoin Core report at all that a specific txid that it knows in its own wallet, is in a certain state "this txid will be broadcast until accepted" vs "this txid is never getting broadcast, we are dropping it"? I suspect not (and maybe, it cannot).

(The various tx states in the Core codebase may be relevant: https://github.com/bitcoin/bitcoin/blob/381593c906fb54340dbc2377e2ebc6fb37582d08/src/wallet/transaction.h#L66 )

(also there are constants in the codebase defining how long we keep trying to rebroadcast before giving up, but right now I can't find them).

The central idea is to have the backend Core node be the source of truth about bitcoin transaction state, and the divergence from that is going to be very problematic (maybe only in edge cases, but all the same ... I think the principle probably has to be kept).

harding commented 1 year ago

Are you perhaps encountering https://github.com/bitcoin/bitcoin/issues/11887 ?

Edit: that issue is not clear about all of the impacts, but here's some conversation about it that might clarify:

[6:05:57 pm] <BlueMatt[m]> huh, looks like if a transaction gets removed 
from the mempool bitcoin core just thinks your money disappeared?
getbalance shows a reduction and listunspent shows neither the
removed-from-mempool transaction nor the previous one.
[6:06:10 pm] <BlueMatt[m]> i know I'm supposed to abandon but that's a 
highly surprising and confusing result

[8:02:48 pm] <achow101> BlueMatt[m]: pretty sure there's an ancient open 
issue talking about that, although I can't be bothered to find it right
now

[8:03:33 pm] <BlueMatt[m]> probably, i vaguely recall it, but now that 
fees are high again it smacked me in the face and I went "wow, that's
confusing af" and only didnt panic because I recalled this behavior :p
[8:03:39 pm] <BlueMatt[m]> any other sensible user would have panicked :)

[12:26:56 am] <instagibbs> 
https://github.com/bitcoin/bitcoin/issues/11887 the open issue for it,
though I know it's been around since at least 2013 :P

[12:29:11 am] <willcl_ark> There was also this issue which appeared to 
be user tx dropping out of the mempool:
https://github.com/bitcoin/bitcoin/issues/21768
PulpCattel commented 1 year ago

@harding Thanks for chiming in!

Yes, from my high level overview that's the case. We've indeed seen the recent, brief discussion in the IRC log you mention, and there have been some more discussion in the JM community telegram chat.

Some users have reported that increasing mempool max size indeed fixed the balance mismatch for them, and at least one other has reported that the "abandontransaction" command also worked. I imagine this last option is the preferred one? Are there other relevant recommendation we could give users?

harding commented 1 year ago

abandontransaction is the preferred way to deal with the problem. Larger mempools do avoid the situation but at the cost of using more of the user's resources, so that's not ideal.

I would strongly recommending adding a comment to the above-linked issue to describe how the problem is affecting your users so that Bitcoin Core developers are aware that it's not just direct wallet users that are being affected.

AdamISZ commented 1 year ago

I would strongly recommending adding a comment to the above-linked issue to describe how the problem is affecting your users so that Bitcoin Core developers are aware that it's not just direct wallet users that are being affected.

Good suggestion, done.