Closed joshpainter closed 3 months ago
Actually I think we might be able to fix this whole mess, including retroactively for any users suffering from this now, with This One Simple Fix (bolded):
await conn.execute("DELETE FROM transaction_record WHERE confirmed=0 AND (wallet_id=? OR wallet_id=0)", (wallet_id,))
@joshpainter thanks for reporting and well done doing a deep dive into the wallet code. Indeed, what you propose with set_status is roughly what we're currently working on. Hopefully we can resolve this fairly quickly.
@joshpainter can you try https://github.com/Chia-Network/chia-blockchain/pull/14722 and see if it fixes the problem
@trepca Unfortunately it does not, but it does change things! Now, the offer files stay "pending" forever even after I've accepted them and they show as confirmed transactions. The 'dummy' records I mention above are also still created and left in the transactions table and never cleaned up. Verify with the SQL below after accepting and confirming offers:
SELECT * FROM transaction_record WHERE wallet_id = 0
I think we went backwards with this one. 😢
Hey @trepca it looks like you fixed these orphaned records in the latest 1.7.1-rc2! I no longer see the orphaned records using the above SQL after accepting offers.
However, now there is another weird problem that might be related to your fix. If I try to create a second offer that uses the same amount of assets as an outstanding pending offer, I get a screen pop-up in the UI:
I have plenty of confirmed balance of those assets broken up into lots of smaller coins so it should be able to choose a different coin than the outstanding pending offer. If I "Proceed" it doesn't seem that the new offer is created. It seems like it is keying off of the outstanding offer amounts instead of the underlying locked coins?
Let me know if I should make a new issue for this. Thanks for your work on this!
@joshpainter interesting, do you see any WARNING or ERROR entries in logs?
@joshpainter Thanks for bringing that up. The dialog is new to 1.7.1, but in the RC2 build it was a bit overzealous in prompting the user to close out possibly "conflicting" offers when in fact there is no conflict. A fix for this has been merged and will be out in the next RC build.
Thank you both, I'm now on 1.7.1 and I'm still seeing quite a bit of strangeness.
I've created a new wallet to take advantage of the new reuse_public_key_for_change setting. This seems to work well - at least we can remove a large derivation index as the culprit!
However, I still have weird issues when I try to accept multiple offers in the same block. Sometimes one of them will confirm but others will stay "pending" forever (until I cancel them). Other times it won't even let me accept an offer because I have another offer pending with the same assets - it thinks I'm trying to accept a duplicate offer, but in fact they are different offers - the amounts and assets are just the same. Is it using asset ID and asset amount as a unique key for offers somewhere maybe?!?
I no longer see the "dummy" records in the trade_records table, but I see all the other stuck offers. I've tried to manually update their statuses, or delete them entirely, but that just seems to mess up other stuff.
Are there currently any test cases around accepting multiple offers in the same block, specifically offers with the same asset ID and amounts? I think they would reveal all of these issues pretty quickly. Thanks!
Some more detail that may or may not help in tracking this down:
set_wallet_resync_on_startup and observed the new setting get applied in config.yaml. I restarted the wallet and it definitely seemed like it took longer than normal, but it did not seem to fix the "stuck" offers
automatically_add_unknown_cats to true and connect_to_unknown_peers to false.
exempt_peer_networks
DOUBLE_SPEND errors but not much else.
set_wallet_resync_on_startup would do this for me but it doesn't seem to clear offers/txns?
Same issues here. Please check your derivation index. I have found that creating offers, even if they are never executed on chain, increases the wallet's derivation index. The problem is that the wallet can reach a high derivation index even though no transactions have been made, and as soon as one offer or transaction is executed, this high derivation index is written to the blockchain.
Furthermore, at a couple thousand derivation indexes, the wallet starts getting slow. Soon after, it starts bugging out with transactions, as you mentioned. At around 100k-200k it becomes dead and the funds have to be recovered with an offline signer.
As long as no transaction has been executed at this high derivation index, you can reset the derivation index by deleting the wallet database and resyncing it (since no transactions have been made on chain). set_wallet_resync_on_startup does not help in that case as long as it keeps the derivation index, or if offers have been executed or canceled on chain, or you have sent transactions.
Closing - you may add new comments and reopen with additional details as needed
It is believed that the suggestion here will likely help resolve this: https://github.com/Chia-Network/chia-blockchain/issues/17721#issuecomment-2023345210
I don't think this is the same issue, as I've already configured reuse_public_key_for_change in config.yaml. See quote from earlier in thread:
I've created a new wallet to take advantage of the new reuse_public_key_for_change setting. This seems to work well - at least we can remove a large derivation index as the culprit!
However, I still have weird issues when I try to accept multiple offers in the same block. Sometimes one of them will confirm but others will stay "pending" forever (until I cancel them). Other times it won't even let me accept an offer because I have another offer pending with the same assets - it thinks I'm trying to accept a duplicate offer, but in fact they are different offers - the amounts and assets are just the same. Is it using asset ID and asset amount as a unique key for offers somewhere maybe?!?
I haven't attempted multiple offers of the same amounts in the same block in the latest release but will try and report back.
Summary
I've been hunting this one for months and I think I've finally cracked it - this will be long-winded but hopefully it will help others searching for similar issues.
Wut?
I've noticed for several months now that wallets inexplicably seem to get "slower" over time. The wallets that show this behavior are very active and they may create and/or accept several offers within the same block. They may be contending with other users trying to accept the same offer at the same time, as in the case with mint events. Derivation indexes of 20,000 and above are common for these larger, active wallets.
At first I thought it might be related to the derivation index. It seemed to get slower at accepting offers as the derivation index climbed. But the strange thing is that I could delete the wallet sqlite database file and let it resync, and then the wallet would be nice and fast again, even with a very high derivation index. So this meant that the derivation index itself was not the root cause!
Something must be accruing, or "leaking," over time. Let's take a look through the wallet database! I ran across some odd looking records in the transaction_record table:
What is this?!? Several transaction records with weird blank values, wallet_id of 0 (doesn't exist), etc. After lots of spelunking, I found this bit of code: https://github.com/Chia-Network/chia-blockchain/blob/f29eb44ffc14b79b9ad8ee28fb8d9de0c87514ba/chia/wallet/trade_manager.py#L769-L787
When an offer is accepted, this "dummy" txn is created. Once the offer is confirmed, it appears that these dummy txns get cleaned up. Here are the problems:
How does this affect performance?
It appears that these dummy txns are submitted to the full node over and over, resulting in DOUBLE_SPEND errors and pre_validate_spendbundle warnings in the log.
As these orphan dummies build up over time, the wallet gets slower and slower at accepting offers and sending txns because every time it does, it goes through that huge list of stuck orphan dummies and tries to submit them to the full node again. In fact, the full node will eventually try to ban the local wallet for this nonsense!
How to see if you are affected by this issue?
Run this query in your wallet db:
SELECT * FROM transaction_record WHERE wallet_id = 0
If you have no pending offers or transactions, but you still get results from this query, it means you have orphaned dummies.
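For anyone who would rather script the check than open a SQLite browser, here is a minimal sketch using Python's standard sqlite3 module. It assumes only what the query above uses (a transaction_record table with a wallet_id column). The function name count_orphaned_dummies and the db_path argument are made up for illustration; point it at your own wallet database file. The file is opened read-only so the check itself cannot change anything.

```python
import sqlite3
from pathlib import Path


def count_orphaned_dummies(db_path: str) -> int:
    """Count transaction_record rows whose wallet_id is 0.

    db_path is a hypothetical argument: the path to your wallet's
    sqlite file, wherever your installation keeps it.
    """
    # Build a file: URI and open it read-only (mode=ro).
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM transaction_record WHERE wallet_id = 0"
        ).fetchone()
        return count
    finally:
        conn.close()
```

A non-zero result while you have no pending offers or transactions would point to orphaned dummies, as described above.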
How to temporarily fix this issue?
Use this statement to clean them up manually (log out of wallet in UI/CLI first):
DELETE FROM transaction_record WHERE wallet_id = 0
The more dummy records you clean up, the faster your wallet should feel after you next start it up!
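If you want a safety net around that DELETE, the sketch below copies the database file first and then runs the same statement in a transaction. It is only an illustration: delete_orphaned_dummies, db_path and the ".bak" suffix are hypothetical names, and it assumes the wallet is fully stopped so nothing else is writing to the file.

```python
import shutil
import sqlite3


def delete_orphaned_dummies(db_path: str) -> int:
    """Back up the wallet DB file, then delete rows with wallet_id = 0.

    Returns the number of rows deleted. db_path is a hypothetical
    argument: the path to your own wallet sqlite file.
    """
    # Keep a copy next to the original in case something goes wrong.
    shutil.copy2(db_path, db_path + ".bak")

    conn = sqlite3.connect(db_path)
    try:
        # "with conn" commits on success and rolls back on an exception.
        with conn:
            cur = conn.execute(
                "DELETE FROM transaction_record WHERE wallet_id = 0"
            )
        return cur.rowcount
    finally:
        conn.close()
```

If the wallet misbehaves afterwards, stopping it again and restoring the ".bak" copy puts things back the way they were.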
I'm not sure about a longer term fix or I would have submitted a PR, but I think I'd start by just including the wallet_id when making these dummy txns instead of setting it to zero: https://github.com/Chia-Network/chia-blockchain/blob/f29eb44ffc14b79b9ad8ee28fb8d9de0c87514ba/chia/wallet/trade_manager.py#L781
This would mean that they would naturally get cleaned up by delete_unconfirmed_transactions: https://github.com/Chia-Network/chia-blockchain/blob/f29eb44ffc14b79b9ad8ee28fb8d9de0c87514ba/chia/wallet/wallet_transaction_store.py#L356
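To make the reasoning concrete, here is a small self-contained sketch of why rows with wallet_id = 0 would escape a cleanup that is scoped to a real wallet. It runs against a throwaway in-memory table with only three columns, not the real schema. The WALLET_SCOPED statement is an assumption about what a per-wallet cleanup looks like, and AMENDED is the variant with the extra "OR wallet_id=0" clause suggested in the comments. The helper names are invented for the demo.

```python
import sqlite3

# Assumed shape of a per-wallet cleanup of unconfirmed transactions.
WALLET_SCOPED = "DELETE FROM transaction_record WHERE confirmed=0 AND wallet_id=?"
# Variant from the comments that also sweeps up wallet_id = 0 rows.
AMENDED = (
    "DELETE FROM transaction_record "
    "WHERE confirmed=0 AND (wallet_id=? OR wallet_id=0)"
)


def make_db() -> sqlite3.Connection:
    """Build a tiny in-memory stand-in for the transaction_record table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transaction_record "
        "(bundle_id TEXT PRIMARY KEY, confirmed INT, wallet_id INT)"
    )
    conn.executemany(
        "INSERT INTO transaction_record VALUES (?, ?, ?)",
        [
            ("pending", 0, 1),    # real unconfirmed txn in wallet 1
            ("confirmed", 1, 1),  # real confirmed txn in wallet 1
            ("dummy", 0, 0),      # orphaned dummy with wallet_id 0
        ],
    )
    return conn


def surviving_ids(delete_sql: str, wallet_id: int) -> list:
    """Run one DELETE against a fresh table and list what is left."""
    conn = make_db()
    conn.execute(delete_sql, (wallet_id,))
    ids = [row[0] for row in conn.execute(
        "SELECT bundle_id FROM transaction_record ORDER BY bundle_id"
    )]
    conn.close()
    return ids


print(surviving_ids(WALLET_SCOPED, 1))  # the dummy row is still there
print(surviving_ids(AMENDED, 1))        # only the confirmed row remains
```

Either change (giving the dummy a real wallet_id, or widening the DELETE) leads to the same result in this toy table: the dummy no longer outlives the cleanup.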
Longer term, maybe adding some logic to trade_store.set_status so that when offers are canceled or failed, these dummy txns get cleaned up would be nice: https://github.com/Chia-Network/chia-blockchain/blob/f29eb44ffc14b79b9ad8ee28fb8d9de0c87514ba/chia/wallet/trading/trade_store.py#L177
Fin
Anyway, hopefully this detail helps this bug get fixed extra-quick, 'cause I have a sneaking suspicion that it has been one of the main causes of wallet slowness and DOUBLE_SPEND errors in logs!
Thank you for attending my New Issue Talk.