novitski / bitcoinj

Automatically exported from code.google.com/p/bitcoinj
Apache License 2.0
0 stars 0 forks source link

Missing transactions because of non-atomic saving #562

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I thought we have filed the issue already, but could not find it any more. This 
is on bitcoinj 0.11.2.

Symptom: The app is torn down immediately after receiving a transaction (via 
block). This causes the saving to not happen, but on next start blockchain 
download continues from a point after the transaction. So the transaction is 
missed.

Note this issue is *not* about why the app is torn down. It can happen any time 
with Android, and any other device that needs electricity.

Original issue reported on code.google.com by andreas....@gmail.com on 6 Jun 2014 at 5:07

GoogleCodeExporter commented 9 years ago
See attached log. Here is the relevant snippet. Note the "wallet/chain height 
mismatch". Also, the user has a lot of failed wallet saves in his directory 
(usually 0 bytes).

15:12:58.466 [NioClientManager] Wallet - Received tx for 0.011819 BTC: 
8d9ce3a222f9403750311348d84bcdb0740b52866871e56fa26cf7227ced0e1f [0] in block 
000000000000000016bc86ba1b0f4058a154d1f8f99909e5434012154164911d
15:12:58.476 [NioClientManager] Wallet -   tx 
8d9ce3a222f9403750311348d84bcdb0740b52866871e56fa26cf7227ced0e1f ->unspent
15:13:03.581 [NioClientManager] Wallet - Balance is now: 4.946747
15:13:10.562 [main] WalletApplication - configuration: prod, 
org.bitcoin.production
15:13:14.798 [main] WalletApplication - wallet loaded from: 
'/data/data/de.schildbach.wallet/files/wallet-protobuf', took 4232ms
15:13:19.775 [main] Configuration - just being used - last used 8 minutes ago
15:13:20.578 [Thread-12575] AbstractWalletActivity - according to 
"http://wallet.schildbach.de/version?package=de.schildbach.wallet¤t=170", 
strongly recommended minimum app version is 165
15:13:20.596 [ModernAsyncTask #2] ExchangeRatesProvider - fetched exchange 
rates from https://api.bitcoinaverage.com/custom/abw (gzip), 4461 chars, took 
678 ms
15:13:24.127 [main] BlockchainServiceImpl - using 
com.google.bitcoin.store.SPVBlockStore
15:13:24.131 [main] AbstractBlockChain - chain head is at height 304323:
v2 block: 
   previous block: 00000000000000003cc9c5ef71f355218bae0c72a369cf351c9bf312bbecf79c
   merkle root: 8d46f01c787974bba0bce2fda08d2ae3bdf0de2881f570b96aeebf3fd69b95b7
   time: [1401955511] Thu Jun 05 03:05:11 CDT 2014
   difficulty target (nBits): 409544770
   nonce: 1586827106

15:13:24.135 [main] AbstractBlockChain - Wallet/chain height mismatch: 304322 
vs 304323
15:13:24.137 [main] AbstractBlockChain - Hashes: 
00000000000000003cc9c5ef71f355218bae0c72a369cf351c9bf312bbecf79c vs 
000000000000000016bc86ba1b0f4058a154d1f8f99909e5434012154164911d
15:13:24.146 [main] BlockchainServiceImpl - service start command: Intent { 
act=de.schildbach.wallet.service.cancel_coins_received 
cmp=de.schildbach.wallet/.service.BlockchainServiceImpl }
15:13:24.301 [main] BlockchainServiceImpl - network is up
15:13:24.303 [main] BlockchainServiceImpl - wallet/blockchain out of sync: 
304322/304323
15:13:24.307 [main] BlockchainServiceImpl - starting peergroup

Original comment by andreas....@gmail.com on 6 Jun 2014 at 5:09

Attachments:

GoogleCodeExporter commented 9 years ago
I think I suggested it several times, but for the sake of completeness I'll add 
my thought:

I wonder in case of that mismatch why do we continue from the higher height of 
the two? If we'd just use the lower that might be an easy fix. Of course it 
would mean the higher block will go into the blockstore a second time. But I 
think that be easily prevented. If its a problem at all -- somehow it must 
already deal with reorgs and that's a similar case.

Original comment by andreas....@gmail.com on 6 Jun 2014 at 5:14

GoogleCodeExporter commented 9 years ago
If the user has lots of zero sized wallet files, that implies that saving is 
failing repeatedly for some reason. Even if we had replaying from the right 
height done properly, I think this user would still have problems.

Original comment by mh.in.en...@gmail.com on 10 Jun 2014 at 9:52

GoogleCodeExporter commented 9 years ago
Maybe in some cases, but I'm sure there a lots of cases that happen only once 
or twice.

Unfortunately, when receiving coins a lot of things are suddenly going on 
(notification, sound system startup, uncommon code paths used, etc.). So I'd 
say the changes of missing transactions due to this are higher than you'd 
expect.

Original comment by andreas....@gmail.com on 10 Jun 2014 at 10:25

GoogleCodeExporter commented 9 years ago
This guy in particular must have an enormous wallet, I'm seeing "Save completed 
in 11466msec" at the top of the log. 11 seconds?! I bet this is a guy who is 
mining straight into his wallet.

Given that, I'd imagine the failed saves are the result of OOM errors. 
Replaying would not resolve the underlying issue here, good though it'd be to 
have it.

Original comment by mh.in.en...@gmail.com on 10 Jun 2014 at 11:19

GoogleCodeExporter commented 9 years ago
Transactions: 341
Inputs: 402
Outputs: 44719 (spent: 39)

So yes, he's one of the users who likely use the app for mining or microjob 
payouts. A lot of people actually do, we should support these valid usecases. 
And it seems to be easy to quickly accumulate that amount of outputs -- first 
install was only February, so not even half a year.

Btw. he's got an Android 4.4.2, memory class 96 MB device. That's not too bad.

Original comment by andreas....@gmail.com on 10 Jun 2014 at 11:38

GoogleCodeExporter commented 9 years ago
I'd like to add that people stung with this issue who empty their wallet and 
delete it will permanently loose the missing coins. I don't have numbers but I 
suspect it happens even if nobody mails me (usually you won't notice).

So I'd say continuing the download from the right block is more than just "nice 
to have".

Original comment by andreas....@gmail.com on 10 Jun 2014 at 11:47

GoogleCodeExporter commented 9 years ago
This issue of missing trasactions is the cause of the most problematic support 
cases.

Original comment by andreas....@gmail.com on 16 Aug 2014 at 9:20

GoogleCodeExporter commented 9 years ago
Well, it helps me if we distinguish between "transactions went missing for no 
obvious reason" and "transactions went missing because the user ran out of RAM 
and the app exploded". The second case isn't easily fixable: once you're out of 
RAM, we can't do much except detect this and refuse to sync further data, at 
which point transactions are still missing but perhaps the UI is nicer. Future 
work can reduce our RAM requirements to stop this happening so often, but still.

On the other hand, the first case might reflect other kinds of bugs that are 
easier to fix.

Original comment by mh.in.en...@gmail.com on 16 Aug 2014 at 11:47

GoogleCodeExporter commented 9 years ago
Bitcoinj detects this (and other) cases already well (mismatch log message).

The only thing that's missing is code to handle a mismatch well -- rather than 
just continuing with missing blocks (== missing transactions). See my comment 
#2.

Of course its bad that the VM just exits. But that's reality. We cannot do much 
about it.

Original comment by andreas....@gmail.com on 16 Aug 2014 at 11:57

GoogleCodeExporter commented 9 years ago
Lets fix this at least for cases where the wallet block height is lower than 
the block store height. If we continue download from the *lower* number rather 
than the higher, the wallet should not be invalid afterwards and SPV block 
store can probably already deal with receiving a block a second time (reorgs?)

So this looks like a one-line change to me. But I don't understand the 
implications.

Original comment by andreas....@gmail.com on 16 Aug 2014 at 12:02

GoogleCodeExporter commented 9 years ago
Yes we can roll back the block store to the max() of all wallet heights. It 
needs some unit tests and maybe a bit of work inside SPVBlockStore.

Original comment by mh.in.en...@gmail.com on 16 Aug 2014 at 12:12

GoogleCodeExporter commented 9 years ago
If you think its doable that way, I can try my luck at a patch. Just one 
question: How does the block store roll back on a re-org? I think we can use 
the exact same logic. A rollback is just a special case of a reorg (from the 
view of the blockstore).

Original comment by andreas....@gmail.com on 16 Aug 2014 at 12:30

GoogleCodeExporter commented 9 years ago
It doesn't. Blocks on both sides of the fork are stored and skipped over when 
walking the chain backwards. SPVBlockStore stores the best chain height and 
hash. It may be that we can just alter that. It's been years since I wrote that 
class so I forgot exactly what would work.

Original comment by mh.in.en...@gmail.com on 16 Aug 2014 at 1:17

GoogleCodeExporter commented 9 years ago
If course it's much more than a one-line change… (-:

https://github.com/bitcoinj/bitcoinj/pull/176

Original comment by andreas....@gmail.com on 16 Aug 2014 at 6:19