Closed GoogleCodeExporter closed 9 years ago
Looks like it is damaged. I just got this exception which I never got:
E/AndroidRuntime(20011): FATAL EXCEPTION: PeerGroup-4-thread-1
E/AndroidRuntime(20011): java.lang.NullPointerException
E/AndroidRuntime(20011): at
com.google.bitcoin.core.BlockChain.findSplit(BlockChain.java:322)
E/AndroidRuntime(20011): at
com.google.bitcoin.core.BlockChain.connectBlock(BlockChain.java:238)
E/AndroidRuntime(20011): at
com.google.bitcoin.core.BlockChain.add(BlockChain.java:210)
E/AndroidRuntime(20011): at
com.google.bitcoin.core.BlockChain.add(BlockChain.java:135)
E/AndroidRuntime(20011): at
com.google.bitcoin.core.Peer.processBlock(Peer.java:336)
E/AndroidRuntime(20011): at com.google.bitcoin.core.Peer.run(Peer.java:201)
E/AndroidRuntime(20011): at
com.google.bitcoin.core.PeerGroup$4.run(PeerGroup.java:547)
E/AndroidRuntime(20011): at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1076)
E/AndroidRuntime(20011): at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:569)
E/AndroidRuntime(20011): at java.lang.Thread.run(Thread.java:856)
Original comment by andreas....@gmail.com
on 21 Mar 2012 at 8:01
As to why the download peer did not die - are you sure PeerGroup.stop() was
called?
As to the corruption - could it be that two BlockChain objects were created by
the app, both referring to the same underlying BlockStore?
BlockChain.add() is synchronized, so having two download peers should not cause
corruption.
Original comment by mi...@google.com
on 21 Mar 2012 at 9:59
I'm pretty sure stop() was called.
The two BlockChain objects are very well possible, and I cannot prevent it -
since PeerGroup.stop() is not synchronous.
Original comment by andreas....@gmail.com
on 21 Mar 2012 at 10:06
BlockChain objects are created by you, the app owner. :)
So you just have to make sure that you only create one per BlockStore.
Original comment by mi...@google.com
on 21 Mar 2012 at 10:10
Each instance of the Apps Service creates exactly one BlockStore, one
BlockChain and one PeerGroup. But I cannot know how long PeerGroup is still
using its BlockChain/BlockStore because PeerGroup.stop() is not synchronized.
See what I mean?
So if stop() takes 30 minutes (which it did in the above case), chances are
very high that the user restarts the app in this timeframe and thus second
instances of PeerGroup/BlockChain/BlockStore might be created (I'm not sure).
If this is the case, AFAIK I cannot do anything about it.
Making PeerGroup.stop() synchronized is not a solution, because that would
block the main thread (for up to 30 minutes...)
IMHO we really should make sure Peer threads die very soon after stop() is
called - but without leaving BlockStores or Wallets in an inconsistent state.
Perhaps my SQLiteBlockStore would help because databases are typically designed
for concurrent access from different processes.
Original comment by andreas....@gmail.com
on 21 Mar 2012 at 11:08
If the app is really restarted (the old app instance is killed), then the VM is
destroyed.
PeerGroup.stop() should be fast, because it closes all the peer sockets.
Unless of course there is a bug.
After you call PeerGroup.stop(), do you create a new PeerGroup? Maybe I could
look at that part of your code.
Original comment by mi...@google.com
on 21 Mar 2012 at 11:49
If one Thread is busy backscanning the block chain (which can take about 20
seconds), PeerGroup.stop() isn't fast but wait for the thread to complete.
Original comment by andreas....@gmail.com
on 22 Mar 2012 at 8:06
This issue was closed by revision 92398d2c4770.
Original comment by mi...@google.com
on 26 Mar 2012 at 7:18
I'm not confident that this resolves the issue, but lets see.
Original comment by mi...@google.com
on 26 Mar 2012 at 7:19
This issue was closed by revision 9474eaa0d4dc.
Original comment by mi...@google.com
on 26 Mar 2012 at 10:24
[deleted comment]
This issue was closed by revision 1e52a6eccce0.
Original comment by mi...@google.com
on 26 Mar 2012 at 11:16
Another try - this should work better on sockets with a connect in progress.
Original comment by mi...@google.com
on 26 Mar 2012 at 11:18
I did some println testing. Usually I start the app, wait until some
"Connecting to peer" messages appear and then stop the app (which calls
PeerGroup.close())
Even with the latest version, Timeouts talking to Peer do not time out until
60s after.
Some more things:
Peer.disconnect and TCPNetworkConnection.shutdown() seem to called only for
Peers that are already connected - not for Peers that are still waiting to be
connected.
On one Peer, I am observing a disconnect and shutdown but the shutdown seems to
hang indifinitely at socket.shutdownOutput().
And even more: After shutting down, I still see the occasional Connecting to
Peer message.
I think I mentioned this some a year ago: Did you already look at Apache Mina?
They should have solved the same problems, and when I used it 3 years ago for a
low-level-socket protocol, it worked quite flawlessly.
Original comment by andreas....@gmail.com
on 27 Mar 2012 at 9:19
Yeah, this is getting more and more complicated. Definitely worth checking out
a library. Or just switching fully to NIO.
Original comment by hearn@google.com
on 27 Mar 2012 at 9:45
The latest patch appears to resolve the disconnect/shutdown methods not called
for all peers.
However, I still get occasional connects after the PeerGroup is well into its
shutdown process. And I still get Timeout talking to Peer after 60s.
Original comment by andreas....@gmail.com
on 27 Mar 2012 at 4:01
Another try: revision 3bc999a032b7d7cae5b285ddd36868d2ad453459.
Original comment by mi...@google.com
on 27 Mar 2012 at 5:30
The timeout after 60s is still there. For each timeout there is now an
additional disconnect/shutdown after the timeout occured.
Original comment by andreas....@gmail.com
on 27 Mar 2012 at 11:35
Out of ideas - would need to use a debugger to figure it out.
I started work on an NIO solution, but this will take awhile.
Original comment by mi...@google.com
on 28 Mar 2012 at 3:24
I think this is fixed now, we did a lot of improvements in this area and I
didn't see it happen for a long time.
Original comment by hearn@google.com
on 14 Dec 2012 at 4:07
Original issue reported on code.google.com by
andreas....@gmail.com
on 21 Mar 2012 at 7:59