Closed: skaht closed this issue 8 years ago
Confirmed on GNU/Linux GCC.
The merkle root is generated in libbitcoin and tested in libbitcoin-blockchain. Which branch/commit are these built from?
I used and tested master:
libbitcoin: 4d588d4874e14cdd03e5595ba28cd139f25a2fc1
libbitcoin-blockchain: 0dc11e205c7eb7b9ffd24fd78af00ed41dff08e0
In short, my testnet server synced from block 446953 until it reached block 508081 (the last good block height). Now on startup it repeatedly shows:
ERROR [poller] Error storing block [000000000007cb8945a2320c9d4c799eb5ac6fad78c3bd7f318a81580372c35c] merkle root mismatch
Okay, I don't see any obvious causes or explanation for why there would be an issue with that block after over 500k good checks. I'm in the process of working through the network stack, so it may take me some time to get a repro on this.
Agreed. This particular block has 73 txs in it and on inspection, they do appear to differ from what I see was accepted via a testnet explorer. It's almost as if some bad blocks are being propagated and we can't move past them to get the proper/good ones. Just a theory, looking into it off and on as I can.
In this case we should eventually see the valid block and easily move past it. The only way this might not happen, assuming we are getting the valid data, is that the orphan pool size is too small to reorg onto a stronger fork containing the valid blocks. But this implies you have built up a fairly long weaker fork above the fork point. Seems unlikely.
Also agreed. Not sure why else this is the case though. We appear to be receiving multiple copies of this block all with the same header merkle, and the same tx list which doesn't match the block explorer list -- and so the mismatch on computed merkle root. I'm at a bit of a loss at the moment. I was hoping it was a bug in the computation, but it's straight-forward and looks correct on review, which was why I checked each tx hash to verify inputs were the same and found the mismatch. Any other ideas you can think of?
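For anyone repeating the per-transaction check described above, here is a minimal sketch of the standard Bitcoin merkle root computation in Python. It is not the libbitcoin implementation; the function names are made up for illustration, and it assumes the tx hashes are given as hex strings in the byte order a block explorer displays.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Bitcoin's hash function: SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txids_hex):
    """Compute a merkle root from tx hashes given in explorer (display) order.

    Hypothetical helper for illustration only. Hashes are reversed into
    internal byte order before hashing, and the result is reversed back.
    """
    if not txids_hex:
        raise ValueError("a block must contain at least one transaction")

    level = [bytes.fromhex(txid)[::-1] for txid in txids_hex]
    while len(level) > 1:
        # An odd-sized level pairs its last hash with itself.
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0][::-1].hex()
```

Feeding it the 73 tx hashes from the received block and from the explorer listing, then comparing each result with the merkle root in the header, would show which tx list the header actually commits to.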
A regression in our tx hash generation seems like the next place to look.
Can recreate the issue very quickly for block 508082. (Have an uncorrupted testnet blockchain starting at block 507972.) Upping the following in the config file makes no difference for the latest master build from yesterday.
network.host_pool_capacity = 5000
node.transaction_pool_capacity = 8000
If there was a pool size issue it would show in the logs, but if you want to expand the block orphan pool you need this:
[blockchain]
block_pool_capacity = 100
For:
blockchain.block_pool_capacity = 500
Still run into this:
14:39:29.315334 INFO [poller] Block #508078 0000000002c0a03bb4d0e86fde3a970278457f5e1947ed32f72d94b28d3c14e2
14:39:29.321844 INFO [poller] Block #508079 0000000000000b7cd4cbeb97b3d4c8684f4a2f2e969dc8fa69746ac33ce0e72c
14:39:29.334016 ERROR [poller] Error storing block [000000000195af8dfc54dfd24e5370e030c6c51bc8837d69cc50bd7c47abcb1e] previous block failed to validate
14:39:29.340032 ERROR [poller] Error storing block [000000000007cb8945a2320c9d4c799eb5ac6fad78c3bd7f318a81580372c35c] merkle root mismatch
14:39:30.617506 ERROR [poller] Error storing block [000000000007cb8945a2320c9d4c799eb5ac6fad78c3bd7f318a81580372c35c] merkle root mismatch
mainnet reproduces at block 322652 (master)
00:16:38.747398 INFO [poller] Block #322541 000000000000000004f16f65eb97f6d1a5269be2b06a53a8cf2945af6baddbad
00:16:40.530504 INFO [poller] Block #322554 00000000000000001d9b4a7ec4e0ee92843614f6f79ce9a5bd168c17f0cd74a3
00:16:41.084537 INFO [poller] Block #322560 000000000000000002df2dd9d4fe0578392e519610e341dd09025469f101cfa1
00:16:50.827114 INFO [poller] Block #322651 000000000000000003254d09cce46cb28442f36e331b43b6008208686930ec13
00:16:54.025303 ERROR [poller] Error storing block [00000000000000000c601eb4dcbe3870de246ba8f352e044b690cb53d9479067] from [[2001:4800:7819:104:be76:4eff:fe05:c9a0]:8333] previous block failed to validate
00:16:59.979656 ERROR [poller] Error storing block [00000000000000001db1c8ff51a040d517d88468148f1d07d84e4ec2fbeb1e26] from [[2400:8900::f03c:91ff:fe6e:823e]:8333] merkle root mismatch
00:17:04.299912 INFO [network] Connected to outbound channel [108.61.190.77:8333]
00:17:05.100959 INFO [network] Connected to outbound channel [65.26.30.171:8333]
00:17:06.483041 ERROR [poller] Error storing block [00000000000000001db1c8ff51a040d517d88468148f1d07d84e4ec2fbeb1e26] from [108.61.190.77:8333] merkle root mismatch
00:17:07.171082 ERROR [poller] Error storing block [00000000000000001db1c8ff51a040d517d88468148f1d07d84e4ec2fbeb1e26] from [65.26.30.171:8333] merkle root mismatch
00:17:12.356389 INFO [network] Connected to outbound channel [51.254.71.147:8333]
00:17:16.267621 ERROR [poller] Error storing block [00000000000000001db1c8ff51a040d517d88468148f1d07d84e4ec2fbeb1e26] from [51.254.71.147:8333] merkle root mismatch
00:17:46.370439 INFO [network] Connected to outbound channel [24.125.54.27:8333]
00:17:46.702464 INFO [network] Connected to outbound channel [83.233.54.154:8221]
00:17:49.743639 ERROR [poller] Error storing block [00000000000000001db1c8ff51a040d517d88468148f1d07d84e4ec2fbeb1e26] from [24.125.54.27:8333] merkle root mismatch
00:17:51.115726 ERROR [poller] Error storing block [00000000000000001db1c8ff51a040d517d88468148f1d07d84e4ec2fbeb1e26] from [83.233.54.154:8221] merkle root mismatch
00:18:08.038723 INFO [network] Connected to outbound channel [94.112.102.36:8333]
00:18:11.313922 ERROR [poller] Error storing block [00000000000000001db1c8ff51a040d517d88468148f1d07d84e4ec2fbeb1e26] from [94.112.102.36:8333] merkle root mismatch
00:18:24.398691 ERROR [poller] Error storing block [00000000000000001db1c8ff51a040d517d88468148f1d07d84e4ec2fbeb1e26] from [[2001:470:1f0b:ad6::2]:8333] merkle root mismatch
https://github.com/libbitcoin/libbitcoin/pull/335 doesn't resolve the above issue on mainnet.
Issue not resolved at this end with a preliminary non-production build that includes the fix to responder.cpp from libbitcoin-node commit cfa23214906172df9bb5f569b0352f1582ea1005, documented at https://github.com/libbitcoin/libbitcoin-server/issues/136.
I can test on mainnet -- but just to clarify, you're both testing @thecodefactory/libbitcoin, correct? The PR was not merged before the large merge to master from @evoskuil
If so, might also be a good idea to test master from all of the trees before that large merge in case another regression was added despite that fix.
That thought went through my mind after my last post. Looked at the postings at https://github.com/thecodefactory and noticed they pointed me back to Eric's distribution. Point me to your distributions and I'll give them a test drive.
The code is here: https://github.com/thecodefactory/libbitcoin/tree/bugfix1
But it's not that simple since I rebased all dependent trees against master after the merge from @evoskuil
So in other words, it can't work as intended any longer. I can try to track down what the state of the trees looked like before the merge because I can confirm that worked here. I'd say to hang tight for a minute while I track that down.
Still a github newbie. Was using bb9c7aeacba81e3ed869d89d1a2352b6de57b786 for libbitcoin. Noticed that src/chain/operation.cpp did not match the changes at https://github.com/thecodefactory/libbitcoin/commit/a214f9549b3445d11baa92a951f7d4c1a8f6bafb.
Will apply https://github.com/thecodefactory/libbitcoin/blob/a214f9549b3445d11baa92a951f7d4c1a8f6bafb/src/chain/operation.cpp that bb9c7aeacba81e3ed869d89d1a2352b6de57b786 appears to be missing.
I wouldn't recommend that -- let me find the correct version.
You are correct... The testsuite failed after I updated the operation.cpp file. Greatly appreciate your assistance.
Better yet, I think I see the issue and want to test it against the latest master. There is a bug in the PR, so best it wasn't merged. Will try to rebase and test against current master.
New PR issued: https://github.com/libbitcoin/libbitcoin/pull/337
If you build cleanly using the install.sh from libbitcoin-server/master, just change the line:
build_from_github libbitcoin libbitcoin master $PARALLEL "$@" $BITCOIN_OPTIONS
to:
build_from_github thecodefactory libbitcoin master $PARALLEL "$@" $BITCOIN_OPTIONS
Have an OSX change to libbitcoin-node that must be supported at this end, so I can't use the install.sh. Performed a
% git clone --branch master --single-branch https://github.com/thecodefactory/libbitcoin
and currently building with custom scripts that I have a reasonable amount of faith in.
Ah, ok. I think that should work then as long as the other trees are mostly resembling master across the board.
The other trees match, except libbitcoin-node that has a tweak. However, the Testsuite just failed for libbitcoin. Just built again without the make check.
Haven't tried the tests since the merge (didn't build with tests enabled).
Dude... It seems to be functioning:-)
22:55:53.743701 INFO [poller] Block #508502 0000000002372bd3e6abb5ceaddf6f9dee7a34ccf6c50a25c3c9b4f740002fa6
Glad to hear!
You're da man!
It was actually a regression that was added (i.e. it appears it used to work). Of course, it took me a long time to find out what it was exactly ... but hey, at least it's isolated now ;-)
CTRL-C also seems to work properly :-) Need to see if the chain reaches around block 604K overnight.
I recommend adding a checkpoint to your etc/bs.cfg:
checkpoint = 000000000000624f06c69d3a9fe8d25e0a9030569128d63ad1b704bbb3059a16:600000
Will add that checkpoint to my bs-testnet.cfg file after the block height reaches 600K. Any config file tweaks that can accelerate the chain building process that you recommend?
If you add that checkpoint now, it will skip the long validation until it reaches 600K (i.e. speed up the process).
Thought the checkpoints only speed up the server booting process by short-circuiting the integrity validation of the blockchain by establishing jump points for trust.
Already have the following checkpoints added:
checkpoint = 000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f:0
checkpoint = 00000000009e2958c15ff9290d571bf9459e93b19765c6801ddeccadbb160a1e:100000
checkpoint = 0000000000287bffd321963ef05feab753ebe274e1d78b2fd4e2bfe9ad3aa6f2:200000
checkpoint = 000000000000226f7618566e70a2b5e020e29579b46743f05348427239bf41a1:300000
checkpoint = 000000000598cbbb1e79057b79eef828c495d4fc31050e6b179c57d07d00367c:400000
checkpoint = 000000000001a7c0aaa2630fbb2c0e476aafffc60f82177375b2aaa22209f606:500000
checkpoint = 0000000000000269b72c62fb7517dd489a3069e63d3d154ed453a9f3664214e4:531960
checkpoint = 000000000000242db10b3171738a11367742bb894559dc420e35921b29cdafa9:579304
Yep, that's correct. If you leave it overnight, yours will do longer validation checks after 579304 (instead of 600k as provided). Either way. It was just a recommendation to speed things up.
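To make the `hash:height` format concrete, here is a small Python sketch that reads `checkpoint = ...` lines like the ones above and reports the highest configured checkpoint. It only illustrates the config format as it appears in this thread; `parse_checkpoints` is a hypothetical helper, not libbitcoin-server's actual parser.

```python
def parse_checkpoints(config_text):
    """Return (height, block_hash) pairs sorted by height.

    Hypothetical helper: accepts lines of the form
    'checkpoint = <64 hex chars>:<height>' and ignores everything else.
    """
    checkpoints = []
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if not separator or key.strip() != "checkpoint":
            continue
        block_hash, _, height = value.strip().rpartition(":")
        if len(block_hash) != 64:
            raise ValueError("expected a 64-character block hash: " + line)
        int(block_hash, 16)  # raises ValueError if the hash is not hex
        checkpoints.append((int(height), block_hash))
    return sorted(checkpoints)


sample = """
checkpoint = 0000000000000269b72c62fb7517dd489a3069e63d3d154ed453a9f3664214e4:531960
checkpoint = 000000000000242db10b3171738a11367742bb894559dc420e35921b29cdafa9:579304
checkpoint = 000000000000624f06c69d3a9fe8d25e0a9030569128d63ad1b704bbb3059a16:600000
"""
highest_height, highest_hash = parse_checkpoints(sample)[-1]
print(highest_height)  # 600000
```

With the 600000 entry present, the highest checkpoint moves from 579304 to 600000, which is the difference discussed above.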
Started developing a list for network magic numbers in hex to eventually convert to decimal. Here is what I have so far:
# Magic value, Magic Number, Network ID, P2P_PREFIX
# unsigned char pchMessageStart[4] = { 0xf9, 0xbe, 0xb4, 0xd9 }; // d9b4bef9
# main 0xD9B4BEF9 https://github.com/bitcoin/bitcoin/blob/master/src/chainparams.cpp#L87
# testnet 0xDAB5BFFA https://github.com/bitcoin/bitcoin/blob/master/src/chainparams.cpp#L221
# testnet3 0x0709110B https://github.com/bitcoin/bitcoin/blob/master/src/chainparams.cpp#L158
# LTC mainnet 0xDBB6C0FB https://github.com/litecoin-project/litecoin/blob/master-0.10/src/chainparams.cpp#L117
# LTC testnet 0xDCB7C1FC https://github.com/litecoin-project/litecoin/blob/master-0.10/src/chainparams.cpp#L117
# RDD mainnet 0xDBB6C0FB https://github.com/reddcoin-project/reddcoin-seeder/blob/master/protocol.cpp#L25 unsigned char pchMessageStart[4] = { 0xfb, 0xc0, 0xb6, 0xdb };
# DOGE mainnet 0xC0C0C0C0 https://github.com/dogecoin/dogecoin/blob/master/src/chainparams.cpp#L92
# Dash mainnet 0xBD6B0CBF https://github.com/dashpay/dash/blob/master/src/chainparams.cpp#L122
# PPC mainnet
# NMC mainnet 0xFEB4BEF9 https://github.com/domob1812/namecore/blob/master/src/chainparams.cpp#L117
# FTC mainnet fcd9b7dd http://forum.feathercoin.com/topic/7084/ufo-coin-relaunched-with-some-help-from-bushstar/13
# BLK mainnet 0x05223570 https://github.com/rat4/blackcoin/blob/master/src/chainparams.cpp#L54
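The hex-to-decimal conversion for the list above can be sketched in Python as follows. It assumes, as the `pchMessageStart` comment in the list indicates, that the four wire bytes are read as a little-endian 32-bit integer; `magic_from_wire_bytes` is a made-up name for illustration.

```python
def magic_from_wire_bytes(wire_bytes: bytes) -> int:
    """Interpret the 4 message-start bytes as a little-endian uint32."""
    if len(wire_bytes) != 4:
        raise ValueError("network magic must be exactly 4 bytes")
    return int.from_bytes(wire_bytes, "little")


# Bitcoin mainnet: pchMessageStart = { 0xf9, 0xbe, 0xb4, 0xd9 }
mainnet = magic_from_wire_bytes(bytes([0xF9, 0xBE, 0xB4, 0xD9]))
print(hex(mainnet), mainnet)  # 0xd9b4bef9 3652501241

# testnet3 from the list above, starting from the hex value instead.
testnet3 = 0x0709110B
print(testnet3)                            # 118034699
print(testnet3.to_bytes(4, "little").hex())  # 0b110907 (bytes as sent on the wire)
```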
Cool. It will be interesting to see how much the protocols differ at the network level (ie excepting block and tx messages).
libbitcoin::network::p2p seeds, connects, accepts and maintains connections using version, address and ping protocols alone. Changing the magic number via config should allow this to work with any Bitcoin-based network.
I'm presently working on libbitcoin-node, injecting block and tx protocols, and implementing an additional session for initial block sync - headers first, of course.
The testnet blockchain finished building at this end. Running the server behind a NAPT firewall without port forwarding enabled.
18:17:01.019045 INFO [poller] Block #604601 0000000000e4a878ebe0e2735c869d2c67f4e03005dfd57bf572ce7d527f40c2
Server seems much more stable. No hanging or crashes to report. Will start examining local bx & bs interactions.
Resolved by PR libbitcoin/libbitcoin#337
For commit d0eff8a6454e270dee97928764d772295b789e71, rebuilding the blockchain from scratch.
Stopping and re-starting bitcoin-server doesn't overcome a merkle root issue. Any means to roll back a local blockchain to an earlier height without having to untar an older image of the blockchain directory?
https://sandbox.coinbase.com/network/blocks/000000000007cb8945a2320c9d4c799eb5ac6fad78c3bd7f318a81580372c35c
The problem repeated after rolling my blockchain directory back around 12K blocks from an older tar image.