ethereum / aleth

Aleth – Ethereum C++ client, tools and libraries
GNU General Public License v3.0
3.97k stars 2.17k forks source link

Assertion failure in TrieDB #4585

Open chfast opened 7 years ago

chfast commented 7 years ago
 ⚡   17:24:10.187|eth  txExec 5865 ms
 ⚡   17:24:10.187|eth  dev::u256 dev::eth::Block::enact(const dev::eth::VerifiedBlockRef&, const dev::eth::BlockChain&) 5865 ms
  ℹ  17:24:10.190|eth  SLOW IMPORT: { "transactions": 87, "blockNumber": 4337888, "gasUsed": 6656766, "gasPerSecond": 1134498, "blockHash": "61861b16…", "writing": 0.000142, "total": 5.86897, "collation": 6.2e-05, "preliminaryChecks": 0.000354, "checkBest": 0.00082, "enactment": 5.86759  }
 ⚡   17:24:10.190|eth  Block import 4337888 5869 ms
eth: /home/chfast/Projects/ethereum/cpp-ethereum/libdevcore/../libdevcore/TrieDB.h:923: void dev::GenericTrieDB<_DB>::mergeAtAux(dev::RLPStream&, const dev::RLP&, dev::NibbleSlice, dev::bytesConstRef) [with _DB = dev::OverlayDB; dev::bytesConstRef = dev::vector_ref<const unsigned char>]: Assertion `!r.isNull()' failed.

Thread 14 "eth" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffe91b5700 (LWP 11753)]
0x00007ffff6907428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007ffff6907428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007ffff690902a in __GI_abort () at abort.c:89
#2  0x00007ffff68ffbd7 in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x152f54a "!r.isNull()", 
    file=file@entry=0x1423c60 "/home/chfast/Projects/ethereum/cpp-ethereum/libdevcore/../libdevcore/TrieDB.h", line=line@entry=923, 
    function=function@entry=0x1586b40 <dev::GenericTrieDB<dev::OverlayDB>::mergeAtAux(dev::RLPStream&, dev::RLP const&, dev::NibbleSlice, dev::vector_ref<unsigned char const>)::__PRETTY_FUNCTION__> "void dev::GenericTrieDB<_DB>::mergeAtAux(dev::RLPStream&, const dev::RLP&, dev::NibbleSlice, dev::bytesConstRef) [with _DB = dev::OverlayDB; dev::bytesConstRef = dev::vector_ref<const unsigned char>]")
    at assert.c:92
#3  0x00007ffff68ffc82 in __GI___assert_fail (assertion=assertion@entry=0x152f54a "!r.isNull()", 
    file=file@entry=0x1423c60 "/home/chfast/Projects/ethereum/cpp-ethereum/libdevcore/../libdevcore/TrieDB.h", line=line@entry=923, 
    function=function@entry=0x1586b40 <dev::GenericTrieDB<dev::OverlayDB>::mergeAtAux(dev::RLPStream&, dev::RLP const&, dev::NibbleSlice, dev::vector_ref<unsigned char const>)::__PRETTY_FUNCTION__> "void dev::GenericTrieDB<_DB>::mergeAtAux(dev::RLPStream&, const dev::RLP&, dev::NibbleSlice, dev::bytesConstRef) [with _DB = dev::OverlayDB; dev::bytesConstRef = dev::vector_ref<const unsigned char>]")
    at assert.c:101
#4  0x0000000000527944 in dev::GenericTrieDB<dev::OverlayDB>::mergeAtAux (this=0x7fffe91ae1d0, _out=..., _orig=..., _k=..., _v=...)
    at /home/chfast/Projects/ethereum/cpp-ethereum/libdevcore/../libdevcore/TrieDB.h:923
#5  0x000000000052834f in dev::GenericTrieDB<dev::OverlayDB>::mergeAt (this=this@entry=0x7fffe91ae1d0, _orig=..., _origHash=..., _k=..., _v=..., _inLine=false)
    at /home/chfast/Projects/ethereum/cpp-ethereum/libdevcore/../libdevcore/TrieDB.h:905
#6  0x000000000052869a in dev::GenericTrieDB<dev::OverlayDB>::mergeAt (this=this@entry=0x7fffe91ae1d0, _orig=..., _k=..., _v=..., _inLine=<optimized out>)
    at /home/chfast/Projects/ethereum/cpp-ethereum/libdevcore/../libdevcore/TrieDB.h:834
#7  0x0000000000527888 in dev::GenericTrieDB<dev::OverlayDB>::mergeAtAux (this=0x7fffe91ae1d0, _out=..., _orig=..., _k=..., _v=...)
    at /home/chfast/Projects/ethereum/cpp-ethereum/libdevcore/../libdevcore/TrieDB.h:926
#8  0x000000000052834f in dev::GenericTrieDB<dev::OverlayDB>::mergeAt (this=this@entry=0x7fffe91ae1d0, _orig=..., _origHash=..., _k=..., _v=..., _inLine=false)
    at /home/chfast/Projects/ethereum/cpp-ethereum/libdevcore/../libdevcore/TrieDB.h:905
#9  0x000000000052869a in dev::GenericTrieDB<dev::OverlayDB>::mergeAt (this=this@entry=0x7fffe91ae1d0, _orig=..., _k=..., _v=..., _inLine=<optimized out>)
    at /home/chfast/Projects/ethereum/cpp-ethereum/libdevcore/../libdevcore/TrieDB.h:834
#10 0x0000000000527888 in dev::GenericTrieDB<dev::OverlayDB>::mergeAtAux (this=0x7fffe91ae1d0, _out=..., _orig=..., _k=..., _v=...)
    at /home/chfast/Projects/ethereum/cpp-ethereum/libdevcore/../libdevcore/TrieDB.h:926
#11 0x000000000052834f in dev::GenericTrieDB<dev::OverlayDB>::mergeAt (this=this@entry=0x7fffe91ae1d0, _orig=..., _origHash=..., _k=..., _v=..., _inLine=false)
    at /home/chfast/Projects/ethereum/cpp-ethereum/libdevcore/../libdevcore/TrieDB.h:905
#12 0x00000000005284bf in dev::GenericTrieDB<dev::OverlayDB>::insert (this=this@entry=0x7fffe91ae1d0, _key=..., _value=...)
    at /home/chfast/Projects/ethereum/cpp-ethereum/libdevcore/../libdevcore/TrieDB.h:785
#13 0x000000000059f171 in dev::SpecificTrieDB<dev::GenericTrieDB<dev::OverlayDB>, dev::FixedHash<32u> >::insert (_value=..., _k=..., this=0x7fffe91ae1d0)
    at /home/chfast/Projects/ethereum/cpp-ethereum/libdevcore/../libdevcore/TrieDB.h:332
#14 dev::FatGenericTrieDB<dev::OverlayDB>::insert (_value=..., _key=..., this=0x7fffe91ae1d0) at /home/chfast/Projects/ethereum/cpp-ethereum/libdevcore/../libdevcore/TrieDB.h:448
#15 dev::SpecificTrieDB<dev::FatGenericTrieDB<dev::OverlayDB>, dev::FixedHash<32u> >::insert (_value=..., _k=..., this=0x7fffe91ae1d0)
    at /home/chfast/Projects/ethereum/cpp-ethereum/libdevcore/../libdevcore/TrieDB.h:332
#16 dev::SpecificTrieDB<dev::FatGenericTrieDB<dev::OverlayDB>, dev::FixedHash<32u> >::insert (_value=std::vector of length 10, capacity 10 = {...}, _k=..., this=0x7fffe91ae1d0)
    at /home/chfast/Projects/ethereum/cpp-ethereum/libdevcore/../libdevcore/TrieDB.h:333
#17 dev::eth::commit<dev::OverlayDB> (_cache=std::unordered_map with 3 elements = {...}, _state=...) at /home/chfast/Projects/ethereum/cpp-ethereum/libethereum/State.cpp:713
#18 0x0000000000594248 in dev::eth::State::commit (this=this@entry=0x7fffe91b2f10, _commitBehaviour=<optimized out>) at /home/chfast/Projects/ethereum/cpp-ethereum/libethereum/State.cpp:221
#19 0x0000000000594b66 in dev::eth::State::execute(dev::eth::EnvInfo const&, dev::eth::SealEngineFace const&, dev::eth::Transaction const&, dev::eth::Permanence, std::function<void (unsigned long, unsigned long, dev::eth::Instruction, boost::multiprecision::number<boost::multiprecision::backends::cpp_int_backend<0u, 0u, (boost::multiprecision::cpp_integer_type)1, (boost::multiprecision::cpp_int_check_type)0, std::allocator<unsigned long long> >, (boost::multiprecision::expression_template_option)1>, boost::multiprecision::number<boost::multiprecision::backends::cpp_int_backend<0u, 0u, (boost::multiprecision::cpp_integer_type)1, (boost::multiprecision::cpp_int_check_type)0, std::allocator<unsigned long long> >, (boost::multiprecision::expression_template_option)1>, boost::multiprecision::number<boost::multiprecision::backends::cpp_int_backend<0u, 0u, (boost::multiprecision::cpp_integer_type)1, (boost::multiprecision::cpp_int_check_type)0, std::allocator<unsigned long long> >, (boost::multiprecision::expression_template_option)1>, dev::eth::VM*, dev::eth::ExtVMFace const*)> const&) (this=this@entry=0x7fffe91b2f10, _envInfo=..., _sealEngine=..., _t=..., _p=_p@entry=dev::eth::Permanence::Committed, 
    _onOp=...) at /home/chfast/Projects/ethereum/cpp-ethereum/libethereum/State.cpp:573
#20 0x00000000004fd95c in dev::eth::Block::execute(dev::eth::LastBlockHashesFace const&, dev::eth::Transaction const&, dev::eth::Permanence, std::function<void (unsigned long, unsigned long, dev::eth::Instruction, boost::multiprecision::number<boost::multiprecision::backends::cpp_int_backend<0u, 0u, (boost::multiprecision::cpp_integer_type)1, (boost::multiprecision::cpp_int_check_type)0, std::allocator<unsigned long long> >, (boost::multiprecision::expression_template_option)1>, boost::multiprecision::number<boost::multiprecision::backends::cpp_int_backend<0u, 0u, (boost::multiprecision::cpp_integer_type)1, (boost::multiprecision::cpp_int_check_type)0, std::allocator<unsigned long long> >, (boost::multiprecision::expression_template_option)1>, boost::multiprecision::number<boost::multiprecision::backends::cpp_int_backend<0u, 0u, (boost::multiprecision::cpp_integer_type)1, (boost::multiprecision::cpp_int_check_type)0, std::allocator<unsigned long long> >, (boost::multiprecision::expression_template_option)1>, dev::eth::VM*, dev::eth::ExtVMFace const*)> const&) (this=this@entry=0x7fffe91b2f10, _lh=..., _t=..., _p=_p@entry=dev::eth::Permanence::Committed, _onOp=...)
    at /home/chfast/Projects/ethereum/cpp-ethereum/libethereum/Block.cpp:665
#21 0x00000000004fe49f in dev::eth::Block::enact (this=this@entry=0x7fffe91b2f10, _block=..., _bc=...) at /home/chfast/Projects/ethereum/cpp-ethereum/libethereum/Block.cpp:496
#22 0x0000000000501a2c in dev::eth::Block::enactOn (this=this@entry=0x7fffe91b2f10, _block=..., _bc=...) at /home/chfast/Projects/ethereum/cpp-ethereum/libethereum/Block.cpp:450
#23 0x0000000000578b18 in dev::eth::BlockChain::import (this=this@entry=0x2ae3d80, _block=..., _db=..., _mustBeNew=_mustBeNew@entry=true)
    at /home/chfast/Projects/ethereum/cpp-ethereum/libethereum/BlockChain.cpp:697
#24 0x000000000057a0c8 in dev::eth::BlockChain::sync (this=0x2ae3d80, _bq=..., _stateDB=..., _max=_max@entry=10) at /home/chfast/Projects/ethereum/cpp-ethereum/libethereum/BlockChain.cpp:469
#25 0x00000000004ec51f in dev::eth::Client::syncBlockQueue (this=this@entry=0x2ae37f0) at /home/chfast/Projects/ethereum/cpp-ethereum/libethereum/Client.cpp:382
---Type <return> to continue, or q <return> to quit---
#26 0x00000000004eca66 in dev::eth::Client::doWork (this=0x2ae37f0, _doWait=<optimized out>) at /home/chfast/Projects/ethereum/cpp-ethereum/libethereum/Client.cpp:652
#27 0x000000000067b374 in dev::Worker::workLoop (this=0x2ae3ce8) at /home/chfast/Projects/ethereum/cpp-ethereum/libdevcore/Worker.cpp:138
#28 0x000000000067b68c in dev::Worker::<lambda()>::operator()(void) const (__closure=0x22afc48) at /home/chfast/Projects/ethereum/cpp-ethereum/libdevcore/Worker.cpp:62
#29 0x00007ffff7278510 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#30 0x00007ffff7bc16ba in start_thread (arg=0x7fffe91b5700) at pthread_create.c:333
#31 0x00007ffff69d93dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
dirtyfilthy commented 6 years ago

This assertion is thrown because as part of the trie insertion process a trie hash reference is reached to a non-existence node. Still trying to figure out how it reaches this point.

gumb0 commented 6 years ago

Hey @dirtyfilthy, thanks for trying to help with this.

As these asserts are not failing again after immediate restart, I suspect that it's not the bug in the trie logic itself, but some data races around accessing the state database. So this is hard to track down, I'm not sure yet how to proceed.

gumb0 commented 6 years ago

MemoryDB & OverlayDB - classes representing state database - even have locks, wich were #ifdefed out at some point. One experiment I have in mind is to try to enable them back and see if it helps with these asserts.

gumb0 commented 6 years ago

One experiment I have in mind is to try to enable them back and see if it helps with these asserts.

I tried that, it didn't help (Anyway OverlayDB is definitely not thread-safe currently and used from different threads)

gumb0 commented 6 years ago

I've added retrying GenericTrie::insert() in case of this assert failure - second attempt failes, too.

This means underlying OverlayDB really contains invalid trie at the moment, i.e. this is kind of evidence against data races during access to this node (but may be there was a race at the previous insert)

I want to try dumping contents of MemoryDB at the moment of failure and making a test out of it.

gumb0 commented 6 years ago

Ok, I've got enough evidence that this is the same Too many open files error as in https://github.com/ethereum/cpp-ethereum/issues/4493

What happens here is that DB lookup fails because of this error, then OverlayDB::lookup() considers any DB failure to be "key not found" (this should be fixed) https://github.com/ethereum/cpp-ethereum/blob/976309b3f559fb2973e885dee65b39652e8a2a29/libdevcore/OverlayDB.cpp#L133

Then to the TrieDB on top of OverlayDB it looks like a corrupted trie (some record missing in the database) and asserts fail