DigiByte-Core / digibyte

DigiByte Core 7.17.3 - CURRENT (5-12-2021) - 8.22.0 Development
https://digibyte.org
MIT License

WIP: Feature/8.22.0 Random Crash Bug & Missing Dandelion Relay TX Code #77

Closed: JaredTate closed this 8 months ago

JaredTate commented 2 years ago

Over the course of the past few weeks I have experienced a hard-to-track-down bug where the v8.22.0 test client https://github.com/DigiByte-Core/digibyte/pull/63 randomly crashes. I have seen this on both OS X and Ubuntu with multiple clients. After reproducing the crash enough times, I have been able to track it down to what I believe are dandelion-related issues, triggered when the client is asked to take part in a TX relay during the stem phase of a dandelion TX from a 7.17 client. The common error repeats itself hundreds of times and looks like:

2021-10-23T23:39:01Z received: inv (37 bytes) peer=33
2021-10-23T23:39:01Z got inv: dandeliontx ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff  new peer=33
2021-10-23T23:39:01Z sending getdata (36 bytes) peer=33
2021-10-23T23:39:01Z sending inv (37 bytes) peer=33
2021-10-23T23:39:01Z received: getdata (36 bytes) peer=33
2021-10-23T23:39:01Z ProcessMessages(getdata, 36 bytes): Exception 'CDataStream::read(): end of data: iostream error' (NSt8ios_base7failureB5cxx11E) caught
2021-10-23T23:39:01Z received: inv (37 bytes) peer=33
2021-10-23T23:39:01Z got inv: dandeliontx ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff  new peer=33
2021-10-23T23:39:01Z sending getdata (36 bytes) peer=33
2021-10-23T23:39:01Z sending inv (37 bytes) peer=33
2021-10-23T23:39:01Z received: getdata (36 bytes) peer=33
2021-10-23T23:39:01Z ProcessMessages(getdata, 36 bytes): Exception 'CDataStream::read(): end of data: iostream error' (NSt8ios_base7failureB5cxx11E) caught

Here is a Pastebin with 5,000 lines of the same error from the DigiHash server: https://pastebin.com/buhkFkxM

Basically, I have watched clients crash a dozen times or more with this being the main error message:

2021-10-23T23:39:01Z received: getdata (36 bytes) peer=33
2021-10-23T23:39:01Z ProcessMessages(getdata, 36 bytes): Exception 'CDataStream::read(): end of data: iostream error' (NSt8ios_base7failureB5cxx11E) caught

The only way to make sure you actually see that error is to have debug=1 in your digibyte.conf, or to pass -debug=1 on the command line when starting digibyted or digibyte-qt.
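For example, a minimal digibyte.conf sketch for this (debug=net would limit the logging to just the p2p send/receive lines shown above):

# digibyte.conf
debug=1      # log every category
# debug=net  # or: only the p2p traffic

Or on the command line:

./digibyted -debug=1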

What seemed odd at first was that, while running 4 instances of digibyted on that server, only 1 at a time would randomly crash. That actually fits the theory: only the client involved in relaying a dandelion tx during the stem phase would be affected.

To further track down the issue we attempted to add "disabledandelion=1" to digibyte.conf, only to find that it no longer worked: the code that actually disables dandelion is missing. Diving further in, it became apparent that RelayWalletTransaction is now SubmitMemoryPoolAndRelay.

So in the new 8.22 code we have: https://github.com/DigiByte-Core/digibyte/blob/feature/8.22.0/src/wallet/wallet.cpp#L1712

bool CWalletTx::SubmitMemoryPoolAndRelay(std::string& err_string, bool relay)
{
    // Can't relay if wallet is not broadcasting
    if (!pwallet->GetBroadcastTransactions()) return false;
    // Don't relay abandoned transactions
    if (isAbandoned()) return false;
    // Don't try to submit coinbase transactions. These would fail anyway but would
    // cause log spam.
    if (IsCoinBase()) return false;
    // Don't try to submit conflicted or confirmed transactions.
    if (GetDepthInMainChain() != 0) return false;

    // Submit transaction to mempool for relay
    pwallet->WalletLogPrintf("Submitting wtx %s to mempool for relay\n", GetHash().ToString());
    // We must set fInMempool here - while it will be re-set to true by the
    // entered-mempool callback, if we did not there would be a race where a
    // user could call sendmoney in a loop and hit spurious out of funds errors
    // because we think that this newly generated transaction's change is
    // unavailable as we're not yet aware that it is in the mempool.
    //
    // Irrespective of the failure reason, un-marking fInMempool
    // out-of-order is incorrect - it should be unmarked when
    // TransactionRemovedFromMempool fires.
    bool ret = pwallet->chain().broadcastTransaction(tx, pwallet->m_default_max_tx_fee, relay, err_string);
    fInMempool |= ret;
    return ret;
}

However, the 7.17.3 code reads: https://github.com/DigiByte-Core/digibyte/blob/develop/src/wallet/wallet.cpp#L1828

bool CWalletTx::RelayWalletTransaction(CConnman* connman)
{
    assert(pwallet->GetBroadcastTransactions());
    if (!IsCoinBase() && !isAbandoned() && GetDepthInMainChain() == 0)
    {
        LogPrintf("Inside IF Test \n");
        CValidationState state;
        /* GetDepthInMainChain already catches known conflicts. */
        LogPrintf("Transaction in memory pool: %u, Accepted To MemoryPool: %u", InMempool(), AcceptToMemoryPool(maxTxFee, state));
        if (InMempool() || AcceptToMemoryPool(maxTxFee, state)) {
            pwallet->WalletLogPrintf("Relaying wtx %s\n", GetHash().ToString());
            if (connman) {
                if (!gArgs.GetBoolArg("-disabledandelion", DEFAULT_DISABLE_DANDELION)) {
                    int64_t nCurrTime = GetTimeMicros();
                    int64_t nEmbargo = 1000000*DANDELION_EMBARGO_MINIMUM+PoissonNextSend(nCurrTime, DANDELION_EMBARGO_AVG_ADD);
                    connman->insertDandelionEmbargo(GetHash(),nEmbargo);
                    LogPrint(BCLog::DANDELION, "dandeliontx %s embargoed for %d seconds\n", GetHash().ToString(), (nEmbargo-nCurrTime)/1000000);
                    CInv inv(MSG_DANDELION_TX, GetHash());
                    return connman->localDandelionDestinationPushInventory(inv);
                } else {
                    CInv inv(MSG_TX, GetHash());
                    connman->ForEachNode([&inv](CNode* pnode)
                    {
                        pnode->PushInventory(inv);
                    });
                    return true;
                }
            }
        }
    }
    return false;
}

So it appears the BTC devs reworked the relay TX code in an attempt to enhance privacy, which breaks our dandelion implementation, as seen here: https://github.com/bitcoin/bitcoin/issues/15734

And here: https://github.com/bitcoin/bitcoin/pull/16698

The original Dandelion implementation has not been reworked by its original developers since their 2018 update. The entire dandelion implementation can be seen here: https://github.com/dandelion-org/bitcoin/commit/d043e36bbe9249a78cf751c80b8d876b7d9f07ea

If we can re-enable "disabledandelion", that should prevent the random crash bug; however, we still need to fully fix the dandelion relay TX feature. All help with this is much appreciated. As stated above, it has taken a couple of weeks to get this far into understanding the problem, as it was not apparent at all in the beginning what was triggering the random crashes.

This issue will most likely go a bit deeper, but it definitely seems to be an issue with relaying txs. I just hit this error locally on my Mac client right before it crashed:


2021-10-24T16:11:53Z got inv: tx a938e42a4c436db4c0bdf0c8b6dafc05d274404451e2a12070031707eb558728  new peer=1
2021-10-24T16:11:53Z Requesting tx a938e42a4c436db4c0bdf0c8b6dafc05d274404451e2a12070031707eb558728 peer=1
2021-10-24T16:11:53Z sending getdata (37 bytes) peer=1
2021-10-24T16:11:53Z received: tx (223 bytes) peer=1
2021-10-24T16:11:53Z Enqueuing TransactionAddedToMempool: txid=a938e42a4c436db4c0bdf0c8b6dafc05d274404451e2a12070031707eb558728 wtxid=a938e42a4c436db4c0bdf0c8b6dafc05d274404451e2a12070031707eb558728
2021-10-24T16:11:53Z Blockpolicy error mempool tx a938e42a4c436db4c0bdf0c8b6dafc05d274404451e2a12070031707eb558728 already being tracked
2021-10-24T16:11:53Z Enqueuing TransactionAddedToMempool: txid=a938e42a4c436db4c0bdf0c8b6dafc05d274404451e2a12070031707eb558728 wtxid=a938e42a4c436db4c0bdf0c8b6dafc05d274404451e2a12070031707eb558728
2021-10-24T16:11:53Z AcceptToMemoryPool: peer=1: accepted a938e42a4c436db4c0bdf0c8b6dafc05d274404451e2a12070031707eb558728 (poolsz 1 txn, 1 kB)
2021-10-24T16:11:53Z TransactionAddedToMempool: txid=a938e42a4c436db4c0bdf0c8b6dafc05d274404451e2a12070031707eb558728 wtxid=a938e42a4c436db4c0bdf0c8b6dafc05d274404451e2a12070031707eb558728
2021-10-24T16:11:53Z TransactionAddedToMempool: txid=a938e42a4c436db4c0bdf0c8b6dafc05d274404451e2a12070031707eb558728 wtxid=a938e42a4c436db4c0bdf0c8b6dafc05d274404451e2a12070031707eb558728
2021-10-24T16:11:54Z sending inv (37 bytes) peer=14
2021-10-24T16:11:54Z sending inv (37 bytes) peer=4
2021-10-24T16:11:54Z received: inv (37 bytes) peer=8
2021-10-24T16:11:54Z got inv: tx a938e42a4c436db4c0bdf0c8b6dafc05d274404451e2a12070031707eb558728  have peer=8
2021-10-24T16:11:55Z sending inv (37 bytes) peer=13
2021-10-24T16:11:55Z received: ping (8 bytes) peer=1
2021-10-24T16:11:55Z sending pong (8 bytes) peer=1
2021-10-24T16:11:55Z sending inv (37 bytes) peer=5
2021-10-24T16:11:56Z received: inv (73 bytes) peer=10
2021-10-24T16:11:56Z got inv: tx a938e42a4c436db4c0bdf0c8b6dafc05d274404451e2a12070031707eb558728  have peer=10
2021-10-24T16:11:56Z got inv: tx ec35e6a0b9cf38e30ee1179b21b53b099856ce648ca1ff3eb3f24ebf9b67294f  new peer=10
2021-10-24T16:11:56Z Requesting tx ec35e6a0b9cf38e30ee1179b21b53b099856ce648ca1ff3eb3f24ebf9b67294f peer=10
2021-10-24T16:11:56Z sending getdata (37 bytes) peer=10
2021-10-24T16:11:56Z sending inv (37 bytes) peer=9
2021-10-24T16:11:56Z received: tx (259 bytes) peer=10
2021-10-24T16:11:56Z Enqueuing TransactionAddedToMempool: txid=ec35e6a0b9cf38e30ee1179b21b53b099856ce648ca1ff3eb3f24ebf9b67294f wtxid=ec35e6a0b9cf38e30ee1179b21b53b099856ce648ca1ff3eb3f24ebf9b67294f
2021-10-24T16:11:56Z Blockpolicy error mempool tx ec35e6a0b9cf38e30ee1179b21b53b099856ce648ca1ff3eb3f24ebf9b67294f already being tracked
2021-10-24T16:11:56Z Enqueuing TransactionAddedToMempool: txid=ec35e6a0b9cf38e30ee1179b21b53b099856ce648ca1ff3eb3f24ebf9b67294f wtxid=ec35e6a0b9cf38e30ee1179b21b53b099856ce648ca1ff3eb3f24ebf9b67294f
2021-10-24T16:11:56Z AcceptToMemoryPool: peer=10: accepted ec35e6a0b9cf38e30ee1179b21b53b099856ce648ca1ff3eb3f24ebf9b67294f (poolsz 2 txn, 2 kB)
2021-10-24T16:11:56Z TransactionAddedToMempool: txid=ec35e6a0b9cf38e30ee1179b21b53b099856ce648ca1ff3eb3f24ebf9b67294f wtxid=ec35e6a0b9cf38e30ee1179b21b53b099856ce648ca1ff3eb3f24ebf9b67294f
2021-10-24T16:11:56Z TransactionAddedToMempool: txid=ec35e6a0b9cf38e30ee1179b21b53b099856ce648ca1ff3eb3f24ebf9b67294f wtxid=ec35e6a0b9cf38e30ee1179b21b53b099856ce648ca1ff3eb3f24ebf9b67294f
2021-10-24T16:11:56Z received: inv (37 bytes) peer=8
2021-10-24T16:11:56Z got inv: tx ec35e6a0b9cf38e30ee1179b21b53b099856ce648ca1ff3eb3f24ebf9b67294f  have peer=8
2021-10-24T16:11:57Z sending inv (37 bytes) peer=13
2021-10-24T16:11:57Z sending inv (37 bytes) peer=14
2021-10-24T16:11:57Z received: inv (37 bytes) peer=9
2021-10-24T16:11:57Z got inv: tx ec35e6a0b9cf38e30ee1179b21b53b099856ce648ca1ff3eb3f24ebf9b67294f  have peer=9
2021-10-24T16:11:58Z sending inv (37 bytes) peer=4
2021-10-24T16:11:58Z received: getdata (37 bytes) peer=13
2021-10-24T16:11:58Z received getdata (1 invsz) peer=13
2021-10-24T16:11:58Z received getdata for: witness-tx 8b9f8731d0337676c1c9ed56177ca175bf0e4255f4e080f73eb5774b50d14e69 peer=13
barrystyle commented 2 years ago

@JaredTate check node/transaction.cpp, which is where the majority of the mempool submit code got moved to. I've also noticed that the txid/wtxid of a given transaction are quite often the same, which isn't correct (there are actually assert conditions throughout the code for when this behaviour is seen).
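For context, the two hashes come from different serializations, so only a legacy transaction carrying no witness data should ever show the same value for both. A minimal sketch using the upstream CTransaction accessors:

// txid  = hash of the transaction serialized *without* witness data
// wtxid = hash of the transaction serialized *with* witness data
const uint256& txid  = tx->GetHash();
const uint256& wtxid = tx->GetWitnessHash();
// Equal only for transactions with no witness; a segwit tx
// reporting txid == wtxid points at a serialization/migration bug.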

barrystyle commented 2 years ago

[Screenshot from 2021-10-25 13-49-26]

JaredTate commented 2 years ago

Barry, I appreciate you helping take a look at this. I see you are using gdb to debug, which reminded me of something good to share with the community.

I wanted to share a debugging guide that was originally created for Bitcoin but is very applicable to DGB; it is geared towards Mac, though much of it applies on Linux too. Basically, it's important to enable debugging both while compiling and while running digibyted / digibyte-qt. Typically, once you get an indicator of where an error might be by looking at the debug.log in the default DGB data directory, you can take a deeper look with lldb (on Mac) by setting breakpoints, examine memory leaks with valgrind, or enable core dumps on macOS. However, without a general idea of where to look from debug.log, valgrind and lldb can spit out a lot of unhelpful info to sort through.

It is also useful to run these tools against the unit tests and functional tests, which is even more of a reason we need to make sure all tests are updated. The idea is to step through the problem to debug it, which can be challenging depending on what's going on. You can see the guide here: https://gist.github.com/fjahr/2cd23ad743a2ddfd4eed957274beca0f
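As a concrete example, a typical lldb session against a debug build might look like this (the paths and breakpoint location are just illustrative):

$ lldb src/digibyted
(lldb) settings set -- target.run-args -debug=1
(lldb) breakpoint set --file net_processing.cpp --line 2012
(lldb) run
# ...reproduce the crash or hit the breakpoint, then:
(lldb) bt              # full backtrace
(lldb) up              # walk up one frame at a time
(lldb) frame variable  # inspect locals in the current frame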

JaredTate commented 2 years ago

After digging deeper, I believe this 2019 BTC pull request and its changes are the crux of the problem: https://github.com/bitcoin/bitcoin/pull/15713/files

Basically, AcceptToMemoryPool and RelayWalletTransaction were combined to create SubmitMemoryPoolAndRelay. These changes inadvertently removed our ability to disable dandelion at all, and might be causing other issues with our implementation and with relaying dandelion txs, as well as other txs. There was a lot of relay rework done, but the above BTC PR is a good place to start looking for where else things might be breaking.

-disabledandelion & DEFAULT_DISABLE_DANDELION are completely missing from any part of the current 8.22.0 code.

Our 7.17.3 RelayWalletTransaction looks as follows: https://github.com/DigiByte-Core/digibyte/blob/develop/src/wallet/wallet.cpp#L1828

bool CWalletTx::RelayWalletTransaction(CConnman* connman)
{
    assert(pwallet->GetBroadcastTransactions());
    if (!IsCoinBase() && !isAbandoned() && GetDepthInMainChain() == 0)
    {
        LogPrintf("Inside IF Test \n");
        CValidationState state;
        /* GetDepthInMainChain already catches known conflicts. */
        LogPrintf("Transaction in memory pool: %u, Accepted To MemoryPool: %u", InMempool(), AcceptToMemoryPool(maxTxFee, state));
        if (InMempool() || AcceptToMemoryPool(maxTxFee, state)) {
            pwallet->WalletLogPrintf("Relaying wtx %s\n", GetHash().ToString());
            if (connman) {
                if (!gArgs.GetBoolArg("-disabledandelion", DEFAULT_DISABLE_DANDELION)) {
                    int64_t nCurrTime = GetTimeMicros();
                    int64_t nEmbargo = 1000000*DANDELION_EMBARGO_MINIMUM+PoissonNextSend(nCurrTime, DANDELION_EMBARGO_AVG_ADD);
                    connman->insertDandelionEmbargo(GetHash(),nEmbargo);
                    LogPrint(BCLog::DANDELION, "dandeliontx %s embargoed for %d seconds\n", GetHash().ToString(), (nEmbargo-nCurrTime)/1000000);
                    CInv inv(MSG_DANDELION_TX, GetHash());
                    return connman->localDandelionDestinationPushInventory(inv);
                } else {
                    CInv inv(MSG_TX, GetHash());
                    connman->ForEachNode([&inv](CNode* pnode)
                    {
                        pnode->PushInventory(inv);
                    });
                    return true;
                }
            }
        }
    }
    return false;
}

Whereas our 7.17.3 AcceptToMemoryPool looks like: https://github.com/DigiByte-Core/digibyte/blob/develop/src/wallet/wallet.cpp#L4429

bool CWalletTx::AcceptToMemoryPool(const CAmount& nAbsurdFee, CValidationState& state)
{
    // We must set fInMempool here - while it will be re-set to true by the
    // entered-mempool callback, if we did not there would be a race where a
    // user could call sendmoney in a loop and hit spurious out of funds errors
    // because we think that this newly generated transaction's change is
    // unavailable as we're not yet aware that it is in the mempool.
    bool ret;
    if (!gArgs.GetBoolArg("-disabledandelion", DEFAULT_DISABLE_DANDELION)) {
        ret = ::AcceptToMemoryPool(stempool, state, tx, nullptr /* pfMissingInputs */,
                                   nullptr /* plTxnReplaced */, false /* bypass_limits */, nAbsurdFee);
    } else {
        ret = ::AcceptToMemoryPool(mempool, state, tx, nullptr /* pfMissingInputs */,
                                   nullptr /* plTxnReplaced */, false /* bypass_limits */, nAbsurdFee);
        // Changes to mempool should also be made to Dandelion stempool
        CValidationState dummyState;
        ret = ::AcceptToMemoryPool(stempool, dummyState, tx, nullptr /* pfMissingInputs */,
                                   nullptr /* plTxnReplaced */, false /* bypass_limits */, nAbsurdFee);
    }
    fInMempool |= ret;
    return ret;
}

Our current v8.22.0 SubmitMemoryPoolAndRelay definitely needs to be rewritten to let us both support and disable dandelion, and to get relays functioning properly. https://github.com/DigiByte-Core/digibyte/blob/feature/8.22.0/src/wallet/wallet.cpp#L1712

bool CWalletTx::SubmitMemoryPoolAndRelay(std::string& err_string, bool relay)
{
    // Can't relay if wallet is not broadcasting
    if (!pwallet->GetBroadcastTransactions()) return false;
    // Don't relay abandoned transactions
    if (isAbandoned()) return false;
    // Don't try to submit coinbase transactions. These would fail anyway but would
    // cause log spam.
    if (IsCoinBase()) return false;
    // Don't try to submit conflicted or confirmed transactions.
    if (GetDepthInMainChain() != 0) return false;

    // Submit transaction to mempool for relay
    pwallet->WalletLogPrintf("Submitting wtx %s to mempool for relay\n", GetHash().ToString());
    // We must set fInMempool here - while it will be re-set to true by the
    // entered-mempool callback, if we did not there would be a race where a
    // user could call sendmoney in a loop and hit spurious out of funds errors
    // because we think that this newly generated transaction's change is
    // unavailable as we're not yet aware that it is in the mempool.
    //
    // Irrespective of the failure reason, un-marking fInMempool
    // out-of-order is incorrect - it should be unmarked when
    // TransactionRemovedFromMempool fires.
    bool ret = pwallet->chain().broadcastTransaction(tx, pwallet->m_default_max_tx_fee, relay, err_string);
    fInMempool |= ret;
    return ret;
}
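For discussion, here is a rough sketch of the direction I think the rework has to take: re-adding a dandelion branch inside SubmitMemoryPoolAndRelay. This assumes the 7.17 helpers (insertDandelionEmbargo, localDandelionDestinationPushInventory) and the stempool get ported into the new architecture first, and that we solve wallet access to connman, so treat it as pseudocode rather than a working patch:

bool CWalletTx::SubmitMemoryPoolAndRelay(std::string& err_string, bool relay)
{
    // Same early-outs as the current upstream code.
    if (!pwallet->GetBroadcastTransactions()) return false;
    if (isAbandoned()) return false;
    if (IsCoinBase()) return false;
    if (GetDepthInMainChain() != 0) return false;

    // Dandelion stem phase: embargo the tx and push it only to our
    // dandelion destination instead of broadcasting to all peers.
    // NOTE: sketch only; connman is not reachable from the wallet in
    // the v22 architecture, which is exactly the hard part.
    if (relay && !gArgs.GetBoolArg("-disabledandelion", DEFAULT_DISABLE_DANDELION)) {
        // ...submit to the stempool (not the public mempool) here...
        int64_t nCurrTime = GetTimeMicros();
        int64_t nEmbargo = 1000000 * DANDELION_EMBARGO_MINIMUM +
                           PoissonNextSend(nCurrTime, DANDELION_EMBARGO_AVG_ADD);
        connman->insertDandelionEmbargo(GetHash(), nEmbargo);
        CInv inv(MSG_DANDELION_TX, GetHash());
        return connman->localDandelionDestinationPushInventory(inv);
    }

    // Non-dandelion (or fluff) path: unchanged upstream behaviour.
    pwallet->WalletLogPrintf("Submitting wtx %s to mempool for relay\n", GetHash().ToString());
    bool ret = pwallet->chain().broadcastTransaction(tx, pwallet->m_default_max_tx_fee, relay, err_string);
    fInMempool |= ret;
    return ret;
}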
JaredTate commented 2 years ago

After diving in more, I have been able to confirm 100% that this is a transaction relay bug. There is an option called "blocksonly" that you can add on startup or in digibyte.conf, which also enables "disablebroadcast", preventing the client from relaying any transactions and relaying only completed blocks. After adding blocksonly=1 to digibyte.conf on all 4 mining algos on DigiHash, the clients have not crashed and have been stable for almost 24 hours now, whereas they usually crashed within an hour. Same with my local client. This confirms it's a tx relay issue.
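For anyone who wants to apply the same stopgap, the conf change is just:

# digibyte.conf - stopgap: relay completed blocks only, never transactions
blocksonly=1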

I have also run a full LLDB frame backtrace, which provided some additional insight.

(lldb) up
frame #1: 0x000000010121d3f6 digibyted`SipHashUint256(k0=17938932625010577665, k1=10144771775799500642, val=0x0000000000000038) at siphash.cpp:97:22
   94   uint64_t SipHashUint256(uint64_t k0, uint64_t k1, const uint256& val)
   95   {
   96       /* Specialized implementation for efficiency */
-> 97       uint64_t d = val.GetUint64(0);
   98   
   99       uint64_t v0 = 0x736f6d6570736575ULL ^ k0;
   100      uint64_t v1 = 0x646f72616e646f6dULL ^ k1;
(lldb) up
frame #2: 0x000000010090a286 digibyted`SaltedTxidHasher::operator(this=0x00000001123d0838, txid=0x0000000000000038)(uint256 const&) const at hasher.h:22:16
   19       SaltedTxidHasher();
   20   
   21       size_t operator()(const uint256& txid) const {
-> 22           return SipHashUint256(k0, k1, txid);
   23       }
   24   };
   25   
(lldb) up
frame #3: 0x0000000100549bd0 digibyted`boost::multi_index::detail::hashed_index_iterator<boost::multi_index::detail::hashed_index_node<boost::multi_index::detail::hashed_index_node<boost::multi_index::detail::ordered_index_node<boost::multi_index::detail::null_augment_policy, boost::multi_index::detail::ordered_index_node<boost::multi_index::detail::null_augment_policy, boost::multi_index::detail::ordered_index_node<boost::multi_index::detail::null_augment_policy, boost::multi_index::detail::index_node_base<CTxMemPoolEntry, std::__1::allocator<CTxMemPoolEntry> > > > > > >, boost::multi_index::detail::bucket_array<std::__1::allocator<CTxMemPoolEntry> >, boost::multi_index::detail::hashed_unique_tag, boost::multi_index::detail::hashed_index_global_iterator_tag> boost::multi_index::detail::hashed_index<mempoolentry_txid, SaltedTxidHasher, std::__1::equal_to<uint256>, boost::multi_index::detail::nth_layer<1, CTxMemPoolEntry, boost::multi_index::indexed_by<boost::multi_index::hashed_unique<mempoolentry_txid, SaltedTxidHasher, mpl_::na, mpl_::na>, boost::multi_index::hashed_unique<boost::multi_index::tag<index_by_wtxid, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, mempoolentry_wtxid, SaltedTxidHasher, mpl_::na>, boost::multi_index::ordered_non_unique<boost::multi_index::tag<descendant_score, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, boost::multi_index::identity<CTxMemPoolEntry>, CompareTxMemPoolEntryByDescendantScore>, boost::multi_index::ordered_non_unique<boost::multi_index::tag<entry_time, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, boost::multi_index::identity<CTxMemPoolEntry>, CompareTxMemPoolEntryByEntryTime>, boost::multi_index::ordered_non_unique<boost::multi_index::tag<ancestor_score, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, boost::multi_index::identity<CTxMemPoolEntry>, CompareTxMemPoolEntryByAncestorFee>, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, std::__1::allocator<CTxMemPoolEntry> >, boost::mpl::vector0<mpl_::na>, boost::multi_index::detail::hashed_unique_tag>::find<uint256, SaltedTxidHasher, std::__1::equal_to<uint256> >(this=0x00000001123d07e0, k=0x0000000000000038, hash=0x00000001123d0838, eq=0x00000001123d0848, (null)=mpl_::false_ @ 0x00007000101571b8) const at hashed_index.hpp:1605:38
   1602     const CompatibleKey& k,
   1603     const CompatibleHash& hash,const CompatiblePred& eq,mpl::false_)const
   1604   {
-> 1605     std::size_t buc=buckets.position(hash(k));
   1606     for(node_impl_pointer x=buckets.at(buc)->prior();
   1607         x!=node_impl_pointer(0);x=node_alg::next_to_inspect(x)){
   1608       if(eq(k,key(index_node_type::from_impl(x)->value()))){
(lldb) up
frame #4: 0x0000000100549b3f digibyted`boost::multi_index::detail::hashed_index_iterator<boost::multi_index::detail::hashed_index_node<boost::multi_index::detail::hashed_index_node<boost::multi_index::detail::ordered_index_node<boost::multi_index::detail::null_augment_policy, boost::multi_index::detail::ordered_index_node<boost::multi_index::detail::null_augment_policy, boost::multi_index::detail::ordered_index_node<boost::multi_index::detail::null_augment_policy, boost::multi_index::detail::index_node_base<CTxMemPoolEntry, std::__1::allocator<CTxMemPoolEntry> > > > > > >, boost::multi_index::detail::bucket_array<std::__1::allocator<CTxMemPoolEntry> >, boost::multi_index::detail::hashed_unique_tag, boost::multi_index::detail::hashed_index_global_iterator_tag> boost::multi_index::detail::hashed_index<mempoolentry_txid, SaltedTxidHasher, std::__1::equal_to<uint256>, boost::multi_index::detail::nth_layer<1, CTxMemPoolEntry, boost::multi_index::indexed_by<boost::multi_index::hashed_unique<mempoolentry_txid, SaltedTxidHasher, mpl_::na, mpl_::na>, boost::multi_index::hashed_unique<boost::multi_index::tag<index_by_wtxid, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, mempoolentry_wtxid, SaltedTxidHasher, mpl_::na>, boost::multi_index::ordered_non_unique<boost::multi_index::tag<descendant_score, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, boost::multi_index::identity<CTxMemPoolEntry>, CompareTxMemPoolEntryByDescendantScore>, boost::multi_index::ordered_non_unique<boost::multi_index::tag<entry_time, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, boost::multi_index::identity<CTxMemPoolEntry>, CompareTxMemPoolEntryByEntryTime>, boost::multi_index::ordered_non_unique<boost::multi_index::tag<ancestor_score, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, boost::multi_index::identity<CTxMemPoolEntry>, CompareTxMemPoolEntryByAncestorFee>, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, std::__1::allocator<CTxMemPoolEntry> >, boost::mpl::vector0<mpl_::na>, boost::multi_index::detail::hashed_unique_tag>::find<uint256, SaltedTxidHasher, std::__1::equal_to<uint256> >(this=0x00000001123d07e0, k=0x0000000000000038, hash=0x00000001123d0838, eq=0x00000001123d0848) const at hashed_index.hpp:541:12
   538      const CompatibleKey& k,
   539      const CompatibleHash& hash,const CompatiblePred& eq)const
   540    {
-> 541      return find(
   542        k,hash,eq,promotes_1st_arg<CompatiblePred,CompatibleKey,key_type>());
   543    }
   544  
(lldb) up
frame #5: 0x0000000100547b11 digibyted`boost::multi_index::detail::hashed_index_iterator<boost::multi_index::detail::hashed_index_node<boost::multi_index::detail::hashed_index_node<boost::multi_index::detail::ordered_index_node<boost::multi_index::detail::null_augment_policy, boost::multi_index::detail::ordered_index_node<boost::multi_index::detail::null_augment_policy, boost::multi_index::detail::ordered_index_node<boost::multi_index::detail::null_augment_policy, boost::multi_index::detail::index_node_base<CTxMemPoolEntry, std::__1::allocator<CTxMemPoolEntry> > > > > > >, boost::multi_index::detail::bucket_array<std::__1::allocator<CTxMemPoolEntry> >, boost::multi_index::detail::hashed_unique_tag, boost::multi_index::detail::hashed_index_global_iterator_tag> boost::multi_index::detail::hashed_index<mempoolentry_txid, SaltedTxidHasher, std::__1::equal_to<uint256>, boost::multi_index::detail::nth_layer<1, CTxMemPoolEntry, boost::multi_index::indexed_by<boost::multi_index::hashed_unique<mempoolentry_txid, SaltedTxidHasher, mpl_::na, mpl_::na>, boost::multi_index::hashed_unique<boost::multi_index::tag<index_by_wtxid, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, mempoolentry_wtxid, SaltedTxidHasher, mpl_::na>, boost::multi_index::ordered_non_unique<boost::multi_index::tag<descendant_score, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, boost::multi_index::identity<CTxMemPoolEntry>, CompareTxMemPoolEntryByDescendantScore>, boost::multi_index::ordered_non_unique<boost::multi_index::tag<entry_time, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, boost::multi_index::identity<CTxMemPoolEntry>, CompareTxMemPoolEntryByEntryTime>, boost::multi_index::ordered_non_unique<boost::multi_index::tag<ancestor_score, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, boost::multi_index::identity<CTxMemPoolEntry>, CompareTxMemPoolEntryByAncestorFee>, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, std::__1::allocator<CTxMemPoolEntry> >, boost::mpl::vector0<mpl_::na>, boost::multi_index::detail::hashed_unique_tag>::find<uint256>(this=0x00000001123d07e0, k=0x0000000000000038) const at hashed_index.hpp:531:12
   528    template<typename CompatibleKey>
   529    iterator find(const CompatibleKey& k)const
   530    {
-> 531      return find(k,hash_,eq_);
   532    }
   533  
   534    template<
(lldb) up
frame #6: 0x000000010084fe37 digibyted`CTxMemPool::GetIter(this=0x00000001123d0730, txid=0x0000000000000038) const at txmempool.cpp:903:21
   900  
   901  std::optional<CTxMemPool::txiter> CTxMemPool::GetIter(const uint256& txid) const
   902  {
-> 903      auto it = mapTx.find(txid);
   904      if (it != mapTx.end()) return it;
   905      return std::nullopt;
   906  }
(lldb) up
frame #7: 0x000000010034ae36 digibyted`(anonymous namespace)::PeerManagerImpl::ProcessGetData(this=0x00000001123d1480, pfrom=0x0000000106022e00, peer=0x00000002c5ffb658, interruptMsgProc=0x0000000105020fc8)::Peer&, std::__1::atomic<bool> const&) at net_processing.cpp:2012:41
   2009             std::vector<uint256> parent_ids_to_add;
   2010             {
   2011                 LOCK(m_mempool.cs);
-> 2012                 auto txiter = m_mempool.GetIter(tx->GetHash());
   2013                 if (txiter) {
   2014                     const CTxMemPoolEntry::Parents& parents = (*txiter)->GetMemPoolParentsConst();
   2015                     parent_ids_to_add.reserve(parents.size());
(lldb) lup
error: 'lup' is not a valid command.
(lldb) up
frame #8: 0x00000001002e9265 digibyted`(anonymous namespace)::PeerManagerImpl::ProcessMessage(this=0x00000001123d1480, pfrom=0x0000000106022e00, msg_type="getdata", vRecv=0x00000002ca3e4910, time_received=(__rep_ = 1635283113509758), interruptMsgProc=0x0000000105020fc8) at net_processing.cpp:3076:13
   3073         {
   3074             LOCK(peer->m_getdata_requests_mutex);
   3075             peer->m_getdata_requests.insert(peer->m_getdata_requests.end(), vInv.begin(), vInv.end());
-> 3076             ProcessGetData(pfrom, *peer, interruptMsgProc);
   3077         }
   3078 
   3079         return;
(lldb) up
frame #9: 0x00000001002f3f90 digibyted`(anonymous namespace)::PeerManagerImpl::ProcessMessages(this=0x00000001123d1480, pfrom=0x0000000106022e00, interruptMsgProc=0x0000000105020fc8) at net_processing.cpp:4232:9
   4229     unsigned int nMessageSize = msg.m_message_size;
   4230 
   4231     try {
-> 4232         ProcessMessage(*pfrom, msg_type, msg.m_recv, msg.m_time, interruptMsgProc);
   4233         if (interruptMsgProc) return false;
   4234         {
   4235             LOCK(peer->m_getdata_requests_mutex);
(lldb) up
frame #10: 0x00000001002f9880 digibyted`non-virtual thunk to (anonymous namespace)::PeerManagerImpl::ProcessMessages(CNode*, std::__1::atomic<bool>&) at net_processing.cpp:0
   1    // Copyright (c) 2009-2010 Satoshi Nakamoto
   2    // Copyright (c) 2009-2020 The Bitcoin Core developers
   3    // Copyright (c) 2014-2020 The DigiByte Core developers
   4    // Distributed under the MIT software license, see the accompanying
   5    // file COPYING or http://www.opensource.org/licenses/mit-license.php.
   6    
   7    #include <net_processing.h>
(lldb) up
frame #11: 0x0000000100228d0d digibyted`CConnman::ThreadMessageHandler(this=0x0000000105020c00) at net.cpp:2574:45
   2571                 continue;
   2572 
   2573             // Receive messages
-> 2574             bool fMoreNodeWork = m_msgproc->ProcessMessages(pnode, flagInterruptMsgProc);
   2575             fMoreWork |= (fMoreNodeWork && !pnode->fPauseSend);
   2576             if (flagInterruptMsgProc)
   2577                 return;
(lldb) up
frame #12: 0x00000001002ba616 digibyted`CConnman::Start(this=0x000070001015bee8)::$_15::operator()() const at net.cpp:2923:80
   2920     }
   2921 
   2922     // Process messages
-> 2923     threadMessageHandler = std::thread(&util::TraceThread, "msghand", [this] { ThreadMessageHandler(); });
   2924 
   2925     if (connOptions.m_i2p_accept_incoming && m_i2p_sam_session.get() != nullptr) {    
   2926         threadI2PAcceptIncoming =
(lldb) up
frame #13: 0x00000001002ba57b digibyted`decltype(__f=0x000070001015bee8)::$_15&>(fp)()) std::__1::__invoke<CConnman::Start(CScheduler&, CConnman::Options const&)::$_15&>(CConnman::Start(CScheduler&, CConnman::Options const&)::$_15&) at type_traits:3747:1
   3744 inline _LIBCPP_INLINE_VISIBILITY
   3745 auto
   3746 __invoke(_Fp&& __f, _Args&& ...__args)
-> 3747 _LIBCPP_INVOKE_RETURN(_VSTD::forward<_Fp>(__f)(_VSTD::forward<_Args>(__args)...))
   3748 
   3749 template <class _Fp, class ..._Args>
   3750 inline _LIBCPP_INLINE_VISIBILITY
(lldb) up
frame #14: 0x00000001002ba4cb digibyted`void std::__1::__invoke_void_return_wrapper<void>::__call<CConnman::Start(__args=0x000070001015bee8)::$_15&>(CConnman::Start(CScheduler&, CConnman::Options const&)::$_15&) at __functional_base:348:9
   345  #ifndef _LIBCPP_CXX03_LANG
   346      template <class ..._Args>
   347      static void __call(_Args&&... __args) {
-> 348          __invoke(_VSTD::forward<_Args>(__args)...);
   349      }
   350  #else
   351      template <class _Fn>
(lldb) up
frame #15: 0x00000001002ba47b digibyted`std::__1::__function::__alloc_func<CConnman::Start(CScheduler&, CConnman::Options const&)::$_15, std::__1::allocator<CConnman::Start(CScheduler&, CConnman::Options const&)::$_15>, void ()>::operator(this=0x000070001015bee8)() at functional:1553:16
   1550     _Rp operator()(_ArgTypes&&... __arg)
   1551     {
   1552         typedef __invoke_void_return_wrapper<_Rp> _Invoker;
-> 1553         return _Invoker::__call(__f_.first(),
   1554                                 _VSTD::forward<_ArgTypes>(__arg)...);
   1555     }
   1556 
(lldb) up
frame #16: 0x00000001002b7f30 digibyted`std::__1::__function::__func<CConnman::Start(CScheduler&, CConnman::Options const&)::$_15, std::__1::allocator<CConnman::Start(CScheduler&, CConnman::Options const&)::$_15>, void ()>::operator(this=0x000070001015bee0)() at functional:1727:12
   1724 _Rp
   1725 __func<_Fp, _Alloc, _Rp(_ArgTypes...)>::operator()(_ArgTypes&& ... __arg)
   1726 {
-> 1727     return __f_(_VSTD::forward<_ArgTypes>(__arg)...);
   1728 }
   1729 
   1730 #ifndef _LIBCPP_NO_RTTI
(lldb) up
frame #17: 0x000000010049afe3 digibyted`std::__1::__function::__value_func<void ()>::operator(this=0x000070001015bee0)() const at functional:1880:16
   1877     {
   1878         if (__f_ == 0)
   1879             __throw_bad_function_call();
-> 1880         return (*__f_)(_VSTD::forward<_ArgTypes>(__args)...);
   1881     }
   1882 
   1883     _LIBCPP_INLINE_VISIBILITY
(lldb) up
frame #18: 0x000000010049af77 digibyted`std::__1::function<void ()>::operator(this=0x000070001015bee0)() const at functional:2555:12
   2552 _Rp
   2553 function<_Rp(_ArgTypes...)>::operator()(_ArgTypes... __arg) const
   2554 {
-> 2555     return __f_(_VSTD::forward<_ArgTypes>(__arg)...);
   2556 }
   2557 
   2558 #ifndef _LIBCPP_NO_RTTI
JaredTate commented 2 years ago

After re-reading the Dandelion BIP a couple other things have stood out to me as well: https://github.com/bitcoin/bips/blob/master/bip-0156.mediawiki

Because the embargo and dandelion relay code that sets an embargo is completely missing from SubmitMemoryPoolAndRelay, the wallet has no idea what to do with stempool transactions during the fluff phase, and the possibility exists that they are never relayed and completely stall out. In other words, a dandelion TX could completely fail without an embargo being added. The embargo code ensures that if a client in the stem suddenly goes offline, all previous clients in the stem will go ahead and broadcast the tx as a normal tx once their embargo timers expire.

So if this is not fixed right, dandelion TXs could completely fail for lack of the embargo code. Quoting BIP 156:

During the stem phase, transactions are relayed along a single path. If any node in this path were to receive the Dandelion transaction and go offline, then the transaction would cease to propagate. To increase robustness, every node that forwards a Dandelion transaction initializes a timer at the time of reception. If the Dandelion transaction does not appear in the memory pool by the time the timer expires, then the transaction enters fluff phase and is forwarded via diffusion.

When a Dandelion transaction arrives, the node MUST set an embargo timer for a random time in the future. If the Dandelion transaction arrives as a typical Bitcoin transaction, the node MUST cancel the timer. If the timer expires before the Dandelion transaction is observed as a typical Bitcoin transaction, then the node MUST fluff the Dandelion transaction.

So the code below from RelayWalletTransaction is not only needed for disabling dandelion, but also to provide the needed embargoes for dandelion transactions. This needs to be reworked into SubmitMemoryPoolAndRelay; however, that is proving more challenging than expected because of the BTC core rework of connman.

                if (!gArgs.GetBoolArg("-disabledandelion", DEFAULT_DISABLE_DANDELION)) {
                    int64_t nCurrTime = GetTimeMicros();
                    int64_t nEmbargo = 1000000*DANDELION_EMBARGO_MINIMUM+PoissonNextSend(nCurrTime, DANDELION_EMBARGO_AVG_ADD);
                    connman->insertDandelionEmbargo(GetHash(),nEmbargo);
                    LogPrint(BCLog::DANDELION, "dandeliontx %s embargoed for %d seconds\n", GetHash().ToString(), (nEmbargo-nCurrTime)/1000000);
                    CInv inv(MSG_DANDELION_TX, GetHash());
                    return connman->localDandelionDestinationPushInventory(inv);
                } else {
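And on the receive side, the embargo bookkeeping the BIP describes boils down to something like the following sketch. The container and helper names here are hypothetical stand-ins for whatever we end up porting from the original dandelion patch:

// Sketch: run periodically from the message-handler thread. If an
// embargoed dandelion tx has not appeared as a normal tx before its
// timer expires, fluff it ourselves (per BIP 156).
void CConnman::checkDandelionEmbargoes()  // hypothetical helper
{
    int64_t nCurrTime = GetTimeMicros();
    for (auto it = mDandelionEmbargo.begin(); it != mDandelionEmbargo.end();) {
        if (txInMempool(it->first)) {             // hypothetical lookup:
            it = mDandelionEmbargo.erase(it);     // seen as normal tx, cancel timer
        } else if (it->second < nCurrTime) {
            CInv inv(MSG_TX, it->first);          // embargo expired: fluff
            ForEachNode([&inv](CNode* pnode) { pnode->PushInventory(inv); });
            it = mDandelionEmbargo.erase(it);
        } else {
            ++it;
        }
    }
}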
barrystyle commented 2 years ago

I've just noticed https://github.com/DigiByte-Core/digibyte/blame/feature/8.22.0/src/net_processing.cpp#L2005, which appears to remove the given tx hash from the mempool; then, immediately afterwards, the code iterates a container searching for that same tx hash (https://github.com/DigiByte-Core/digibyte/blame/feature/8.22.0/src/net_processing.cpp#L2012).

JaredTate commented 2 years ago

Barry, good find. That would explain the bad access error. (Note the txid argument in frame #6 below is 0x0000000000000038, a near-null address, which suggests a dangling reference rather than a real uint256.)

(this=0x00000001123d07e0, k=0x0000000000000038) const at hashed_index.hpp:531:12
   528    template<typename CompatibleKey>
   529    iterator find(const CompatibleKey& k)const
   530    {
-> 531      return find(k,hash_,eq_);
   532    }
   533  
   534    template<
(lldb) up
frame #6: 0x000000010084fe37 digibyted`CTxMemPool::GetIter(this=0x00000001123d0730, txid=0x0000000000000038) const at txmempool.cpp:903:21
   900  
   901  std::optional<CTxMemPool::txiter> CTxMemPool::GetIter(const uint256& txid) const
   902  {
-> 903      auto it = mapTx.find(txid);
   904      if (it != mapTx.end()) return it;
   905      return std::nullopt;
   906  }
(lldb) up
frame #7: 0x000000010034ae36 digibyted`(anonymous namespace)::PeerManagerImpl::ProcessGetData(this=0x00000001123d1480, pfrom=0x0000000106022e00, peer=0x00000002c5ffb658, interruptMsgProc=0x0000000105020fc8)::Peer&, std::__1::atomic<bool> const&) at net_processing.cpp:2012:41
   2009             std::vector<uint256> parent_ids_to_add;
   2010             {
   2011                 LOCK(m_mempool.cs);
-> 2012                 auto txiter = m_mempool.GetIter(tx->GetHash());
   2013                 if (txiter) {
   2014                     const CTxMemPoolEntry::Parents& parents = (*txiter)->GetMemPoolParentsConst();
   2015                     parent_ids_to_add.reserve(parents.size());
(lldb) lup
error: 'lup' is not a valid command.
(lldb) up
frame #8: 0x00000001002e9265 digibyted`(anonymous namespace)::PeerManagerImpl::ProcessMessage(this=0x00000001123d1480, pfrom=0x0000000106022e00, msg_type="getdata", vRecv=0x00000002ca3e4910, time_received=(__rep_ = 1635283113509758), interruptMsgProc=0x0000000105020fc8) at net_processing.cpp:3076:13
   3073         {
   3074             LOCK(peer->m_getdata_requests_mutex);
   3075             peer->m_getdata_requests.insert(peer->m_getdata_requests.end(), vInv.begin(), vInv.end());
-> 3076             ProcessGetData(pfrom, *peer, interruptMsgProc);
   3077         }
   3078 
   3079         return;
(lldb) up
frame #9: 0x00000001002f3f90 digibyted`(anonymous namespace)::PeerManagerImpl::ProcessMessages(this=0x00000001123d1480, pfrom=0x0000000106022e00, interruptMsgProc=0x0000000105020fc8) at net_processing.cpp:4232:9
   4229     unsigned int nMessageSize = msg.m_message_size;
   4230 
   4231     try {
-> 4232         ProcessMessage(*pfrom, msg_type, msg.m_recv, msg.m_time, interruptMsgProc);
   4233         if (interruptMsgProc) return false;
   4234         {
   4235             LOCK(peer->m_getdata_requests_mutex);
(lldb) up

Also, on a side note about my comment above: even with blocksonly enabled in digibyte.conf, people can still add forcerelay to their conf, and the client will still try to relay.

JaredTate commented 2 years ago

I also want to point out that by reverting this commit: https://github.com/DigiByte-Core/digibyte/commit/d351fd0199fdde55e302ffd3667c428252993bee

everything else in 8.22 now seems to be stable and working as far as I can test, except for dandelion. When I merged this PR, I compiled, tested, and sent some txs, but I never let the client run long enough to hit this bug.

So if we are not able to get this sorted in the next week, we could take the dandelion work out and create a new feature branch focused just on 8.22 dandelion, as there are a few things that I believe need to be updated and reworked. Then we carry on testing everything else in 8.22 without dandelion.

There is definitely a reason all the BTC devs abandoned adding dandelion in 0.20, 0.21, and v22. A lot of rework has been done on the mempool and tx relays, so this is quite a dynamic problem to solve.

I am confident we all can get through this and get dandelion working as expected before a full 8.22 release though.

When we get through this, there is a chance we could even structure a PR and submit a working dandelion to BTC core and get some collaboration going with BTC devs.

barrystyle commented 2 years ago

> I also want to point out that by reverting this commit: d351fd0
>
> everything else in 8.22 now seems to be stable and working as far as I can test, except for dandelion. When I merged this PR, I compiled, tested, and sent some txs, but I never let the client run long enough to hit this bug.
>
> So if we are not able to get this sorted in the next week, we could take the dandelion work out and create a new feature branch focused just on 8.22 dandelion, as there are a few things that I believe need to be updated and reworked. Then we carry on testing everything else in 8.22 without dandelion.
>
> There is definitely a reason all the BTC devs abandoned adding dandelion in 0.20, 0.21, and v22. A lot of rework has been done on the mempool and tx relays, so this is quite a dynamic problem to solve.
>
> I am confident we all can get through this and get dandelion working as expected before a full 8.22 release though.
>
> When we get through this, there is a chance we could even structure a PR and submit a working dandelion to BTC core and get some collaboration going with BTC devs.

I feel the mempool code has not been migrated across correctly in the merge: it should strictly be accessed via *node, not referenced as a global. Additionally, quite a few asserts are missing from the main functions in validation.cpp, which in itself causes compilation issues under clang (have they been removed?).

I will submit another PR with the change I've suggested.
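For reference, the shape upstream v22 uses is roughly this: the mempool is owned by NodeContext and handed to call sites explicitly rather than touched as a global (simplified from upstream; not our diff):

// node/context.h (upstream, simplified)
struct NodeContext {
    std::unique_ptr<CTxMemPool> mempool;
    std::unique_ptr<CConnman> connman;
    std::unique_ptr<PeerManager> peerman;
    // ...
};

// call sites then take the pool from the context, e.g.
// CTxMemPool& mempool = *Assert(node.mempool);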

JaredTate commented 2 years ago

All help is much appreciated, Barry. Thanks for taking a crack at this; it is a challenging problem.

There has been a massive amount of mempool & p2p rework done in BTC core since 0.17, as well as changes to other things critical to the original dandelion implementation. (We are effectively jumping from BTC 0.17 to BTC v22... so all BTC changes up to that point have to be dealt with to get a properly working dandelion client.)

Reading through the BTC v22 release notes again, I can see several changes in net_processing that might be the root of some of these issues. https://github.com/bitcoin/bitcoin/blob/master/doc/release-notes/release-notes-22.0.md

This one is a biggie: bitcoin/bitcoin#21162 Net Processing: Move RelayTransaction() into PeerManager (jnewbery)

As is this one: bitcoin/bitcoin#21160 net/net processing: Move tx inventory into net_processing

Others include:

bitcoin/bitcoin#20972 Locks: Annotate CTxMemPool::check to require cs_main (dongcarl)
bitcoin/bitcoin#22415 Make m_mempool optional in CChainState (jamesob)
bitcoin/bitcoin#19771 Replace enum CConnMan::NumConnections with enum class ConnectionDirection (luke-jr)
bitcoin/bitcoin#20162 p2p: declare Announcement::m_state as uint8_t, add getter/setter (jonatack)
bitcoin/bitcoin#21186 net/net processing: Move addr data into net_processing (jnewbery)
bitcoin/bitcoin#21187 Net processing: Only call PushAddress() from net_processing (jnewbery)
bitcoin/bitcoin#22013 ignore block-relay-only peers when skipping DNS seed (ajtowns)
bitcoin/bitcoin#22261 Two small fixes to node broadcast logic (jnewbery)

This one definitely caused some compiling issues in dandelion: bitcoin/bitcoin#21015 Make all of net_processing (and some of net) use std::chrono types (dhruv)

It's apparent that part of dandelion definitely needs to be rewritten and tweaked to function properly with all these BTC changes.

ycagel commented 8 months ago

This issue can be closed per my conversation with @JaredTate. The crash bug has been resolved.

JaredTate commented 8 months ago

Yeah good to close.