dogecoin / dogecoin

very currency
MIT License

[feat] Allow rescan on pruned nodes #3518

Open chromatic opened 1 month ago

chromatic commented 1 month ago

Currently, rescan fails with an error on pruned nodes. It should be possible to rescan a pruned node if an optional height is provided and that height falls within the range of blocks the pruned node still has on disk.

victorsk2019 commented 4 weeks ago

Hi,

This looks like a very useful enhancement for pruned blockchains. I've analysed the requirements for a bit and, as a test, added a preliminary feature that forces a rescan on a pruned blockchain and makes pindexRescan get its value from the FindForkInGlobalIndex() function, as other parts of wallet.cpp do, like so:

On line 37894 of wallet.cpp:

else if (fPruneMode && GetBoolArg("-rescan", false))
{
    // Removed (struck through):
    // CBlockLocator locator;
    // pindexRescan = FindForkInGlobalIndex(chainActive, locator);
    chainActive.SetTip(pindexRescan->GetAncestor(5050000));
    pindexRescan = chainActive.Tip();
    // Here, 5050000 is a hard-coded test height standing in for the optional
    // height passed as a command-line argument, but one block under because
    // it's an ancestor?
}

Then, after further analysis, I found additional conditions related to the wallet's synchronization timestamp (whether it was synchronized before the new pruning was applied). This makes the issue more complex than I originally assumed, so it's probably better suited to someone who knows all the intricacies of this implementation inside out, like @patricklodder? Thanks!

victorsk2019 commented 3 weeks ago

Hi,

I did some more work on this issue and came up with a draft solution. For the basic functionality, as per the requirement to work with an optional height, rescanning worked on a pruned node, with simple error checking. I tested on a 2.2GB pruned node:

if (GetBoolArg("-rescan", false) && !fPruneMode) //note I added condition to exclude pruned mode
        pindexRescan = chainActive.Genesis();

else if (fPruneMode && GetBoolArg("-rescan", false))
{
        //set temp_chain to work with pruned blockchain
        CChain temp_chain = chainActive;
        uint64_t height = std::max<int64_t>(0, GetArg("-height", 0));

        if (height == 0)
        {
            InitError(_("Prune: wallet synchronisation goes beyond pruned data. You need to -reindex (download the whole blockchain again in case of pruned node)"));
            return NULL;
        }

        //set new tip for pruned blockchain based on given height
        //and assign it to pindexRescan
        temp_chain.SetTip(pindexRescan->GetAncestor(height));
        pindexRescan = temp_chain.Tip();
        LogPrintf("Pruned height argument: %i \n", height);
        LogPrintf("Pruned pindexRescan height: %i \n", pindexRescan->nHeight);
        LogPrintf("temp_chain height: %i \n", temp_chain.Tip()->nHeight);
}

Is this something that makes sense and can be submitted as a PR? Thanks.

Of course, this also requires disabling the following condition and error message:

/*
if (GetArg("-prune", 0) && GetBoolArg("-rescan", false))
    return InitError(_("Rescans are not possible in pruned mode. You will need to use -reindex which will download the whole blockchain again."));
*/

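The draft above stores the -height value in a uint64_t and passes it straight to GetAncestor(), which takes an int and returns NULL for a height above the block it is called on. A minimal sketch of validating the argument first might look like the following. ValidateRescanHeight and RescanHeightResult are hypothetical names introduced for illustration, `raw` stands in for the result of GetArg("-height", 0), and `tipHeight` stands in for chainActive.Height().

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Outcome of validating a user-supplied rescan height.
// Hypothetical helper type, not part of the wallet code.
struct RescanHeightResult {
    bool ok;            // true if `height` is usable
    int height;         // validated height, meaningful only when ok
    std::string error;  // human-readable reason when !ok
};

// `raw` stands in for GetArg("-height", 0);
// `tipHeight` stands in for chainActive.Height().
RescanHeightResult ValidateRescanHeight(int64_t raw, int tipHeight)
{
    assert(tipHeight >= 0);
    if (raw <= 0)
        return {false, 0, "a positive -height is required"};
    if (raw > tipHeight)
        return {false, 0, "-height is above the current chain tip"};
    // Safe to narrow: 0 < raw <= tipHeight, and tipHeight is an int.
    return {true, static_cast<int>(raw), ""};
}
```

This only checks the argument against the tip height. Whether the block data at that height is still on disk is a separate question.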
chromatic commented 3 weeks ago

Interesting approach! What happens if the ancestor is not accessible? For example, if you keep the last 1000 blocks and you try to rescan from a block older than is on disk?

I had trouble implementing this myself, which is why I ask and why I haven't submitted my own PR.
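One possible way to handle the inaccessible-ancestor case, sketched under assumptions: walk back from the tip toward the requested height and give up as soon as a block without data is encountered. BlockIndexSketch is a hypothetical stand-in for CBlockIndex, with `haveData` replacing the `(nStatus & BLOCK_HAVE_DATA) && nTx > 0` test, and FindRescanStart and MakeToyChain are illustrative names, not existing functions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for CBlockIndex, reduced to the fields used here.
struct BlockIndexSketch {
    int nHeight;
    bool haveData;              // stands in for (nStatus & BLOCK_HAVE_DATA) && nTx > 0
    BlockIndexSketch* pprev;
};

// Builds a linked toy chain with heights 0..length-1 in which only blocks at
// or above firstHeightWithData still have their data, as on a pruned node.
std::vector<BlockIndexSketch> MakeToyChain(int length, int firstHeightWithData)
{
    assert(length > 0);
    std::vector<BlockIndexSketch> chain(length);
    for (int h = 0; h < length; ++h) {
        chain[h].nHeight = h;
        chain[h].haveData = (h >= firstHeightWithData);
        chain[h].pprev = (h > 0) ? &chain[h - 1] : nullptr;
    }
    return chain;
}

// Walks back from the tip to `height`. Returns the block at that height if it
// and every block above it still have data, otherwise nullptr.
BlockIndexSketch* FindRescanStart(BlockIndexSketch* tip, int height)
{
    if (!tip || height < 0 || height > tip->nHeight)
        return nullptr;                 // no such ancestor
    BlockIndexSketch* block = tip;
    while (block) {
        if (!block->haveData)
            return nullptr;             // hit pruned data before reaching height
        if (block->nHeight == height)
            return block;
        block = block->pprev;
    }
    return nullptr;
}
```

With a 2000-block toy chain that keeps only the last 1000 blocks, `FindRescanStart(&chain.back(), 1500)` returns the block at height 1500, while a request for height 999 returns nullptr. A caller could turn that nullptr into an InitError instead of dereferencing it.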

victorsk2019 commented 3 weeks ago

> What happens if the ancestor is not accessible? For example, if you keep the last 1000 blocks and you try to rescan from a block older than is on disk?

If I understood the problem correctly, in this case we get an error: Prune: last wallet synchronisation goes beyond pruned data. You need to -reindex (download the whole blockchain again in case of pruned node)

I think that, due to the nature of a pruned blockchain, it won't be possible to access wallet transactions older than those stored in the blocks available on disk (the older blocks having been pruned), so the -height argument helps to select blocks that are both relevant to the wallet and on disk.

When I ran a rescan with my implementation (on a blockchain that I am also still synchronizing, because I fell behind on that):

./dogecoind -prune=2201 -height=5070000 -rescan &

I get the following in the logs:

2024-04-21 17:30:04  wallet                   97ms
2024-04-21 17:30:04 Pruned height argument: 5070000
2024-04-21 17:30:04 Pruned pindexRescan height: 5070000
2024-04-21 17:30:04 temp_chain height: 5070000
2024-04-21 17:30:04 init message: Rescanning...
2024-04-21 17:30:04 Rescanning last 1328 blocks (from block 5070000)...
2024-04-21 17:30:17  rescan                12338ms
2024-04-21 17:30:17 setKeyPool.size() = 1998
2024-04-21 17:30:17 mapWallet.size() = 7
2024-04-21 17:30:17 mapAddressBook.size() = 9
2024-04-21 17:30:17 Unsetting NODE_NETWORK on prune mode
2024-04-21 17:30:17 init message: Pruning blockstore...
2024-04-21 17:30:17 mapBlockIndex.size() = 5180635
2024-04-21 17:30:17 nBestHeight = 5071328
2024-04-21 17:30:17 init message: Loading addresses...
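
The numbers in this log are consistent with the "Rescanning last N blocks" count being the tip height minus the starting height: nBestHeight 5071328 minus 5070000 gives 1328. A small sketch of that arithmetic, plus the implied throughput, follows. BlocksToRescan and BlocksPerSecond are hypothetical helper names used only for this illustration.

```cpp
#include <cassert>

// Number of blocks a rescan covers, assuming (as the log suggests) that the
// count is the tip height minus the starting height.
int BlocksToRescan(int tipHeight, int startHeight)
{
    assert(startHeight >= 0 && startHeight <= tipHeight);
    return tipHeight - startHeight;
}

// Average rescan throughput in blocks per second for a duration given in ms.
double BlocksPerSecond(int blocks, int elapsedMs)
{
    assert(elapsedMs > 0);
    return blocks * 1000.0 / elapsedMs;
}
```

For the run above, `BlocksToRescan(5071328, 5070000)` yields 1328, and 1328 blocks in 12338 ms works out to a little over 100 blocks per second.
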
victorsk2019 commented 3 weeks ago

Hi,

May I propose a minor variant on my earlier approach? Rather than requiring the user to specify the height from which to begin the wallet rescan, if the user wants a rescan but doesn't provide a height argument, we can "rewind" the block index of the pruned blockchain to the earliest possible starting index and use it as the starting "tip" for rescanning the wallet. The advantage of this approach, I think, is that it may prevent the "goes beyond pruned data" error, because the starting block is chosen from the blocks that are still available.

Here are the changes I am proposing. In wallet.h, add a public static member function:

static CBlockIndex* GetBlockIndex();

In wallet.cpp:

else if (fPruneMode && GetBoolArg("-rescan", false))
{
        //set temp_chain to work with pruned blockchain
        CChain temp_chain = chainActive;
        uint64_t height = std::max<int64_t>(0, GetArg("-height", 0));

        if (height == 0)
        {
            //InitError(_("Prune: wallet synchronisation goes beyond pruned data. You need to -reindex (download the whole blockchain again in case of pruned node)"));
            //return NULL;
            LogPrintf("Pruned block index for starting rescan: %i \n", CWallet::GetBlockIndex()->nHeight);
            temp_chain.SetTip(pindexRescan->GetAncestor(CWallet::GetBlockIndex()->nHeight));
            pindexRescan = temp_chain.Tip();
            LogPrintf("Pruned height argument: %i \n", height);
            LogPrintf("Pruned pindexRescan height: %i \n", pindexRescan->nHeight);
            LogPrintf("temp_chain height: %i \n", temp_chain.Tip()->nHeight);
        }

        else
        {
            //set new tip for pruned blockchain based on given height
            //and assign it to pindexRescan
            temp_chain.SetTip(pindexRescan->GetAncestor(height));
            pindexRescan = temp_chain.Tip();
            LogPrintf("Pruned height argument: %i \n", height);
            LogPrintf("Pruned pindexRescan height: %i \n", pindexRescan->nHeight);
            LogPrintf("temp_chain height: %i \n", temp_chain.Tip()->nHeight);
        }
}
...

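CBlockIndex* CWallet::GetBlockIndex()
{
    // Walk back from the active tip for as long as the previous block still
    // has its data on disk; the result is the earliest block reachable this way.
    CBlockIndex *block = chainActive.Tip();

    while (block && block->pprev && (block->pprev->nStatus & BLOCK_HAVE_DATA) && block->pprev->nTx > 0)
        block = block->pprev;

    return block;
}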
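
To see what this walk returns without a running node, here is a self-contained sketch that runs the same loop against a toy chain. ToyBlockIndex, kHaveDataSketch, EarliestBlockWithData and MakePrunedChain are all hypothetical stand-ins introduced for the illustration. The real code reads CBlockIndex and BLOCK_HAVE_DATA from the node, and the tip is passed in here instead of being taken from chainActive.

```cpp
#include <cassert>
#include <vector>

// Stand-in for the BLOCK_HAVE_DATA status bit; the value is arbitrary here.
const unsigned int kHaveDataSketch = 1u << 3;

// Hypothetical reduction of CBlockIndex to the fields the walk reads.
struct ToyBlockIndex {
    int nHeight;
    unsigned int nStatus;
    unsigned int nTx;
    ToyBlockIndex* pprev;
};

// Same loop as GetBlockIndex(), but taking the tip as a parameter so it can
// run against a toy chain.
ToyBlockIndex* EarliestBlockWithData(ToyBlockIndex* tip)
{
    ToyBlockIndex* block = tip;
    while (block && block->pprev && (block->pprev->nStatus & kHaveDataSketch) && block->pprev->nTx > 0)
        block = block->pprev;
    return block;
}

// Builds heights 0..length-1. Blocks below firstKept have the data bit
// cleared; every block is treated as fully received (nTx > 0).
std::vector<ToyBlockIndex> MakePrunedChain(int length, int firstKept)
{
    assert(length > 0);
    std::vector<ToyBlockIndex> chain(length);
    for (int h = 0; h < length; ++h) {
        chain[h].nHeight = h;
        chain[h].nStatus = (h >= firstKept) ? kHaveDataSketch : 0u;
        chain[h].nTx = 1u;
        chain[h].pprev = (h > 0) ? &chain[h - 1] : nullptr;
    }
    return chain;
}
```

On a 100-block toy chain pruned below height 40, the walk stops at height 40. Note that the loop only inspects `block->pprev`, so the tip itself is returned unchanged even if its own data bit were cleared; a caller that cares about that case would need a separate check.
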

I ran the modified code with the following command-line arguments and got the following printout:

./dogecoind -prune=2201 -rescan &

2024-04-24 19:11:11 Pruned block index for starting rescan: 4840261 
2024-04-24 19:11:11 Pruned height argument: 0
2024-04-24 19:11:11 Pruned pindexRescan height: 4840261 
2024-04-24 19:11:11 temp_chain height: 4840261 
2024-04-24 19:11:11 init message: Rescanning...
2024-04-24 19:11:11 Rescanning last 64949 blocks (from block 4840261)...
2024-04-24 19:12:11 Still rescanning. At block 4891360. Progress=0.653754
2024-04-24 19:12:21  rescan                70526ms
2024-04-24 19:12:21 setKeyPool.size() = 1998
2024-04-24 19:12:21 mapWallet.size() = 7
2024-04-24 19:12:21 mapAddressBook.size() = 9
2024-04-24 19:12:21 Unsetting NODE_NETWORK on prune mode
2024-04-24 19:12:21 init message: Pruning blockstore...
2024-04-24 19:12:21 mapBlockIndex.size() = 5184841
2024-04-24 19:12:21 nBestHeight = 4905210
2024-04-24 19:12:21 init message: Loading addresses...

Note: here, I have not yet fully synchronized the pruned blockchain, which I had to restore from backup. Thanks.