UkoeHB / monero

Monero: the secure, private, untraceable cryptocurrency
https://getmonero.org
Other
7 stars 4 forks source link

Minimizing round trips to the daemon on wallet load #41

Open j-berman opened 5 months ago

j-berman commented 5 months ago

Continuing from point 2 in this comment in the async scanner: https://github.com/UkoeHB/monero/pull/23#issuecomment-2036086371

Ideally wallets minimize round trips to the daemon when loading. This was the idea behind returning pool txs in the /getblocks.bin endpoint: https://github.com/monero-project/monero/pull/8076

The Seraphis lib scanner currently does 1) scan chain -> 2) scan pool -> 3) scan chain in a scan pass, which would require a minimum of 3 round trips to the daemon. @UkoeHB mentioned:

I intentionally added a follow-up pass because there is a race condition between chain scanning and pool scanning that can cause mined blocks to contain txs invisible to a wallet even after it claims to have scanned the pool.

Proposed solution to this problem: ensure that when the /getblocks.bin endpoint returns pool txs, those pool txs correspond to the state of the pool at the top_block_hash also returned by the endpoint (to be clear, this would be a change to the daemon RPC side). This way the scanner can be certain that once it has scanned up to chain tip, it knows the state of the pool at that time.

j-berman commented 5 months ago

(obviously the state of the pool isn't fixed to a block hash, but since txs can't be both in the pool and the chain, then this should work)

j-berman commented 5 months ago

Alternative that's probably simpler: scan pool first then chain. If any of the user's pool txs are identified in the chain, it can be inferred it's no longer in the pool. This is basically the same as wallet2 today.

UkoeHB commented 4 months ago

Alternative that's probably simpler: scan pool first then chain. If any of the user's pool txs are identified in the chain, it can be inferred it's no longer in the pool. This is basically the same as wallet2 today.

This doesn't work for long scan jobs because the pool view will be quite stale by the end.

j-berman commented 4 months ago

Seems acceptable because the pool just gets scanned on the next refresh loop

UkoeHB commented 4 months ago

I'd rather the scanner user control the features it uses, than for the scanner itself to be weakened. Can you accomplish the desired optimization with a config, or by simply ignoring the initial or follow-up scan?

j-berman commented 4 months ago

The primary function in question:

bool refresh_enote_store(const scanning::ScanMachineConfig &scan_machine_config,
    scanning::ScanContextNonLedger &nonledger_scan_context_inout,
    scanning::ScanContextLedger &ledger_scan_context_inout,
    scanning::ChunkConsumer &chunk_consumer_inout)
{
    // 1. perform a full scan
    if (!refresh_enote_store_ledger(scan_machine_config, ledger_scan_context_inout, chunk_consumer_inout))
        return false;

    // 2. perform an unconfirmed scan
    if (!refresh_enote_store_nonledger(SpEnoteOriginStatus::UNCONFIRMED,
            SpEnoteSpentStatus::SPENT_UNCONFIRMED,
            nonledger_scan_context_inout,
            chunk_consumer_inout))
        return false;

    // 3. perform a follow-up full scan
    // rationale:
    // - blocks may have been added between the initial on-chain pass and the unconfirmed pass, and those blocks may
    //   contain txs not seen by the unconfirmed pass (i.e. sneaky txs)
    // - we want scan results to be chronologically contiguous (it is better for the unconfirmed scan results to be stale
    //   than the on-chain scan results)
    if (!refresh_enote_store_ledger(scan_machine_config, ledger_scan_context_inout, chunk_consumer_inout))
        return false;

    return true;
}

How I'm thinking about options also based on some discussions we've had in the past:

Route 1 (scan chain then pool)

Pros

Cons

Route 2 (scan pool then chain)

Pros

Cons

Bonus Route 1a or Route 2b

Chunk chunk = scan_context.get_next_chunk()

if chunk == nonledger chunk
    handle nonledger chunk
else if chunk == ledger chunk
    handle ledger chunk (ensure contiguity to prior ledger chunk and consume)
else
    must be the terminal chunk, reached scanner end state

Edited to add clearer terminal chunk handling.

j-berman commented 4 months ago

Route 2 is probably simplest to implement imo

UkoeHB commented 4 months ago

Ok after reviewing the seraphis_lib implementation I remember why I wrote it like this. You don't need to use refresh_enote_store. Notice that it's in the file scan_process_basic.h/.cpp. The idea is you can write any high-level function you want to orchestrate scanning using the scan-machine API. So I'd just do option 2 with a new function.