Neptune-Crypto / neptune-core

anonymous peer-to-peer cash
Apache License 2.0

dashboard_overview_data() is often quite slow. #244

Open dan-da opened 4 hours ago

dan-da commented 4 hours ago

I have instrumented dashboard_overview_data() to log duration of most of the functions it calls.

Below are warnings from a recent invocation that lasted 4.3 seconds.

setup:

2024-11-16T02:42:45.616949478Z  WARN ThreadId(02) RPC{rpc.trace_id=00 rpc.deadline=2024-11-16T02:42:55.604790658Z otel.kind="server" otel.name="RPC.dashboard_overview_data"}: neptune_core: executed dashboard_overview_data()::cpu_temp_inner() in 0.007878622 secs.  exceeds slow fn threshold of 0.001 secs.  location: /home/danda/neptune-core/src/rpc_server.rs:826:13

2024-11-16T02:42:47.129279367Z  WARN ThreadId(02) RPC{rpc.trace_id=00 rpc.deadline=2024-11-16T02:42:55.604790658Z otel.kind="server" otel.name="RPC.dashboard_overview_data"}: neptune_core: executed dashboard_overview_data()::unconfirmed_balance() in 1.512249878 secs.  exceeds slow fn threshold of 0.001 secs.  location: /home/danda/neptune-core/src/rpc_server.rs:830:13

2024-11-16T02:42:48.548855423Z  WARN ThreadId(02) RPC{rpc.trace_id=00 rpc.deadline=2024-11-16T02:42:55.604790658Z otel.kind="server" otel.name="RPC.dashboard_overview_data"}: neptune_core: executed dashboard_overview_data()::synced_unspent_available_amount() in 1.419075826 secs.  exceeds slow fn threshold of 0.001 secs.  location: /home/danda/neptune-core/src/rpc_server.rs:850:13

2024-11-16T02:42:49.96716567Z  WARN ThreadId(02) RPC{rpc.trace_id=00 rpc.deadline=2024-11-16T02:42:55.604790658Z otel.kind="server" otel.name="RPC.dashboard_overview_data"}: neptune_core: executed dashboard_overview_data()::synced_unspent_timelocked_amount() in 1.418132393 secs.  exceeds slow fn threshold of 0.001 secs.  location: /home/danda/neptune-core/src/rpc_server.rs:854:13

2024-11-16T02:42:49.967598474Z  WARN ThreadId(02) RPC{rpc.trace_id=00 rpc.deadline=2024-11-16T02:42:55.604790658Z otel.kind="server" otel.name="RPC.dashboard_overview_data"}: neptune_core: executed dashboard_overview_data() in 4.361915307 secs.  exceeds slow fn threshold of 0.001 secs.  location: /home/danda/neptune-core/src/rpc_server.rs:799:9
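
For readers who want to reproduce this kind of measurement, here is a minimal sketch of a timing wrapper that yields warnings shaped like the ones above. It is not the instrumentation actually used in neptune-core: the name `time_fn`, its signature, and the choice to return the message instead of logging it are all assumptions made for illustration.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not neptune-core's API): runs `f`, measures its
/// wall-clock duration, and returns a warning message when the duration
/// exceeds `threshold`.
fn time_fn<T>(
    name: &str,
    threshold: Duration,
    f: impl FnOnce() -> T,
) -> (T, Option<String>) {
    let start = Instant::now();
    let result = f();
    let elapsed = start.elapsed();

    // Only produce a message when the call was slower than the threshold.
    let warning = (elapsed > threshold).then(|| {
        format!(
            "executed {name} in {} secs.  exceeds slow fn threshold of {} secs.",
            elapsed.as_secs_f64(),
            threshold.as_secs_f64()
        )
    });

    (result, warning)
}
```

A caller would wrap each inner call, for example `time_fn("unconfirmed_balance()", Duration::from_millis(1), || compute())`, and pass any returned message to its logger. Note that a wrapper like this measures only the wrapped closure, so time spent before it starts (such as lock acquisition done by the caller) is excluded.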

analysis:

  1. cpu_temp_inner() performs blocking IO (reads from filesystem). It could be placed in a spawn_blocking. That would help concurrency, but not RPC response time. Better yet, I think it should be removed from the dashboard overview RPC, and from neptune-core entirely. Since the dashboard and neptune-core are supposed to reside on the same machine, I think the temp stuff should be client-side. For more advanced usage, plenty of remote monitoring solutions exist.
  2. The bulk of the time is spent in unconfirmed_balance(), synced_unspent_available_amount() and synced_unspent_timelocked_amount(), each taking roughly 1.4 to 1.5 seconds. It is concerning that these functions are so slow even when the wallet doesn't hold any UTXOs. The read-lock is already held by the RPC method, so lock acquisition time is not included in these numbers. Digging into these should be fruitful.

Sword-Smith commented 3 hours ago

Regarding point 2, I'm pretty sure the culprit is in Utxo::can_spend_at. Every time we calculate the hash of a program, the program's assembler code is assembled and the resulting list of instructions is hashed. The assembling in particular is expensive. The relevant default method implementations are on the ConsensusProgram trait:

    /// Get the program as a `Program` object rather than as a list of `LabelledInstruction`s.
    fn program(&self) -> Program {
        Program::new(&self.code())
    }

    /// Get the program hash digest.
    fn hash(&self) -> Digest {
        self.program().hash()
    }

But the hash value never changes, as the implementations of ConsensusProgram have no dynamic parameters. So we can cache the hash on a per-implementation basis:

use std::sync::LazyLock;

/// Cache for program hash, for faster claim production.
static PROGRAM_HASH: LazyLock<Digest> =
    LazyLock::new(|| TransactionIsValid.program().hash());

And then override the default implementation of hash (or maybe even get rid of the default entirely) to be:

impl ConsensusProgram for TransactionIsValid {
    fn hash(&self) -> Digest {
        *PROGRAM_HASH
    }
}

Credit for this approach goes to @aszepieniec who implemented it for the now-deleted consensus program src/models/blockchain/block/validity/transaction_is_valid.rs.
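
To make the effect of the caching concrete, here is a self-contained toy sketch of the same pattern. Everything in it is a stand-in and not neptune-core code: `ToyProgram`, `Uncached`, `Cached`, `expensive_hash` and the call counter are hypothetical, and a cheap `u64` fold takes the place of assembling the program and computing a `Digest`. It assumes a toolchain where `std::sync::LazyLock` is stable (Rust 1.80 or later).

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::LazyLock;

/// Counts how often the expensive path runs (demonstration only).
static EXPENSIVE_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for "assemble the program, then hash the instructions".
fn expensive_hash(code: &str) -> u64 {
    EXPENSIVE_CALLS.fetch_add(1, Ordering::SeqCst);
    // FNV-1a style fold as a placeholder for the real hash function.
    code.bytes()
        .fold(0xcbf29ce484222325u64, |h, b| (h ^ b as u64).wrapping_mul(0x100000001b3))
}

/// Hypothetical trait mirroring the shape of the default `hash` method.
trait ToyProgram {
    fn code(&self) -> &'static str;

    /// Default implementation: recomputes the hash on every call.
    fn hash(&self) -> u64 {
        expensive_hash(self.code())
    }
}

/// Relies on the default, so every `hash()` call pays the full cost.
struct Uncached;

impl ToyProgram for Uncached {
    fn code(&self) -> &'static str {
        "push 1 halt"
    }
}

/// Overrides `hash()` to read from a lazily initialized static.
struct Cached;

/// Computed at most once, on first access.
static CACHED_HASH: LazyLock<u64> = LazyLock::new(|| expensive_hash(Cached.code()));

impl ToyProgram for Cached {
    fn code(&self) -> &'static str {
        "push 1 halt"
    }

    fn hash(&self) -> u64 {
        *CACHED_HASH
    }
}
```

With this setup, repeated calls to `Uncached.hash()` increment `EXPENSIVE_CALLS` each time, while any number of calls to `Cached.hash()` increment it at most once. The pattern is only sound under the condition stated above: the program must have no dynamic parameters, since the cached value is shared by every instance of the type.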