Enumerate the data that the indexer service needs to provide

str4d commented 1 month ago

This then defines what data it needs to consume from a full validator (e.g. zebrad).

str4d commented 1 month ago

Two main sources of information about "what the indexing service needs to provide to wallets":

- zcashd's CValidationInterface
- The lightwalletd gRPC service

str4d commented 1 month ago

Ideally the indexer is doing as little work as possible, but it should be doing everything that a full validator does not need to do for consensus purposes.

In the case where the full validator (zebrad), indexer, and scanner are running on the same hardware (but potentially in different processes), then it makes sense for the indexer to expose as "data" references into e.g. zebrad's RocksDB that the scanner can directly read (instead of the indexer parsing that data itself, re-encoding for passing to the scanner, and the scanner then parsing it anyway).

However, zebrad's RocksDB only stores the stable block range, so the indexer access API cannot solely be RocksDB; there has to be an in-memory component. It is not desirable to require the scanner to maintain that in-memory component, because that means every scanner has to sync complete full block data for the non-stable range, and build the same index locally. The purpose of the indexer service is to build this index in one place from purely-public information, and then cache it for efficiently serving to scanners as and when they need it.
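As a rough illustration of the split described above, here is a minimal Rust sketch of a lookup that serves the stable range from an on-disk store and everything above it from an in-memory index. All names here (`FinalizedStore`, `NonFinalizedIndex`, `Indexer`, `MemStore`) are hypothetical and are not zebrad or librustzcash APIs. The sketch also assumes a single best chain, whereas a real non-finalized state has to track forks.

```rust
use std::collections::BTreeMap;

/// Block height (hypothetical alias for this sketch).
type Height = u32;

/// Hypothetical read-only view of the stable block range,
/// e.g. something backed by zebrad's RocksDB.
trait FinalizedStore {
    /// Height of the highest block in the stable range, if any.
    fn tip(&self) -> Option<Height>;
    /// Raw block bytes at `height`, if stored.
    fn block_bytes(&self, height: Height) -> Option<Vec<u8>>;
}

/// In-memory index of the non-stable range, built once by the indexer
/// so that scanners do not each have to rebuild it.
#[derive(Default)]
struct NonFinalizedIndex {
    blocks: BTreeMap<Height, Vec<u8>>,
}

/// Combines both sources behind one lookup.
struct Indexer<S: FinalizedStore> {
    finalized: S,
    non_finalized: NonFinalizedIndex,
}

impl<S: FinalizedStore> Indexer<S> {
    /// Heights at or below the finalized tip come from the store;
    /// anything above it comes from the in-memory index.
    fn block_bytes(&self, height: Height) -> Option<Vec<u8>> {
        match self.finalized.tip() {
            Some(tip) if height <= tip => self.finalized.block_bytes(height),
            _ => self.non_finalized.blocks.get(&height).cloned(),
        }
    }
}

/// Toy in-memory stand-in for the on-disk store, for demonstration only.
struct MemStore(BTreeMap<Height, Vec<u8>>);

impl FinalizedStore for MemStore {
    fn tip(&self) -> Option<Height> {
        self.0.keys().next_back().copied()
    }
    fn block_bytes(&self, height: Height) -> Option<Vec<u8>> {
        self.0.get(&height).cloned()
    }
}
```

In the same-hardware case, `FinalizedStore::block_bytes` could hand back a reference into the database instead of a copy; the sketch copies bytes only to stay short.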

cc @arya2.

pacu commented 1 month ago

Today in LCWG we briefly discussed which indices a Zcash block explorer would need in order to be backed by zebrad.

That's another application we need to take into account.

conradoplg commented 1 month ago

> However, zebrad's RocksDB only stores the stable block range, so the indexer access API cannot solely be RocksDB; there has to be an in-memory component. It is not desirable to require the scanner to maintain that in-memory component, because that means every scanner has to sync complete full block data for the non-stable range, and build the same index locally. The purpose of the indexer service is to build this index in one place from purely-public information, and then cache it for efficiently serving to scanners as and when they need it.

(Mostly writing this down for myself and whoever is interested, after looking it up)

This is basically what the zebra-state crate does: it allows querying things by looking up both finalized and non-finalized state. However, it is also responsible for building and maintaining the state while also indexing it. Thus, while it makes sense to have an independent indexer, I think it would be too much work to split it out from the rest of zebra-state, and it would be best to keep it as is for now.

(Disclaimer: I haven't been working on Zebra for a while and I might be getting some things wrong)

conradoplg commented 1 week ago

I looked a bit into this and listed the data required by the lightwalletd server. Zebra should already have all of them since it works as a lightwalletd backend:

Blocks
- latest (GetLatestBlock)
- by height or block hash (GetBlock, GetBlockRange)
Nullifiers
- by height or block hash (GetBlockNullifiers, GetBlockRangeNullifiers)
Transactions
- by hash (GetTransaction)
- by address and block range (GetTaddressTxids)
Balances
- by addresses (GetTaddressBalance, GetTaddressBalanceStream)
Mempool Transactions
- all, excluding those matching a TXID filter (GetMempoolTx)
Note Commitment Trees
- by block ID (GetTreeState)
- latest (GetLatestTreeState)
Subtree Roots
- by index (GetSubtreeRoots)
UTXOs (address, txid, index, script, value, height)
- by addresses and start height (GetAddressUtxos, GetAddressUtxosStream)
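
To make the list above easier to check against, here is a rough Rust sketch that models each category as a variant of a query enum and maps it back to the lightwalletd RPCs named in the list. The type names, variant names, and field names (`Query`, `BlockId`, `exclude_txids`, and so on) are hypothetical and only illustrate the shape of the query surface; only the RPC names come from the list.

```rust
/// Hypothetical block selector: by height or by block hash.
#[derive(Debug, Clone, PartialEq, Eq)]
enum BlockId {
    Height(u32),
    Hash([u8; 32]),
}

/// One variant per category of data in the list (illustrative names).
#[derive(Debug, Clone, PartialEq, Eq)]
enum Query {
    LatestBlock,
    Block(BlockId),
    BlockNullifiers(BlockId),
    TransactionByHash([u8; 32]),
    TransactionsByAddress { address: String, start: u32, end: u32 },
    Balance { addresses: Vec<String> },
    MempoolTransactions { exclude_txids: Vec<Vec<u8>> },
    TreeState(BlockId),
    LatestTreeState,
    SubtreeRoots { start_index: u32 },
    Utxos { addresses: Vec<String>, start_height: u32 },
}

impl Query {
    /// lightwalletd RPCs that need this kind of data, as given in the list.
    fn lightwalletd_rpcs(&self) -> &'static [&'static str] {
        match self {
            Query::LatestBlock => &["GetLatestBlock"],
            Query::Block(_) => &["GetBlock", "GetBlockRange"],
            Query::BlockNullifiers(_) => {
                &["GetBlockNullifiers", "GetBlockRangeNullifiers"]
            }
            Query::TransactionByHash(_) => &["GetTransaction"],
            Query::TransactionsByAddress { .. } => &["GetTaddressTxids"],
            Query::Balance { .. } => {
                &["GetTaddressBalance", "GetTaddressBalanceStream"]
            }
            Query::MempoolTransactions { .. } => &["GetMempoolTx"],
            Query::TreeState(_) => &["GetTreeState"],
            Query::LatestTreeState => &["GetLatestTreeState"],
            Query::SubtreeRoots { .. } => &["GetSubtreeRoots"],
            Query::Utxos { .. } => &["GetAddressUtxos", "GetAddressUtxosStream"],
        }
    }
}
```

An exhaustive `match` like this means that adding a new category without deciding which RPCs it serves is a compile error, which may help keep the enumeration honest as it grows.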

I started looking into CValidationInterface and I see that it's basically a mechanism to notify wallets of events, so should the indexer also provide some kind of notification service? I see that @arya2 listed them in https://github.com/ZcashFoundation/zebra/issues/8610 so I'll double check that.
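
For discussion, a notification service of this kind could be sketched as a simple publish/subscribe structure over standard-library channels. The event names below (`BlockConnected`, `BlockDisconnected`, `MempoolTxAdded`) are hypothetical placeholders. They are not taken from CValidationInterface or from the linked Zebra issue, and the real set of events would need to come from those sources.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical events an indexer might push to wallets.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ChainEvent {
    BlockConnected { height: u32, hash: [u8; 32] },
    BlockDisconnected { height: u32, hash: [u8; 32] },
    MempoolTxAdded { txid: [u8; 32] },
}

/// Fans each event out to every subscriber.
#[derive(Default)]
struct Notifier {
    subscribers: Vec<Sender<ChainEvent>>,
}

impl Notifier {
    /// Registers a new subscriber and returns its receiving end.
    fn subscribe(&mut self) -> Receiver<ChainEvent> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    /// Sends `event` to all subscribers, dropping any whose
    /// receiver has already gone away.
    fn publish(&mut self, event: ChainEvent) {
        self.subscribers
            .retain(|tx| tx.send(event.clone()).is_ok());
    }
}
```

A wallet or scanner would call `subscribe()` once and then read events from the returned `Receiver`. A real service would more likely expose this as a streaming RPC, but the fan-out logic would be similar.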

zcash / librustzcash

Enumerate the data that the indexer service needs to provide #1395