hyperlane-xyz / hyperlane-monorepo

The home for Hyperlane core contracts, sdk packages, and other infrastructure
https://hyperlane.xyz
Other
316 stars 350 forks source link

Sealevel - settle on a design for indexing dispatched message & IGP payments #2205

Closed tkporter closed 1 year ago

tkporter commented 1 year ago

Figure out how the relayer should index messages & IGP payments. Atm, historical transactions are looked at one-by-one and the logs are looked at. This feels prohibitively expensive

tkporter commented 1 year ago

The current indexing implementation requires processing each slot, looking for blocks in these slots, looking at the transactions in these blocks, and looking at the instructions in those transactions. This would involve a lot of RPCs to sync from the Mailbox’s deployment, which is required by relayers to know which messages to relay & to reconstruct the merkle tree locally.

For what it’s worth, Wormhole indexing (when not using websockets) works exactly the same way (see code here). It also seems to require a specific RPC node setup: see node flags here. When they are using Websockets, they just use the programSubscribe RPC (see code here). From what I can tell, programSubscribe doesn’t seem to offer any guarantees around never missing an event or streaming events from a particular point in the past, so it doesn’t feel suitable for our purposes. Does that sound right? Interestingly, even though they do look at each and every transaction, they still suggest writing messages to an account!

For some context around RPC costs (assuming typical Solana RPCs do provide the features needed), here are some rough costs in terms of Quicknode API credits:

So assuming we still need some kind of polling but don’t want to inspect each and every transaction when indexing, I’m wondering if there may be a nice way to use a mix of PDAs and the getProgramAccounts (docs) RPC. The ideal behavior would be to have each message be written to its own account, and then in the relayer we can just poll for new accounts and read these messages. Because nonces are guaranteed to start at 0 and increment one by one, and because the nonce is always present in an encoded message at byte offset 1 (encoded format ​​here), the relayer logic could be something like: Starting from nonce 0, poll the getProgramAccounts RPC for accounts owned by the Mailbox program ID that have the specific nonce in their data at byte offset 1 (which iiuc can be done efficiently https://solanacookbook.com/guides/get-program-accounts.html)

The problem with this alone is that someone could create whatever account they want with whatever message they want and then assign ownership to the Mailbox contract ID. So there’d need to be a way off-chain to ensure that the Mailbox was the one that created the account, and not just that it’s the owner.

If I understand correctly, this is where PDAs could come into play, where the ID of a PDA is dependent on the program ID that created it, the seed, and the bump. So if the account ID where the message is written to as data can be confirmed off-chain as a PDA created by the Mailbox Program, then all is good.

But then there’s the part of deciding what the seed should be for the PDA. This is my thought process and where I landed as a possibly reasonable approach, which is #4:

  1. It’d feel natural for the seed to be something like message-${nonce}. But if I understand correctly, when the transaction that dispatches the message is created, the PDA where the message will be stored needs to be passed into the list of input accounts. This means that the client before sending their transaction that dispatches would need to commit to the nonce of the message. This opens up race conditions / denial of service between transactions seeking to send messages at the send time – e.g. Alice and Bob could see the next nonce to use is n, they both sign a transaction with the PDA with seed message-n, but Alice’s tx lands first and then Bob’s reverts because message-n has already been created
  2. I had a similar idea as above but with the PDA seed message-id-${message_id} - but this has the same problem because the message ID is dependent on the nonce. Relying on a hash that doesn’t commit to the message’s nonce wouldn’t work because then two messages with the same sender/recipient/body will have the same hash without the nonce adding replay protection
  3. You could try reducing the contention for PDA seeds by having a more granular counter or something, like a per-sender counter, this still has contention over something that can be frontran, resulting in a denial of service
  4. So my final idea is to base the PDA seed off something sorta random and that cannot be frontran by others. If you imagine adding a parameter to the Dispatch instruction like pda_seed_salt: &[u8] that callers are supposed to supply themselves, where the PDA seed would be like message-with-salt–${pda_seed_salt}, this still isn’t sufficient because a malicious person could frontrun the tx with the same pda_seed_salt and cause a denial of service. But what we could do (iiuc) is to have this PDA seed salt be a random account’s public key, where we require this account to have is_signer as true. So transactions that call the Mailbox will need to generate a random new Keypair, and they sign the tx with this Keypair (it’s not the transaction payer or anything). The public key would create the PDA seed, which would be something like message-with-salt-${salt_account.key}. The client would also need to find this PDA and pass it into the list of accounts on the transaction. And I think the pubkey of this random Keypair would need to be written to the data so that the PDA seed can be confirmed off-chain as correct Notable downsides to this is just increased complexity & it requires passing more accounts as inputs into the transaction - which may actually be prohibitive. I also think we’d need something pretty similar for IGP payments that need to get indexed. So it could be really heavyweight Maybe there’s a way to get the same effect of a non-frontrunnable sorta-random seed salt? Like is the transaction ID available in the runtime?

Other ideas

  1. Another idea I had was to leverage the fact that the incremental merkle tree attests to messages. So with increased complexity in the relayer to index messages, there could be a way to differentiate two accounts owned by the Mailbox program ID that claim to have the same nonce. But practically I think this is wayyy too complex to build into our indexing logic
  2. I also considered an idea that I think was thrown around earlier where the mailbox has some PDAs that it tries to “fill up” with data, and once it hits the max PDA size (which I think is pretty low compared to the max account size), it moves onto the next PDA with a seed that has a counter in it that increments. I feel like this also opens up the same contention that could result in denial of service
tkporter commented 1 year ago

We're uncertain about the time complexity of getProgramAccounts where there are tons of accounts. Gonna look into this

Hyperlane-Cryptolutions commented 1 year ago

per last feedback on telegram;

Pagination: Since pagination can't be relied upon due to the need for basic RPCs, it won't be pursued as a potential solution.

Partitioning: The idea of storing message account pubkeys in a nonce-specific PDA seems like a feasible one. Concerning the potential for race conditions, we could mitigate this by introducing some form of sequential processing or queuing mechanism, which ensures that transactions are processed in the order they were received.

Nonce Ranges: I agree, we should store pubkeys of the message accounts to avoid dependency on getProgramAccounts. The method of partitioning could also be combined with nonce ranges, where we create a new PDA after a certain nonce range is filled up. This could help in better organizing messages and ensuring faster queries.

Limiting Account Creation: You're correct in your assumption that someone could create their own data account and assign ownership to the Mailbox’s public key. To prevent this, we could add an authentication mechanism to ensure only authorized entities can create accounts. Alternatively, we could have a whitelist of authorized creators and check against this list before allowing account creation.

Caching: It's good to hear that you agree with this. Efficient caching mechanisms can significantly reduce the load on the network and speed up queries, especially in a polling-based system.

Indexer Service: Indeed, this seems like a more long-term solution. We could start by shifting more complexity on-chain as a temporary measure, and then migrate to an indexer service when the volume of messages increases. This would provide a scalable solution while keeping the initial complexity relatively low.

tkporter commented 1 year ago

I'll consider this done w https://github.com/hyperlane-xyz/hyperlane-monorepo/pull/2273