zeta-chain / node

ZetaChain’s blockchain node and an observer validator client
https://zetachain.com
MIT License
169 stars 108 forks source link

separation of voting and execution of external observation #1515

Open brewmaster012 opened 10 months ago

brewmaster012 commented 10 months ago

Currently, the voting msgs of external observations from observers are tied up with execution. The vote msg that happens to be at the threshold of 2/3 happen to execute the observation and triggers next steps in the cross-chain transaction.

This model has the following short term and longer term problems:

  1. Short term: Each voting msg must carry a gas limit that can execute the observation, even though only 1 such msg may actually execute the observation. This is a problem because executing the observation can cost up to 4.5M gas and this limits the number of observation throughput severely.

Separating the voting and execution will have the following additional benefits

  1. Potentially simplifying observability in piecing together the chain of sub-transactions in a cross-chain transaction.
  2. Make it easier to do observer accounting and incentives;
  3. Paves way to transition/augmenting observation with merkle proof

Potential problems of separating voting and execution:

  1. Changing some of the core logic in the cross-chain transaction and also a bit of zetaclient

Impact: This change can be made with minimum impact on users, dApps, node operators. It will have some minor impact on some of node 1317 RPC, and explorer and analytics/monitoring services such as the auditor service.

lumtis commented 10 months ago

Does it have a further scope compared to https://github.com/zeta-chain/node/pull/1511?

brewmaster012 commented 10 months ago

Does it have a further scope compared to #1511?

Seems the same. The only thing that I see right now is that we'd probably want this in v12. The limitation on observation throughput might be severe.

lumtis commented 10 months ago

Does it have a further scope compared to #1511?

Seems the same. The only thing that I see right now is that we'd probably want this in v12. The limitation on observation throughput might be severe.

I added the Epic in v12, this is why the low level tasks were removed https://github.com/zeta-chain/node/issues/1512

The limitation on observation throughput might be severe.

Not sure to get this, the refactoring in the PR doesn't improve the throughput

brewmaster012 commented 10 months ago

Does it have a further scope compared to #1511?

Seems the same. The only thing that I see right now is that we'd probably want this in v12. The limitation on observation throughput might be severe.

I added the Epic in v12, this is why the low level tasks were removed #1512

The limitation on observation throughput might be severe.

Not sure to get this, the refactoring in the PR doesn't improve the throughput

The idea is that once the separation is done, then observation txs (+ execution msg) consume less gasLimit. For example, currently observation inTx has gasLimit 4.5M, block gas limit is 500M then we are limited by around 100 observation txs, which translates into around 10 observation executions (assuming threshold is 10 votes).

After separation, then observation txs gas limit can be reduced to let's say 300K, execution tx gas limit still 4.5M (but we just need one per exeuction). Now we can support around 100 executions instead of 10.

lumtis commented 10 months ago

The change should not be complex on the protocol, however it looks non-trivial on ZetaClient. We currently parse the events from block a -> b and send vote for each of those and we don't have post-processing for the vote tx, should consider it has been successful. Created this issue regarding this: https://github.com/zeta-chain/node/issues/1520

If we include a separate message for execution we must change this logic in the sense we need to consider voted ballot but not yet executed, also what would be the observer responsible for execution? It might not be optimized if each observer send the exec message once ballot finalized.

One solution could be to consider a new option: execute bool in the message, excluded from digest, if it is false, then the message will fail if ballot is finalized during this vote. So all observer send the message with execution==false and low gas limit originally, one of them will fail because the ballot should be finalized, this observer will send again the message with execution==true and high gas limit for the actual execution of the cctx.

lumtis commented 10 months ago

Actually I'm thinking that we don't even need the execute option. We just specify the 300K gas limit and if the tx fails because of gas exceeded, then we resend the tx with 4.5M gas limit has it would imply this is the vote that finalize the ballot

kingpinXD commented 10 months ago

Alternatively , we can use begin block to auto-execute completed transactions. From the implementation perspective we can create a new index for cctx indexes which are ready to be executed and clear the list once execution is done.

lumtis commented 10 months ago

Alternatively , we can use begin block to auto-execute completed transactions. From the implementation perspective we can create a new index for cctx indexes which are ready to be executed and clear the list once execution is done.

It looks an interest solution we could explore. There some risks we need to consider about having logic that can be triggered by external party into the end blocker.

kingpinXD commented 10 months ago

Most ballot-based transactions might benefit from having a separation in voting and execution. It might make sense to implement this more broadly so that it can be used for all ballot-based transactions and not just CCTXs. It would also reduce the zeta-client gas usage a fair bit for things like gas_price_voter.

Implementation-wise, I am thinking of the voter functions like inbound, outbound, and gas_price, only voting and providing the inputs for the various required structs, cctx and gasPrice in this case, and having a separate execution function triggered at every block, which checks the finalized ballots, and calls the execution function using the ballot type. Implementation can be fleshed out more, as the call to the execution function would need different inputs based on the type of ballot it needs to finalize.