consensus-shipyard / ipc-actors

Rust implementation of the IPC actors for FVM

Review actors code and adapt to simpler design #18

Closed adlrocha closed 1 year ago

adlrocha commented 1 year ago

The current implementation of the IPC actors follows the legacy (and more complex) design of the protocol. We need to adapt the implementation to @guy-goren's simpler design. My proposal is to do a 3-day/1-week sprint where @cryptoAtwill can walk @guy-goren through the code and @guy-goren can share the changes required to adapt the actors to his simpler design. This way we would have a first version of the actors ready for deployment with the new spec.

The outcome of the sprint should be a set of issues or changes required for our actors to implement the simpler version of the protocol (cross-net actors). This exercise can also serve as a first internal audit of the code.

List of changes required

After a first pass through the code and the simpler design, here is the list of things that we should change or improve.

Single-level cross-net messages allowed

In the simplified model, the IPC client is responsible for the propagation of arbitrary cross-net messages between subnets; actors won't handle this operation on-chain. Consequently, the SendCross method of the IPC gateway will be disabled, and only fund and release will be exposed through the IPC gateway to send cross-net messages. The only low-level primitive exposed by the gateway is to send a cross-net message to the parent of a subnet or to an immediate child.
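
To make the reduced surface concrete, here is a minimal sketch in Rust. The method names fund and release come from this issue; the struct name, signatures, and the fvm_shared types are illustrative assumptions, not the actual actor interface:

```rust
// Hypothetical sketch of the reduced gateway surface (method names follow
// the issue; signatures and types are illustrative, not the real actor API).
use fvm_shared::address::Address;
use fvm_shared::econ::TokenAmount;

pub struct IpcGateway;

impl IpcGateway {
    /// Locks `amount` in the current subnet and emits a top-down message
    /// crediting `to` in an immediate child subnet.
    pub fn fund(&mut self, to: Address, amount: TokenAmount) {
        todo!("lock locally, enqueue a top-down message for the child")
    }

    /// Burns `amount` in the current subnet and emits a bottom-up message
    /// releasing it to `to` in the parent subnet.
    pub fn release(&mut self, to: Address, amount: TokenAmount) {
        todo!("burn locally, enqueue a bottom-up message for the parent")
    }

    // Note: no `send_cross` here. Multi-hop propagation of arbitrary
    // cross-net messages is the IPC client's job in the simplified design.
}
```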

Triggered by externally owned addresses

With this, if a user wants to send a cross-net message from subnet A1 (where it holds some funds) to call an actor at address f01 in subnet B1 through the path A1 -> /root -> B1, this is no longer handled automatically by sendCross of the IPC protocol but by the IPC client. The client decomposes it into a set of messages that fund an account for the user in B1 to perform the call: a release message from A1 to root, a fund message from root to B1, and the actor call within B1. This is possible because the on-chain ID of f1EOA in every subnet can be known through the mapping between users' key pairs and their account actors.

This approach removes the issue of actors having to provide feedback about the execution of a message. For example, if the user doesn't provide enough funds for the message to be executed and it runs out of gas half-way, the IPC client will notify the user and prompt them to fund the address where the message failed, so the funds keep propagating to the target subnet for the actor call.

To illustrate the example graphically, sendCross(from: A1:f1EOA, to: B1:f01, method: X, params: blob, amount: gas_provision) translates into:

1. release(from: A1:f1EOA, to: root:f1EOA, amount: gas_provision) 
2. fund(from: root:f1EOA, to: B1:f1EOA,  amount: gas_provision -  release_fee)
3. send(from: B1:f1EOA, to: B1:f01, method: X, params: blob) 

Thus, release and fund are used exclusively to move funds up and down the hierarchy. The EOA is responsible for paying all the gas for the cross-net propagation and execution.
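
For illustration, a rough client-side sketch of this decomposition. The Hop type, function name, and signatures are hypothetical; only the release/fund/send sequence and the release_fee deduction come from the example above:

```rust
use fvm_shared::address::Address;
use fvm_shared::econ::TokenAmount;

/// One hop of the decomposed cross-net call (hypothetical client-side type).
enum Hop {
    /// Bottom-up transfer towards the parent subnet.
    Release { to: Address, amount: TokenAmount },
    /// Top-down transfer into a child subnet.
    Fund { to: Address, amount: TokenAmount },
    /// Plain message inside the destination subnet.
    Send { to: Address, method: u64, params: Vec<u8> },
}

/// Decomposes sendCross(A1 -> /root -> B1) into single-level primitives.
/// Every hop is signed and submitted by the EOA, which pays its gas.
fn decompose_send_cross(
    eoa: Address,
    dest_actor: Address,
    method: u64,
    params: Vec<u8>,
    gas_provision: TokenAmount,
    release_fee: TokenAmount,
) -> Vec<Hop> {
    vec![
        // 1. A1 -> root: move the gas provision up to the common ancestor.
        Hop::Release { to: eoa, amount: gas_provision.clone() },
        // 2. root -> B1: move what is left after the release fee down.
        Hop::Fund { to: eoa, amount: gas_provision - release_fee },
        // 3. Inside B1: the actual actor call, paid from the funded account.
        Hop::Send { to: dest_actor, method, params },
    ]
}
```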

Triggered by actors

Not only EOAs can trigger cross-net messages; they may also be triggered as a side-effect of an actor execution in a subnet. For the propagation of these cross-net messages, we will include a new data structure in the IPC gateway state, postbox: THamt<Address, TCid<THamt<TCid<Msg>, Msg>>>, and two new methods, wrapped_cross_msg and propagate.
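
A rough sketch of how the extended gateway state and the two new entry points might look. The names postbox, wrapped_cross_msg, and propagate come from this issue; everything else (plain HashMaps instead of HAMTs, the signatures, CrossMsg) is illustrative:

```rust
use std::collections::HashMap;

use cid::Cid;
use fvm_shared::address::Address;
use fvm_shared::econ::TokenAmount;

/// Wrapped cross-net message awaiting propagation (fields elided).
pub struct CrossMsg {
    // from, to, amount, method, params, ...
}

/// Sketch of the extended gateway state. In the actor this would be the
/// HAMT-backed THamt<Address, TCid<THamt<TCid<Msg>, Msg>>> from above;
/// plain maps keep the sketch self-contained.
pub struct GatewayState {
    /// Per owner EOA: the wrapped messages it must pay to propagate,
    /// keyed by their CID.
    pub postbox: HashMap<Address, HashMap<Cid, CrossMsg>>,
}

impl GatewayState {
    /// Called as a side-effect of an actor execution: stores `msg` in the
    /// postbox under `owner`, the EOA responsible for its propagation.
    pub fn wrapped_cross_msg(&mut self, owner: Address, amount: TokenAmount, msg: CrossMsg) -> Cid {
        todo!("compute the message CID and insert it into the owner's postbox slot")
    }

    /// Called by the owner EOA: takes the message with `msg_cid` out of the
    /// postbox and commits it for propagation to the next subnet, charging
    /// the attached `amount` for gas.
    pub fn propagate(&mut self, owner: Address, amount: TokenAmount, msg_cid: Cid) {
        todo!("verify ownership, deduct the gas provision, commit the next hop")
    }
}
```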

Let's illustrate the set of low-level messages involved here with a sample message from A1:f1EOA to actor A1:f01 that triggers a call to wrapped_cross_msg(owner: f1EOA, msg: msg(from: A1:f01, to: B1:f02, method: X, params: blob)) from f01 as a side-effect.

// initial message to the actor that triggers the side-effect.
1. msg(from: A1:f1EOA, to: A1:f01, ...)
// message called by the actor on the ipc-gateway as a side-effect. The wrapped message
// is included in the postbox for further propagation with `f1EOA` as the owner.
   side-effect > wrapped_cross_msg(owner: f1EOA, amount: provision_gas, msg: msg(from: A1:f01, to: B1:f02, amount: A, method: X, params: blob))
// f1EOA sends a message to `propagate` in the IPC gateway of A1 to propagate the wrapped message
// to the root from the postbox.
2. propagate(amount: provision_gas, params: msg_cid)   // A1 subnet
// f1EOA sends a message to `propagate` in the IPC gateway of root to propagate the wrapped message
// down to B1 from the postbox.
3. propagate(amount: provision_gas, params: msg_cid)   // root
// f1EOA triggers the application of the cross-net message.
4. apply_msg(amount: provision_gas, params: msg_cid)  // B1
   // the application of the message triggers as a side-effect the call to the destination actor in the destination subnet.
   side-effect > send(msg(from: A1:f01, to: B1:f02, amount: A, method: X, params: blob))

The EOA that triggers the initial side-effect in the originating actor is responsible for propagating the message to its destination and for paying the gas of routing it. Actors can't sign messages by themselves, and the only way they can trust that the message that triggered the forwarding of a cross-net message has been finalized in the source subnet is if it is propagated through all of the consensus engines of the subnets on its path to the destination.

Note: From an implementation perspective, this means that we are introducing a new stage before automatically committing a message for propagation in commit_top_down and commit_bottom_up, where an EOA needs to send a propagate message to pay for the gas and trigger the propagation of the message to the next subnet.
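
Building on the postbox sketch above, this note translates roughly into a guard like the following (hypothetical helper name and logic):

```rust
impl GatewayState {
    /// Previously, messages were committed for propagation automatically in
    /// commit_top_down / commit_bottom_up. In the new design this step is
    /// only reachable through `propagate`, i.e. after the owner EOA has
    /// paid for the gas of the hop.
    fn commit_for_next_hop(&mut self, owner: Address, msg_cid: Cid) {
        let owned = self
            .postbox
            .get(&owner)
            .map_or(false, |msgs| msgs.contains_key(&msg_cid));
        assert!(owned, "only the owner EOA can trigger propagation");
        // ... remove the message from the postbox and hand it to
        // commit_top_down or commit_bottom_up depending on its direction ...
    }
}
```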

Why does IPC need general information passing? Why not use bridges?

This new scheme introduces the ability for actors in IPC to send general information (and not only tokens) between any two subnets in the hierarchy. Why not build ad-hoc bridges or a network of oracles and relayers that directly forward information from one subnet to another, without having to traverse the whole path between the source and destination subnets?

Each application deployed over IPC may have different security requirements and trust assumptions. If we provided a protocol orthogonal to the one used to move tokens through the hierarchy, we would introduce new security and trust guarantees. There are several ways in which bridges and oracles can be implemented, and if we provided a specific protocol for this we would be setting the baseline for users. By building the ability to pass general information into the mechanics of IPC, we frame the security and trust assumptions of information passing in the same terms as IPC token transfers, even if it introduces some unnecessary overhead. The rationale behind this is the following: "if one is OK with the trust and security guarantees of IPC for transferring tokens through the hierarchy, then one should also be OK with using this approach to send information". Developers are then free to deploy application-specific oracles and relayers on top of our protocol to circumvent the limitations of information passing in IPC, but at least we provide an out-of-the-box mechanism to pass information between subnets.

Finally, there were some proposals to deploy ad-hoc bridges between two subnets whenever information needs to be passed between them, circumventing the hierarchy. We can't predict how actors will interact with the rest of the hierarchy, and deploying these bridges would require 1:1 communication between (potentially) every pair of subnets in the system.

Execution of cross-net messages using user-defined messages

Note: For IPC M2 we are going to stick with executing cross-net messages implicitly (see the discussion below for additional context on why).

In the MVP, the execution of cross-net messages was performed through an ApplyImplicitMessage from the system_actor. This was possible because the IPC gateway was a builtin actor that could get special treatment and all peers shipped with the necessary logic to handle these special messages (the same way that rewards and cron events are handled implicitly). With the IPC gateway as a user-defined actor this is no longer possible.

For the execution of cross-net messages with the IPC gateway as a user-defined actor, we will rely on getting a quorum of validators to accept the execution of the messages. In this new version, apply_msg accepts as parameters a batch of messages to be applied. When validators see that there are unverified cross-net messages to be applied, they perform all the consensus checks to determine that each message is final in the source subnet, and vote for its application by sending a message to apply_msg (which can only be called by validators). The gateway waits for every message to be "voted for execution" (i.e. it is final and the consensus checks have been successful) by a majority of the validators in the subnet. We can then introduce a reward that is distributed to validators from the tokens provisioned by the user to pay for propagation and execution fees.
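
As a sketch of this future explicit scheme, this is roughly the quorum tracking that apply_msg would need. All names and the strict-majority rule are illustrative; validator-set lookup, batching, and reward distribution are omitted, and note the :eyes: caveat below about verifying that the caller actually is a validator:

```rust
use std::collections::{HashMap, HashSet};

use cid::Cid;
use fvm_shared::address::Address;

/// Tracks, per batch of cross-net messages, which validators voted for its
/// execution (hypothetical sketch of the bookkeeping behind apply_msg).
#[derive(Default)]
struct ApplyVotes {
    votes: HashMap<Cid, HashSet<Address>>,
}

impl ApplyVotes {
    /// Records a validator's vote for the batch identified by `batch_cid`.
    /// Returns true once a strict majority of `total_validators` has voted,
    /// at which point the batch can be executed and rewards distributed.
    fn vote(&mut self, batch_cid: Cid, validator: Address, total_validators: usize) -> bool {
        let voters = self.votes.entry(batch_cid).or_default();
        voters.insert(validator);
        voters.len() * 2 > total_validators
    }
}
```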

Illustrating it step-by-step, the execution of cross-net messages works as follows:

Before (IPC gateway as a builtin actor):
1. Cross-net message pool (unverified).
2. Cross-net messages are proposed in a block.
3. To accept the messages as part of the block, validators perform cross-net consensus checks.
4. Cross-net messages within a block are executed implicitly (the same way block reward messages and cron events are executed).

After (IPC gateway as a user-defined actor):
1. Cross-net message pool (unverified).
2. Each validator performs cross-net consensus checks.
3. Each validator calls apply_msg.
4. When consensus is reached, the message is executed and rewards are distributed.

:eyes: Even if each validator calls apply_msg, the IPC gateway doesn't have a way of checking that the caller is a validator (this information lives in the parent). This is why we went with the implicit execution of messages in the first place (that way validators can perform the consensus checks before the execution). This is an issue we will need to tackle if we end up not being allowed to execute messages implicitly. That is not the case for IPC M2, so we will release M2 with implicit execution and revisit this in the future if needed.

Gas fees for checkpoints and cross-net primitives

Something that we didn't address as part of the MVP is who pays for the gas of checkpoints and of the propagation of cross-net messages. With the new model it is easier to reason about who has to pay for what, and about the cost structure of the different primitives.

cryptoAtwill commented 1 year ago

Hi @adlrocha @guy-goren, just two quick questions to clarify the terminology.

adlrocha commented 1 year ago

> By the term IPC client, I assume it's code other than the gateway and subnet-actor? Also, who is maintaining the IPC client, and how do we know the IPC client will behave correctly?

The IPC client is the peer implementation, i.e. the off-chain code that orchestrates all the interaction with the actors and with other peers of the network (let me know if it's still not clear). The architecture in this issue should make it a bit clearer.

> What's EOA?

Externally Owned Address. I should have stated this explicitly, sorry.

adlrocha commented 1 year ago

Closed by #30