Introduce a wasm-module-level actor constructor function

Stebalien commented 2 years ago

Currently, we "construct" actors by invoking method 1. However:

This makes it difficult for the system to pass parameters to the constructor (in addition to the constructor parameters). For example, we'll likely want to pass in the actor ID of the original caller.
We'd like to give users more flexibility in choosing their calling convention (possibly even removing method numbers). By treating the constructor as a "normal" method, we lock in the current calling convention.
Anyone can invoke the constructor, so actors need to check that it's actually being invoked by the init actor.

Proposal: Introduce a top-level (like invoke) create function that takes the calling actor ID and the construction parameters.

raulk commented 2 years ago

Sounds sensible.

We've also discussed introducing entrypoints for upgrade and migrate, although those are unclear yet.
Constructor = method_num=1 is part of a specific calling convention (Filecoin method number convention). We want constructors to be agnotic of the calling convention, as you noted in the description.

Anyone can invoke the constructor, so actors need to check that it's actually being invoked by the init actor.

Not sure if that's the case. The system routes actor messages to the invoke entrypoint of the actor. The construct entrypoint would not be invokable by anybody, just by the system. We'd need to decide how exactly we expose the construct entrypoint.

Why the departure from the "IPLD block for parameters" interface? The constructor can still call msg_caller() and fetch the parameters from the IPLD block.

raulk commented 2 years ago

Thinking out loud.

The create_actor syscall can't invoke the construct entrypoint directly, because there wouldn't be a message involved, so they'd be no way for the caller to limit gas or to send a value to the constructed actor.

Stebalien commented 2 years ago

Not sure if that's the case. The system routes actor messages to the invoke entrypoint of the actor. The construct entrypoint would not be invokable by anybody, just by the system. We'd need to decide how exactly we expose the construct entrypoint.

Ah, sorry, I mean currently. See https://github.com/filecoin-project/builtin-actors/blob/09f0ca19ac6c2b981ba183e97e622c5aaffba301/actors/miner/src/lib.rs#L132.

Why the departure from the "IPLD block for parameters" interface? The constructor can still call msg_caller() and fetch the parameters from the IPLD block.

I was thinking that the message caller would be the init actor. But, on second thought, if we have a special method here, we can just make the message caller the actor that called the init actor.

So yeah, there's no need to depart from the current system.

Stebalien commented 2 years ago

The create_actor syscall can't invoke the construct entrypoint directly, because there wouldn't be a message involved, so they'd be no way for the caller to limit gas or to send a value to the constructed actor.

Well, that could be passed into the create_actor syscall.

I was actually assuming that we'd have the following flow:

User actor sends a message to the init actor.
Init actor calls a privileged syscall to create the target actor. This privileged syscall sets up the invocation container, etc.

This is the off-chain flow, so I assumed we'd make the on-chain flow as well.

anorth commented 2 years ago

I'm down with trying to be agnostic about calling conventions, but we do need to call something here and so we need a convention about it.

To me, it would be nice to have the priviledged create_actor syscall do as little as possible, ideally not requiring gas metering, and then invoking the constructor as a subsequent message to the new actor (as we do now). But then (a) how do we indicate to the callee that we're calling the ctor, and how do we pass the Init actor's caller through, without a convention about how to do it?

I do think the separation of system-level creation from arbitrary state construction logic is valuable. It's simple, and follows a "there's only one way to do it" kind of rule w.r.t. sending messages. It'll be simpler to implement and reason through.

I'm not sure that avoidance of specifying a convention or purity around IPLD-block parameters is worth breaking that for. Since no-one else is going to be calling the constructor, it's not like other callers of the actor have to adhere to this convention, they can all just ignore method numbers or whatever the callee convention is. They just wouldn't be able to repurpose method number 1 for something else.

A possible, but clunky, convention for ctors could be a wrapper IPLD object that's constructed by the Init actors and passed as the parameter:

type ConstructorParams struct {
    Caller Address
    Params []byte
}

The callee would have to first decode the ConstructorParams, and then the Params (if they're CBOR).

So with the options we've thought about so far, this comes down to a choice between a construction convention based on normal message calls (with some clunkiness because we want to add a parameter from the system), or adding VM special handling and a new entry point for a different type of message call used for construction.

Stebalien commented 2 years ago

To make sure we're on the same page here:

Currently, actors export only a single wasm-level invoke method and actor-level method dispatch is handled internally.
The proposal is to add a second wasm-level create method for initialization.

The goal is to clearly distinguish between normal "sends" and actor initialization so that actors don't have to check things like "am I being constructed by the init actor, or has someone tried to re-initialize me".

To me, it would be nice to have the priviledged create_actor syscall do as little as possible, ideally not requiring gas metering, and then invoking the constructor as a subsequent message to the new actor (as we do now).

Personally, I'd rather have the "create" operation be atomic so we never have a state-tree with an uninitialized actor. What was the motivation to do very little in create_actor?

I'm not sure that avoidance of specifying a convention or purity around IPLD-block parameters is worth breaking that for.

It turns out we don't really need to do this. @raulk pointed out that, if this is a "special" operation, we can just make the caller syscall return the actor that invoked the init actor in the first place.

anorth commented 2 years ago

Yes I understand the proposal and am pushing against the additional complexity being embodied by the FVM. I'm not taking a hard stand – I can't say I love either approach, and you might decide to do it anyway.

What was the motivation to do very little in create_actor?

Reduce complexity inside the VM, which is complex and invites attack
Avoid introducing a syscall with unpredictable gas cost (from traversing/updating HAMT)
Avoid having two different ways to invoke actor methods, and all the machinery required to do that

Arguments I can see in favour of special monolithic create_actor that also invokes constructor:

Convenience to actors, which otherwise need code to check for caller, re-initialisation etc
Easy (but not simple) way to solve problems like knowing the caller of init
Kind of a privileged calling convention that avoids method numbers

anorth commented 2 years ago

Just dropping a note here that I'm much more neutral on this proposal now (rather than initially pushing back against it).

filecoin-project / ref-fvm

Introduce a wasm-module-level actor constructor function #746