0xPolygonMiden / compiler

Compiler from MidenIR to Miden Assembly
MIT License

Implement support for indirect calls via dynexec/procref #32

Open bitwalker opened 10 months ago

bitwalker commented 10 months ago

Now that 0xPolygonMiden/miden-vm#1078 is merged (which introduces dynexec), the groundwork for supporting indirect calls in Miden Assembly is in place. Before we can make use of it, though, we need the proposed procref instruction, which would push the hash of a specified function onto the stack. This is done prior to dynexec, which requires that hash in order to execute the indirect call. We need procref because we don't know the hash of a function until after compilation; procref solves this by relying on the Miden assembler, which knows the hashes of all the functions, to expand it into push.HASH.

Once that is implemented, we can make use of it to implement indirect calls in the IR:

### Tasks
- [ ] Add a new `call.indirect` opcode, to be paired with the `PrimOp` instruction type
- [ ] Add a new `Type` variant, `Type::Function(Box<FunctionType>)` which will be used in conjunction with `Type::Ptr` to represent function pointers
- [ ] Add a new data segment to `Program` which represents every function in the compiled program as a table of hashes. Initialize this table at runtime using `procref`
- [ ] https://github.com/0xPolygonMiden/compiler/issues/133
- [ ] Lower indirect calls to MASM by treating the function pointer operand as an index into the function table, loading the callee hash, and then dispatching via `dynexec`. Indexing beyond the bounds of the table should trap.

bobbinth commented 10 months ago
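
To make the tasks above a bit more concrete, here is a minimal sketch of the intended shape, not the actual compiler code. Only `Type::Function(Box<FunctionType>)` and `Type::Ptr` are named in this issue; `Digest`, `FunctionTable`, `Trap`, `resolve`, and the fields of `FunctionType` are hypothetical names chosen for illustration. It models a function pointer as an index into a table of procedure hashes, with an out-of-bounds index producing a trap rather than a hash.

```rust
/// Hypothetical stand-in for a procedure hash (four field elements).
type Digest = [u64; 4];

/// Hypothetical function signature; the real `FunctionType` may differ.
#[derive(Debug, Clone, PartialEq)]
struct FunctionType {
    params: Vec<Type>,
    results: Vec<Type>,
}

/// A reduced type enum showing only the variants relevant here.
#[derive(Debug, Clone, PartialEq)]
enum Type {
    I32,
    Felt,
    Ptr(Box<Type>),
    /// The proposed variant; a function pointer is `Ptr(Function(..))`.
    Function(Box<FunctionType>),
}

/// Hypothetical trap reason for an invalid function pointer.
#[derive(Debug, PartialEq)]
enum Trap {
    TableIndexOutOfBounds { index: u32, len: usize },
}

/// The function table: one hash per function in the program.
/// In the proposal this would be initialized at runtime using `procref`.
struct FunctionTable {
    hashes: Vec<Digest>,
}

impl FunctionTable {
    /// Treat the function pointer operand as a table index and load the
    /// callee hash; this is the value that would be handed to `dynexec`.
    fn resolve(&self, index: u32) -> Result<Digest, Trap> {
        self.hashes
            .get(index as usize)
            .copied()
            .ok_or(Trap::TableIndexOutOfBounds {
                index,
                len: self.hashes.len(),
            })
    }
}

/// Build the type of a pointer to a function taking an i32 and returning a felt.
fn example_fn_ptr_type() -> Type {
    Type::Ptr(Box::new(Type::Function(Box::new(FunctionType {
        params: vec![Type::I32],
        results: vec![Type::Felt],
    }))))
}
```

With a two-entry table, `resolve(1)` yields the second hash and `resolve(2)` yields `Trap::TableIndexOutOfBounds`, which corresponds to the requirement that indexing beyond the bounds of the table should trap.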

It might also be good to wait for https://github.com/0xPolygonMiden/miden-vm/issues/1091 - but, of course, not necessary.

bitwalker commented 10 months ago

Agreed, I think it is beneficial to wait for those changes, though we can at least lay the groundwork for indirect calls without supporting them in codegen. If it turns out that we reach a point where we have procref but not the changes proposed in that issue, we can experiment with what's there now, but I don't anticipate doing that unless indirect calls turn out to be more important than initially thought when compiling from Rust.

greenhat commented 6 months ago

@bitwalker While translating the basic wallet Wasm component example PR, I found a case where a table is used as a parameter for module instantiation, and I want to make sure we capture these semantics in the IR.

The gist of it is that the table is defined (but not initialized) in module A, used there in call_indirect ops, and exported from module A. Then module B, which has a table import, is supplied with module A's exported table as an argument on instantiation and initializes it with references to its own (module B's) imported functions (component imports). So a call_indirect op in module A ends up calling functions from module B's imports.

On module allocation, a table passed as an argument to a module instantiation (another module's export, satisfying this module's import) is concatenated with the module's locally defined tables. All tables in Wasm are stored in a global store (see module instantiation), which acts as global state (see store).
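
As a rough illustration of the semantics described above, the following sketch models a Wasm-style store in which tables live globally and module instances only hold addresses into it. All names here (`Store`, `ModuleInstance`, `instantiate`, `init_elements`, `call_indirect`) are hypothetical and simplified; functions are represented by plain string names rather than real function instances.

```rust
/// Address of a table inside the store.
type TableAddr = usize;
/// Address of a function inside the store.
type FuncAddr = usize;

/// Simplified global store: owns every table and function.
#[derive(Default)]
struct Store {
    tables: Vec<Vec<Option<FuncAddr>>>,
    funcs: Vec<&'static str>, // names stand in for function instances
}

/// A module instance refers to tables by address only.
struct ModuleInstance {
    /// Imported tables come first, followed by locally defined tables.
    table_addrs: Vec<TableAddr>,
}

impl Store {
    fn alloc_func(&mut self, name: &'static str) -> FuncAddr {
        self.funcs.push(name);
        self.funcs.len() - 1
    }

    fn alloc_table(&mut self, size: usize) -> TableAddr {
        self.tables.push(vec![None; size]);
        self.tables.len() - 1
    }

    /// Allocate a module: imported table addresses are concatenated with
    /// freshly allocated addresses for the module's own tables.
    fn instantiate(&mut self, imported: &[TableAddr], defined_sizes: &[usize]) -> ModuleInstance {
        let mut table_addrs = imported.to_vec();
        for &size in defined_sizes {
            table_addrs.push(self.alloc_table(size));
        }
        ModuleInstance { table_addrs }
    }

    /// Write function references into one of the instance's tables.
    fn init_elements(
        &mut self,
        inst: &ModuleInstance,
        table_idx: usize,
        offset: usize,
        funcs: &[FuncAddr],
    ) -> Result<(), &'static str> {
        let addr = *inst.table_addrs.get(table_idx).ok_or("unknown table")?;
        let end = offset.checked_add(funcs.len()).ok_or("out of bounds")?;
        let table = self.tables.get_mut(addr).ok_or("unknown table address")?;
        let slots = table.get_mut(offset..end).ok_or("out of bounds")?;
        for (slot, &f) in slots.iter_mut().zip(funcs) {
            *slot = Some(f);
        }
        Ok(())
    }

    /// Resolve an indirect call; `None` models a trap
    /// (bad index or uninitialized element).
    fn call_indirect(&self, inst: &ModuleInstance, table_idx: usize, elem: usize) -> Option<&'static str> {
        let addr = *inst.table_addrs.get(table_idx)?;
        let func = (*self.tables.get(addr)?.get(elem)?)?;
        self.funcs.get(func).copied()
    }
}
```

In this model, module A would be instantiated with one defined table and no imports, and module B with A's table address as its only import. After B calls `init_elements` on its table index 0, a `call_indirect` through A's table index 0 resolves to the function B installed, because both instances point at the same table in the store.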

I believe we can capture these semantics in the IR by storing tables in the IR Component and referencing them from whichever modules use them, after resolving all import/export relations. However, since we also support compiling a core Wasm module without the Wasm component model, we need to decide how to store tables in that case. Currently, a Wasm core module is translated into an IR Module. For the sake of uniformity, I think we could wrap such an IR Module in an otherwise empty Component and store the tables there.