WebAssembly / tool-conventions

Conventions supporting interoperability between tools working with WebAssembly.
Artistic License 2.0

Add details on function pointers to BasicCABI #191

Open anuraaga opened 1 year ago

anuraaga commented 1 year ago

I found BasicCABI when looking for documentation on what conventions LLVM uses when compiling to Wasm. I noticed that it doesn't seem to include details on function pointers.

From inspecting the compiled output myself, I noticed that a Wasm binary has a single table of type funcref, that the elements in the table hold indexes of functions, and that when a function is called with a function pointer argument, the index of the element in the table is what gets passed. One point I'd like to confirm is my assumption that funcrefs always have to be in the first table in the binary, since no table index is involved when invoking.

Would it be appropriate to add information on this to BasicCABI? I can try a PR if my understanding seems correct but it may be better for someone who implemented that in LLVM to do so.

dschuff commented 1 year ago

Yes, in MVP wasm (before the reftypes proposal), there can only be a single table, and its type is always funcref. Indirect calls implicitly reference table 0. With the reftypes proposal this restriction is lifted, but LLVM still uses table 0 for the function pointers. @pmatos is working on adding support for reference types, but that probably won't change this basic C ABI anytime soon. I think it does make sense to note that table 0 is used for indirect calls. @sbc100 is there any other table-related linker behavior that e.g. a toolchain trying to generate LLVM-compatible object files would need to know?

sbc100 commented 1 year ago

I agree it might make sense to add something like "function pointers are represented as integer indexes into a linker-defined table called __indirect_function_table". The Linking.md document talks about table symbols, which can be used to define other user-defined tables, but function pointers are always offsets into __indirect_function_table (normally table 0).
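
To make the "function pointer is a table index" idea concrete, here is a minimal sketch in plain C that models it outside of Wasm. Everything in it is illustrative: the array `indirect_function_table` only stands in for __indirect_function_table, `call_indirect` and `apply` are hypothetical helpers, and leaving slot 0 empty so that index 0 behaves like a null function pointer is an assumption of this sketch. A real call_indirect also checks the callee's type signature, which is omitted here because every function shares one signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Every function in this sketch has the same signature: int(int). */
typedef int (*unary_fn)(int);

static int add_one(int x) { return x + 1; }
static int twice(int x) { return x * 2; }

/* Hypothetical stand-in for __indirect_function_table.
   Assumption: slot 0 is left empty so index 0 can act as "null". */
static const unary_fn indirect_function_table[] = { NULL, add_one, twice };

#define TABLE_SIZE \
    (sizeof indirect_function_table / sizeof indirect_function_table[0])

/* Models call_indirect on table 0.
   Returns 0 on success, -1 where real Wasm would trap
   (index out of bounds or empty slot). */
static int call_indirect(uint32_t index, int arg, int *result) {
    if (index >= TABLE_SIZE || indirect_function_table[index] == NULL) {
        return -1;
    }
    *result = indirect_function_table[index](arg);
    return 0;
}

/* A "function pointer" parameter is just an integer table index. */
static int apply(uint32_t fn_index, int arg) {
    int result = 0;
    int status = call_indirect(fn_index, arg, &result);
    assert(status == 0); /* a failed lookup corresponds to a trap */
    (void)status;
    return result;
}
```

With this table, `apply(1, 41)` dispatches to `add_one` and `apply(2, 21)` dispatches to `twice`; passing index 0 or an index past the end to `call_indirect` is reported as a failure rather than a call.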

dschuff commented 1 year ago

Ah, that's a good point that the linking document is relevant here as well. Although I just noticed that while linking.md has sections about merging global, event, function, data, and custom sections, it doesn't actually talk about merging of table sections. We should probably just add something there about what happens to tables at link time, and maybe add something to the C ABI about how function pointers work in C; maybe all we need to say is that C function pointers go into table 0 and are called with the call_indirect instruction? We could add that code or static initializers that want to materialize the pointer values need to use relocations, but the same is true for function indexes and memory addresses; the C ABI doc is mostly about C types, memory layout, and calling convention, and it doesn't really cover how relocations are used by the C compiler in practice.

sbc100 commented 1 year ago

There is no support in the linker yet for merging tables, or even for tables with pre-existing elements. That is, object files can define tables, but only empty ones (at least today).

dschuff commented 1 year ago

oh, so the object file doesn't define a table at all, it only creates R_WASM_TABLE_INDEX relocs, and then anything targeted with such relocations automatically goes into the table after link?

sbc100 commented 1 year ago

Object files can define tables (with size, type, etc), they just can't define element segments for them.

sbc100 commented 1 year ago

R_WASM_TABLE_INDEX is only for the __indirect_function_table. You can think of it as shorthand for R_WASM_INDIRECT_FUNCTION_TABLE_INDEX.
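
As a rough illustration of the link-time behavior discussed above, the following C sketch models a linker handing out slots in the indirect function table as it meets table-index relocations. It is not wasm-ld's implementation: `struct table_builder`, `table_init`, and `table_index_for` are hypothetical names, and both the fixed capacity and the choice to start numbering at 1 (keeping 0 free as a null value) are assumptions of the sketch.

```c
#include <stdint.h>
#include <string.h>

#define MAX_SLOTS 16 /* arbitrary capacity for this sketch */

/* Hypothetical linker state: which function symbol occupies each slot. */
struct table_builder {
    const char *slots[MAX_SLOTS];
    uint32_t count; /* next free index */
};

/* Assumption: slot 0 stays unused so that index 0 can mean "null". */
static void table_init(struct table_builder *t) {
    memset(t, 0, sizeof *t);
    t->count = 1;
}

/* Called once per table-index relocation targeting `symbol`.
   Returns the slot for that symbol, allocating one on first use
   and reusing it afterwards. Returns 0 if the table is full. */
static uint32_t table_index_for(struct table_builder *t, const char *symbol) {
    for (uint32_t i = 1; i < t->count; i++) {
        if (strcmp(t->slots[i], symbol) == 0) {
            return i;
        }
    }
    if (t->count >= MAX_SLOTS) {
        return 0;
    }
    t->slots[t->count] = symbol;
    return t->count++;
}
```

In this model, the value returned by `table_index_for` is what would be patched into the code or data at the relocation site, so two relocations against the same function end up with the same function pointer value.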