Closed skalogryz closed 1 year ago
Looks like it's not supported. I added an error message for now, so at least it won't crash: https://github.com/WebAssembly/wabt/pull/1202.
ok, thanks... any plans for the implementation?
For sure we should implement this. Out of interest what is your use case for the -r
flag.. I added while I was developing the wasm object format but I'm curious to know if its actually useful in wabt?
but I'm curious to know if its actually useful in wabt?
i'm adding wasm target to freepascal compiler (https://github.com/skalogryz/freepascal/tree/webasm) (https://lists.freepascal.org/pipermail/fpc-pascal/2019-September/056848.html)
The compiler internals make it hard to use tree-structure code of binaryen's wasm-as
. That makes wat2wasm
the best option so far.
the pascal is modular. Each compiled wasm object file needs to be linked in the end.
(static) linking requires the use of -r
flag. (The relocation information generated by the flag worked more or less well for lld
, with a little help of an additional tool for some cases)
as for call_indirect
specifically. It's (supposed to be) used to implement function pointer calls and virtual methods.
I see. So your compiler outputs .wat
, then you use wat2wasm -r
to make your object files, then you link your object files with lld
.
Sounds like you are the first real user of wat2wasm -r
and as such you may run into more difficulties down the line. For example, there is no way that I know of to include data symbols or data relocations in the .wat
format.
@aardappel recently added support for his lobster language, and as part of the he made a language agnostive C++ library (a header-only library I believe) that can be used to directly generate relocatable wasm object files. @aardappel would this a good use for your library? If you use that you wouldn't need wabt at all I think.
For example, there is no way that I know of to include data symbols or data relocations in the
.wat
format.
so far, automatically generated symbols worked fine.
f you use that you wouldn't need wabt at all I think.
a disclaimer to use wabt? :)
Maybe you don't need data relocations in pascal? But this kind of relocation is currently not representable in wat format (at global scope):
int bar = 1;
int foo = &bar;
In a C toolchain this would generate 8 bytes of static data, the with a relocation (pointing to the symbol bar) in the second 4 bytes.
I'm happy to continue to support and improve -r
support :) . I just worry that full support will require at least some amount of work on the wat format.
@sbc100 looks like the FreePascal compiler is itself written in Pascal, so wholesale adoption of my C++ library is unlikely to work :)
@skalogryz that said, if you want to learn how to output linkable wasm object files that work with LLD, this C++ header is a small amount of self-contained code that writes such a binary, in particular see the function Finish
at the end for the linking related code. More documentation on that file.
@aardappel thanks. I've actually already started a binary utility myself (https://github.com/skalogryz/wasmbin). As it was mentioned earlier there's no much over the symbol information through the .wat
file, so I had to post process the data with the tool after the compilation. (that did help to prevent linking conflicts / errors)
FPC does allow support of internal object file writers. With wasm object format being relatively simple I was hoping to avoid going that path. (adding a compiler target is not the same as maintaining the target). It's more fun to use the new (wasm) features (i.e. threads, exceptions.. etc), rather than keeping implementing them (once they become the part of the standard) :)
But this kind of relocation is currently not representable in wat format (at global scope)
No, they don't. The problem is that effective address of bar
is not relocatable. Yet the variable, that could hold the effective address is.
(global $addrof_bar i32 (i32.const 0))
once linking is done, it's possible to "relocate" the effective address of 'bar' by changing the value of $addrof_bar constant (i.e. i32.const 0
changes to i32.const 2342
). (changing the value means - changing it in binary file). I wouldn't expect wat2wasm to do that though.
on the topic of call_indirect
relocation.
per this document: https://github.com/WebAssembly/tool-conventions/blob/master/Linking.md#merging-function-sections
There are currently two ways in which function indices are stored in the code section: ...
- Immediate argument of the
i32.const
instruction (taking the address of a function).
I don't think that follows WebAssembly text format convention. (the only argument of i32.const
allowed is a numeric field and not function index).
looking at the wat2wasm
code, i didn't find any support for such feature either.
But this kind of relocation is currently not representable in wat format (at global scope)
No, they don't. The problem is that effective address of
bar
is not relocatable. Yet the variable, that could hold the effective address is.(global $addrof_bar i32 (i32.const 0))
once linking is done, it's possible to "relocate" the effective address of 'bar' by changing the value of $addrof_bar constant (i.e.i32.const 0
changes toi32.const 2342
). (changing the value means - changing it in binary file). I wouldn't expect wat2wasm to do that though.
But what you are describing is exactly what the linker is supposed to do. That is what relocations are for.
on the topic of
call_indirect
relocation.per this document: https://github.com/WebAssembly/tool-conventions/blob/master/Linking.md#merging-function-sections
There are currently two ways in which function indices are stored in the code section: ...
- Immediate argument of the
i32.const
instruction (taking the address of a function).I don't think that follows WebAssembly text format convention. (the only argument of
i32.const
allowed is a numeric field and not function index). looking at thewat2wasm
code, i didn't find any support for such feature either.
Right, this is why the -r
flag isn't very useful today as a way to building linkable object files. llvm will generte such code correctly either from C/C++ source or you can write in the .s
assembly format something like:
i32.const $foo
The assembler then creates a relocation after the i32.const
and the linker resolves the relocation and puts foo
in the wasm table to make it indirectly callable.
In short, the wat
format as it stands today is not suitable as a general purpose assembly language. Its can't describe object files in the general case.
@sbc100 - I'm using WAT as a first-class assembly language for a fantasy console. There's a runtime with a memory-mapped API, but the user writes all of their code in WAT, and any of their modules can import whatever any other module exports. Do you foresee a problem with that?
The issue we are discussing here mostly refers to linking with other object files. if you are not using a linker then writing pure WAT is a lot more reasonable.
Once the linker gets involved you will, at a minimum, want to assign symbolic names to memory locations.
Thank you, @sbc100. Much appreciated.
on the topic of
call_indirect
relocation. per this document: https://github.com/WebAssembly/tool-conventions/blob/master/Linking.md#merging-function-sectionsThere are currently two ways in which function indices are stored in the code section: ...
- Immediate argument of the
i32.const
instruction (taking the address of a function).I don't think that follows WebAssembly text format convention. (the only argument of
i32.const
allowed is a numeric field and not function index). looking at thewat2wasm
code, i didn't find any support for such feature either.Right, this is why the
-r
flag isn't very useful today as a way to building linkable object files. llvm will generte such code correctly either from C/C++ source or you can write in the.s
assembly format something like:i32.const $foo
The assembler then creates a relocation after the
i32.const
and the linker resolves the relocation and putsfoo
in the wasm table to make it indirectly callable.
Agreed that the wat format doesn't support this well. However, since we have the -r
flag currently, I wouldn't be opposed to adding functionality to the text format as long as it's not too invasive.
Relocations for call_indirect (and func types in general) are being implemented in #1525
@binji I'd be interested in implementing something like the i32.const $foo
syntax as well, but we probably also want the same for memory addresses of data sections in addition to function indices, but I'm not sure how those are relocated and how we can assign names to them.
@sammax unfortunately, we can't really add text syntax like that, since wabt tries to match the spec (and future proposals). The best we can hope for is using annotations for this, perhaps something like:
i32.const (@reloc foo) 0
Memory addresses would require a similar annotation syntax. The wat-numeric-values proposal will help here too.
Do you think a spec proposal to add named data segments and to resolve data segment names to their addresses would have any chance of being accepted?
Annotations would work too I guess, though I think we should keep them independent of relocations, since it would be useful to refer to memory locations by name even in non-relocated code.
Maybe a proposal like that would be accepted. I think the big question is what is the expected output from a module using this text format. If we're saying that it creates relocation entries in the linking sectio, then I think that may be a problem, since the linking section format is not standardized (at least as far as the spec is concerned). We could say that it is like an annotation and can be interpreted as desired by the tools. But that seems a little unsatisfying too.
Annotations would work too I guess, though I think we should keep them independent of relocations, since it would be useful to refer to memory locations by name even in non-relocated code.
It's a little tricky to make that work, since data segments don't require a fixed offset. For example, you can write:
(import "env" "g" (global $g i32))
(data (global.get $g) "my data ...")
If we extend the syntax to allow named memory locations, then there's no way for that to be a simple i32.const
. I suppose we could say that you can't refer to an address in that segment, though.
I had been thinking that in normal mode it would work the same as in normal assembly files, i.e. just resolve to the address, and in relocatable mode it would additionally add the relocations. But that doesn't really work with data segments that use a non-const offset. Not allowing referring to such segments might work, or maybe not allowing naming such segments. We could also introduce some kind of pseudo instruction data.addr $name
that gets replaced by the offset expression for that named segment, but I'm not really sure if that's a great idea.
Annotations have the same problem, though maybe it would be more palatable to have an annotation that gets transformed into an expression than a pseudo-instruction.
Doesn't this problem also exist for relocations though? How do offsets work when linking anyways? Are data segments with constant offsets relocated during linking or do they interfere with each other?
We could also introduce some kind of pseudo instruction data.addr $name that gets replaced by the offset expression for that named segment, but I'm not really sure if that's a great idea.
Agreed, I don't think we'll want pseudo-instructions in the wat format. I guess I can image something like this though:
(data (i32.const 0)
$label1
"some data"
$label2
"some more data"
...
)
...
(i32.const data-addr=$label2)
Doesn't this problem also exist for relocations though? How do offsets work when linking anyways? Are data segments with constant offsets relocated during linking or do they interfere with each other?
I'm not sure about how this works, @sbc100 would know.
Doesn't this problem also exist for relocations though? How do offsets work when linking anyways? Are data segments with constant offsets relocated during linking or do they interfere with each other?
The linker assumes that it has ultimate control over the offset of all data segments. That is, I think it completely ignores the offsets specified in the input files.
Since the linker decides the data layout, all data references require relocations.
@binji I like that syntax, I think we would also need table-elem=
or something like it for table elements, and a way to name table elements. If you think it's a good idea, I can get on writing a proposal for it (once I figure out how that works).
The linker assumes that it has ultimate control over the offset of all data segments. That is, I think it completely ignores the offsets specified in the input files.
@sbc100 That is the case for both const and global offsets, but not for passive ones, right?
I think we could have labels/names for global-offset segments in a way that makes sense in non-relocatable mode too by making them resolve to an offset relative to the segment:
(data (global.get 0)
$label1
"some data"
$label2
"some more data"
...
)
In this example, $label1
would resolve to 0
and $label2
would resolve to 9
, so that one could then write the following to get the address of "some more data"
:
(i32.add
(global.get 0)
(i32.const data-addr=$label2)))
This would only happen when the offset is a global, for const offsets the label would resolve to an absolute address calculated as <const offset of segment> + <offset of label within segment>
.
I'm not completely happy about having different semantics based on whether the offset is const or global, but I can't think of a better way to solve it and I believe it provides enough value to be worth having it.
@binji I like that syntax, I think we would also need table-elem= or something like it for table elements, and a way to name table elements. If you think it's a good idea, I can get on writing a proposal for it (once I figure out how that works).
Cool! Easiest way to start is to open an issue in WebAssembly/design with a little background, and a proposed syntax. Then we can discuss it a little bit there before bringing it to the group. This would be a similar kind of proposal to https://github.com/WebAssembly/design/issues/1348, since they both only affect the text format. So you could use that as inspiration.
Cool! Easiest way to start is to open an issue in WebAssembly/design with a little background, and a proposed syntax. Then we can discuss it a little bit there before bringing it to the group. This would be a similar kind of proposal to WebAssembly/design#1348, since they both only affect the text format. So you could use that as inspiration.
Thanks, I'll take a look at that! I think I should be able to write something up some time this week.
Fixed in #1525
The following code fails to compile by wat2wasm (v1.12) if relocatable binary (-r) is requested. Is it expected? (exit code 0xC0000005) Non-relocatable binary compiles just fine.