WebAssembly / tool-conventions

Conventions supporting interoperability between tools working with WebAssembly.
Artistic License 2.0

Proposal: Add new relocation type R_WASM_MEMORY_ADDR_SELFREL_I32 #162

Open kateinoigakukun opened 3 years ago

kateinoigakukun commented 3 years ago

Summary

The R_WASM_MEMORY_ADDR_SELFREL_I32 relocation represents the offset from the address being relocated to the symbol's address. It is very similar to R_X86_64_PC32, but restricted to data segments.

S + A - P

A: Represents the addend used to compute the value of the relocatable field.
P: Represents the place of the storage unit being relocated.
S: Represents the value of the symbol whose index resides in the relocation entry.
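To make the formula concrete, here is a minimal sketch of how a linker might apply it. The function name `apply_selfrel_i32` and the flat `bytearray` standing in for linear memory are assumptions for illustration, not part of the proposal or of wasm-ld. It also assumes the field is stored as a little-endian signed 32-bit value and that offsets into the buffer equal final memory addresses.

```python
import struct

I32_MIN, I32_MAX = -(2**31), 2**31 - 1


def apply_selfrel_i32(data: bytearray, place: int, symbol_addr: int, addend: int = 0) -> int:
    """Compute S + A - P and store it in the 4-byte field at `place`.

    Hypothetical helper: `data` stands in for linear memory, so `place`
    is both the buffer offset and the address P of the field.
    """
    value = symbol_addr + addend - place  # S + A - P
    if not I32_MIN <= value <= I32_MAX:
        raise OverflowError(f"relocation value {value} does not fit in an i32")
    struct.pack_into("<i", data, place, value)  # little-endian signed 32-bit
    return value


memory = bytearray(64)
# Field at address 16 refers to the symbol at address 48, with addend 4.
written = apply_selfrel_i32(memory, place=16, symbol_addr=48, addend=4)
print(written)  # 48 + 4 - 16 = 36
```

Because the stored value depends only on the distance between the field and the symbol, it stays valid if both are moved by the same amount.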

Motivation

Currently, Swift uses the R_X86_64_PC32 relocation in data sections on ELF. The relocation is used to express relative pointers, which are used in Swift’s metadata. A relative pointer stores a 32-bit offset from its own address to the referent’s address. This technique reduces the pointer size from 64 bits to 32 bits on 64-bit targets, and also reduces load-time relocation work in general environments.
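As an illustration of what "a 32-bit offset from its own address" means at runtime, the following sketch resolves a relative pointer by adding the stored offset to the address of the field itself. The helper name `resolve_relative_pointer` and the `bytearray` used as memory are hypothetical; this is not Swift's actual runtime code.

```python
import struct


def resolve_relative_pointer(memory: bytes, field_addr: int) -> int:
    """Return the absolute address a relative pointer refers to.

    The field holds a signed 32-bit offset measured from its own address,
    so the referent is at field_addr + offset.
    """
    (offset,) = struct.unpack_from("<i", memory, field_addr)
    return field_addr + offset


memory = bytearray(64)
struct.pack_into("<i", memory, 8, 32)    # field at 8 refers forward to 40
struct.pack_into("<i", memory, 40, -32)  # field at 40 refers back to 8

print(resolve_relative_pointer(memory, 8))   # 40
print(resolve_relative_pointer(memory, 40))  # 8
```

The addition at lookup time is the runtime cost that comes up later in the thread.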

Wasm doesn’t have a relocation type like R_X86_64_PC32, and relative pointers bring no benefit on wasm32: they don’t reduce the pointer size, and Wasm doesn’t have load-time relocations. So we, the SwiftWasm project, currently use 32-bit absolute pointers instead of relative pointers.

However, using absolute pointers instead of relative pointers breaks the metadata layout on wasm64 because of the pointer size. If we can use relative pointers in the same way as on other architectures, we can support wasm64 without changing the data structures. Keeping the same data structures makes it simpler to debug and to implement the compiler pipeline. So we propose a new relocation type for data segments to support relative pointers.
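The layout argument can be sketched with a made-up record containing two pointer fields. The record and the names below are hypothetical and much simpler than real Swift metadata; the point is only that absolute pointers change size between wasm32 and wasm64, while 32-bit relative pointers do not.

```python
import struct

# Hypothetical metadata record with two pointer fields, little-endian, no padding.
ABSOLUTE_WASM32 = struct.Struct("<II")  # two 32-bit absolute pointers
ABSOLUTE_WASM64 = struct.Struct("<QQ")  # two 64-bit absolute pointers
RELATIVE_ANY = struct.Struct("<ii")     # two signed 32-bit relative pointers

print(ABSOLUTE_WASM32.size)  # 8
print(ABSOLUTE_WASM64.size)  # 16
print(RELATIVE_ANY.size)     # 8
```

With relative pointers the record keeps the same size and field offsets on both targets, which is the property the proposal relies on.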

What do you think about this feature? @sunfishcode @sbc100

sbc100 commented 3 years ago

Interesting. Sounds like there are some real benefits for your use case. I have a couple of questions:

  1. Is the primary advantage here only present on wasm64? In which case we should loop in @aardappel.
  2. Would the relocation type still exist on wasm32? For convenience only? It seems like it would be strictly less performant on wasm32 because you would need to do a runtime addition, and an absolute relocation would work fine here?
  3. I suppose the relocation would fail if the relocation location and the symbol are more than 4GB apart?

aardappel commented 3 years ago

I am generally a fan of pointer-relative addressing, so this seems like a nice option to have, especially for wasm64.

There may be an advantage for wasm32 in that data using these relocs can be placed/moved anywhere in wasm memory, as opposed to being baked by wasm-ld for a certain destination.

On the downside, there is no way to express these relative pointers in C/C++ or (currently) in .wat/.s, so its application is limited to compilers that emit their own object files?

Also, most systems that would emit relative pointer data would do so locally, meaning they don't need relocs at all. These relocs are thus limited to pointer relative data that is using external linker symbols, which makes it more niche.

Fantasizing beyond your use case, what would be fun is a SLEB version of this, the idea being that it can make data really small (with offsets possibly fitting in a single byte). But that is even more niche :)

sbc100 commented 3 years ago

Fantasizing beyond your use case, what would be fun is a SLEB version of this, the idea being that it can make data really small (with offsets possibly fitting in a single byte). But that is even more niche :)

Because we are only talking about the data section here, wouldn't using LEB require user-space LEB decoding to resolve such an address at runtime? Or perhaps you are saying we could embed relative data addresses in i32.const instructions?

aardappel commented 3 years ago

@sbc100 yes, that would require the user to adopt SLEB for their data as well, hence the "fantasizing" part. Ignore that :)

kateinoigakukun commented 3 years ago

@sbc100 @aardappel Thank you for taking a look!

  1. Is the primary advantage here only present on wasm64? In which case we should loop in @aardappel.

Yes, I'm planning to use this for wasm64 mainly.

  2. Would the relocation type still exist on wasm32? For convenience only? It seems like it would be strictly less performant on wasm32 because you would need to do a runtime addition, and an absolute relocation would work fine here?

In terms of runtime performance, the relative pointer is less performant on wasm32, but using the same addressing mode as on other architectures is worth it for simplicity. So I'll implement this on wasm32 as well.

  3. I suppose the relocation would fail if the relocation location and the symbol are more than 4GB apart?

As with R_X86_64_PC32, the relocation would fail when the difference between the two addresses does not fit in an i32.
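A small sketch of that failure condition, assuming the field is treated as a signed 32-bit value (as with R_X86_64_PC32). Under that assumption the reach is about 2 GiB in either direction rather than 4 GiB. The function name `selfrel_fits_i32` is hypothetical.

```python
I32_MIN = -(2**31)
I32_MAX = 2**31 - 1
GIB = 2**30


def selfrel_fits_i32(place: int, symbol_addr: int, addend: int = 0) -> bool:
    """Return True if S + A - P is representable as a signed 32-bit integer."""
    return I32_MIN <= symbol_addr + addend - place <= I32_MAX


print(selfrel_fits_i32(place=0, symbol_addr=2 * GIB - 1))  # True: largest forward reach
print(selfrel_fits_i32(place=0, symbol_addr=2 * GIB))      # False: one byte too far
print(selfrel_fits_i32(place=2 * GIB, symbol_addr=0))      # True: exactly -2**31
print(selfrel_fits_i32(place=0, symbol_addr=3 * GIB))      # False
```

On wasm32 the address space is 4 GiB at most, so whether such distances occur depends on how far apart the data is actually placed; on wasm64 the check matters more.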

On the downside, there is no way to express these relative pointers in C/C++ or (currently) in .wat/.s, so its application is limited to compilers that emit their own object files? Also, most systems that would emit relative pointer data would do so locally, meaning they don't need relocs at all. These relocs are thus limited to pointer relative data that is using external linker symbols, which makes it more niche.

That's right. Currently, the only way to express relative pointers is directly in LLVM IR or machine code. And the relocation is used only for references to symbols in foreign segments. Perhaps the reloc could also be used for PIC addressing without __memory_base (of course, the absolute address of the relative pointer itself must be known). But I'm not sure there are such use cases 🤷‍♂️

kateinoigakukun commented 3 years ago

@sbc100 @aardappel I created a patch for this reloc type. Could you take a look when you have time? https://reviews.llvm.org/D96659

MaxDesiatov commented 2 years ago

This has been merged to LLVM as R_WASM_MEMORY_ADDR_LOCREL_I32. Does this mean that the functionality is fully implemented now? Do we need to submit a PR here clarifying how it works, and then close this issue?

dschuff commented 2 years ago

Yes, I believe that's correct. I think we just need to add it to the existing list of relocation types.