bytecodealliance / wasmtime

A lightweight WebAssembly runtime that is fast, secure, and standards-compliant
https://wasmtime.dev/
Apache License 2.0
15.49k stars 1.31k forks source link

varargs support #1030

Open sunfishcode opened 6 years ago

sunfishcode commented 6 years ago

In order to compile C code that calls printf in libc, we need to implement the caller side of varargs. And in order to compile printf itself, we need to implement the callee side.

stoklund commented 6 years ago

I am not 100% sure that we actually want too support this. It's not really a goal of Cretonne to be a C compiler nor to support all of LLVM's semantics. I don't think varargs support is needed to be a code generator for the Rust compiler, and WebAssembly doesn't require it either.

sunfishcode commented 6 years ago

What if we didn't do all of the va_arg, llvm.va_start, etc. stuff in Cretonne, and just had some way to specify how much varargs argument space is needed for a caller, and ways to get a pointer to the varargs area for the caller and callee. Producers would do the rest of the lowering manually. This doesn't actually seem like it would be drastically different from constructs Cretonne already has.

Also, for completeness, it looks like varargs will eventually be needed for Rust; see https://github.com/rust-lang/rfcs/pull/2137. Though it's not yet implemented even for the LLVM backend, so we won't have to worry about it for a while.

stoklund commented 6 years ago

Hmm, the Rust varargs callee use case seems even more obscure than inline assembly. I don't think we should commit to a design for this until we have a more clear need for it.

It's not simple to do varargs correctly. Some ABIs require the caller to place arguments in multiple locations. SPARC even requires some floating point values to be passed in three different locations under some circumstances. Intel avoids that by instead requiring crazy code in the callee for dealing with floating point arguments in registers.

If we have to do it, so be it, but a half-hearted attempt is just going to be buggy like LLVM's va_arg instruction.

sunfishcode commented 6 years ago

I agree. I believe all of your stated concerns above could be addressed by the design I sketched here. I also agree that we don't have to do anything about this right now.

sunfishcode commented 6 years ago

Unfortunately for us, rustc is making progress on implementing variadic functions: https://github.com/rust-lang/rust/pull/49878.

dlrobertson commented 6 years ago

If we have to do it, so be it, but a half-hearted attempt is just going to be buggy like LLVM's va_arg instruction.

The implementation proposed that I'm working on uses the LLVM va_arg instruction under the hood, so things like aggregate types will not be supported.

bjorn3 commented 6 years ago

Calling variadic functions is necessary for libstd: https://github.com/bjorn3/rustc_codegen_cranelift/issues/146 (not very urgent for me, might be able to just panic at runtime)

dlrobertson commented 6 years ago

Which architectures and OSes would need to be supported?

sunfishcode commented 6 years ago

Calling variadic functions is certainly easier than the callee side. On many platforms it's the same as non-vararg. On x86-64 System V for example, everything is the same as non-vararg, except that you also need to set the %al register to the number of floating-point/SIMD arguments being passed. In theory, you can do that in Cranelift today by adding an extra function argument and setting the location field of its AbiParam to something like this:

    let rax = isa.register_info().parse_regunit("rax");
    ArgumentLoc::Reg(rax)

So what we need to do is to make sure that works, and package it into a friendlier interface. My rough idea is to put it into something like make_variadic_signature and make_variadic_call functions in cranelift-frontend, which would append the extra needed arguements when needed, so that we limit the spread of C vararg semantics in the rest of the codebase.

On the callee side, Rust is still working on implementing this using LLVM, so there's not a lot of Rust code using it yet :-).

sunfishcode commented 6 years ago

Oh, one other difference between vararg and non-vararg on the caller side in C is that float is promoted to double. I assume we can let frontends take care of that.

bjorn3 commented 6 years ago

In theory, you can do that in Cranelift today by adding an extra function argument and setting the location field of its AbiParam to something like this:

How would that work with cranelift-module? It would throw IncompatibleDeclaration, when calling a vararg function with different argument counts/types.

sunfishcode commented 6 years ago

You'd declare the function with the extra argument too; that's the make_variadic_signature part.

bjorn3 commented 5 years ago

Does https://github.com/bjorn3/rustc_codegen_cranelift/commit/53e51acc2fb7782ead73e8e18a9b5d3241235555 seem like a good way to support this? (I know I forgot setting %al)

dlrobertson commented 5 years ago

@bjorn3 I don't know much about rustc_codegen_cranelift, but I would think that support for calling variadic functions would need to be added to cranelift itself instead of rustc_codegen_cranelift.

IMO there are three things that need solving (one I have still yet to do in rustc for LLVM):

The last one is still WIP see https://github.com/rust-lang/rust/pull/57760

Is this issue only focused on a subset of the above three? Also note, depending on how the last two are supported in rustc_codegen_llvm and rustc_codegen_ssa, the effort needed to support cranelift for the callee side may be minimal.

bjorn3 commented 5 years ago

but I would think that support for calling variadic functions would need to be added to cranelift itself instead of rustc_codegen_cranelift.

Cranelift is more low level than LLVM. Eg it doesn't support structs, but expects you to implement them yourself. Implementing the call side of variadic functions can be done as a producer of cranelift ir. It is however possible to implement it in cranelift in the future.

Is this issue only focused on a subset of the above three?

This issue is about both.

Also note, depending on how the last two are supported in rustc_codegen_llvm and rustc_codegen_ssa, the effort needed to support cranelift for the callee side may be minimal.

rustc_codegen_llvm depends on LLVM to implement it and rustc_codegen_ssa just delegates to the codegen backend using it.

dlrobertson commented 5 years ago

rustc_codegen_ssa just delegates to the codegen backend using it

Exactly, I'm specifically referring to the implementation of va_arg here. The current manual implementation of va_arg for pointer variants found here just uses the BuilderMethods trait which the backend provides. So in theory va_arg.rs could be moved to rustc_codegen_ssa meaning the implementations of va_arg found there could be used by both back-ends. In addition, there is https://github.com/rust-lang/rust/issues/56489 which would also obviously implement va_arg for either backend.

jyn514 commented 5 years ago

I'm not sure I see how a frontend could implement varargs manually. How would I implement a calling convention without access to registers? e.g. for floating point %al needs to be set: https://stackoverflow.com/a/2538212/7669110

bjorn3 commented 5 years ago
  1. for sysv you can ignore varargsness when calling it (except for %al) just pretend that it always has the exact types and number of values you pass.
  2. you can manually assign a value to a register (not sure if legalization or something else overwrites it/breaks though)
jyn514 commented 5 years ago

for sysv you can ignore varargsness when calling it (except for %al) just pretend that it always has the exact types and number of values you pass.

That doesn't work if you pass different numbers of args in the same program, e.g.

int printf(const char *format, ...);

int main() {
    printf("hello world\n");
    printf("some int here: %d", 2);
}
bjorn3 commented 5 years ago

What I did for cg_clif is dont tell cranelift_module about the varargs and then use declare_func_in_func once per vararg call. That will give a unique FuncRef every time. I then changed the associates Signature (which is also unique) to match the varargs.

jyn514 commented 5 years ago

and Cranelift allows that?? I can try it out, I'm surprised that cranelift allows you to modify the signature though haha

bjorn3 commented 5 years ago

Yes, the Function you build is competely mutable. You can rewrite any part you want.

jyn514 commented 5 years ago

How do I set registers?

I found RegDiversions::reg to get the current register of a Value and InstBuilder::regmove to move from one register to another, but I don't know how to get a RegDiversions instance (RegDiversions::new looks like it won't have any of the existing registers). My code so far looks like this:

// float_variadic is an integer, builder is a FunctionBuilder
let float_ir = builder.ins().iconst(types::I8, float_variadic);
let target = self.module.isa();
if target.name() != "x86" {
    unimplemented!("variadic args for architectures other than x86");
}
let reginfo = target.register_info();
let al = reginfo.parse_regunit("al").expect("al register should exist on x86");
let diversions: RegDiversions = unimplemented!(); // how do I get this?
let locations: ValueLocations = unimplemented!(); // how do I get this?
let current_reg = diversions.reg(float_ir, &locations);
builder.ins().regmove(float_ir, current_reg, al);
bjorn3 commented 5 years ago

RegDiversions is only used when regalloc is running or has already ran. For your usecase you will want to use AbiParam::special_reg in the Signature. You can get the RegUnit by using isa.register_info().parse_regunit("rax") for x86_64. I dont think passing al to parse_regunit will work, so I chose rax in this example.

Jakersnell commented 7 months ago

Are there any updates with this? Implementing varargs as a frontend with Cranelift is extremely cumbersome.

bjorn3 commented 7 months ago

There hasn't been any change since the last comment here. Also do you need just support for calling variadic functions or also defining them? The former is a lot easier to support than the latter.

Jakersnell commented 7 months ago

Just calling them.