fubark / cyber

Fast and concurrent scripting.
https://cyberscript.dev
MIT License
1.14k stars, 38 forks

C/C++ Backend #74

Open fubark opened 7 months ago

fubark commented 7 months ago

This is something that I won't get to for a long ... long time. But I see value in having a C/C++ backend as our AOT strategy.

This can be fun to contribute to if you want to learn about compilers and still generate code that can be useful by itself.

Some criteria:

Progress:

How to work on this:

ccleavinger commented 6 months ago

This proposal sounds amazing! I'd love to help with this sometime. I just have a few questions/comments:

Sorry if this was a lot. I'm very excited by the potential Cyber has and would love to help in any way available!

fubark commented 6 months ago

  • A lot of your criteria are really vague. For readable code, do you want comments from the user to be inserted?

Not by default, but that can be behind a flag if it can be reasonably done. The generated code should look like what you might write yourself if you had to follow the same semantics as Cyber: inserting retain/release ops, handling conversions between primitive and boxed values, handling exceptions via error codes (which is different from the VM), etc. Variable and function names should stay very close to the original.
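To make the retain/release part concrete, here is a minimal sketch of what such hand-written-looking output could resemble. Everything in it is an assumption for illustration: the `Obj` layout, the `obj_new`/`retain`/`release` helpers, the `live_objs` counter, and the `main_readTwice` function are hypothetical and not taken from Cyber's runtime.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical reference-counted heap object; not Cyber's actual layout.
typedef struct {
    uint32_t rc;      // reference count
    int64_t payload;  // stand-in for the object's data
} Obj;

// Illustration only: number of objects allocated and not yet freed.
static int live_objs = 0;

static Obj* obj_new(int64_t payload) {
    Obj* o = malloc(sizeof *o);
    if (!o) abort();
    o->rc = 1;  // the creator holds the first reference
    o->payload = payload;
    live_objs++;
    return o;
}

static void retain(Obj* o) {
    o->rc++;
}

static void release(Obj* o) {
    assert(o->rc > 0);
    if (--o->rc == 0) {
        free(o);
        live_objs--;
    }
}

// What a backend might emit for a script function that copies its
// argument into a second local and then reads from that local.
static int64_t main_readTwice(Obj* a) {
    Obj* b = a;
    retain(b);                 // `b` is a new reference to the same object
    int64_t res = b->payload;
    release(b);                // `b` goes out of scope
    return res;
}
```

The point of the sketch is only that each retain is paired with a release at the end of the variable's scope, so the caller's reference count is unchanged after the call.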

  • Should we try to target Nim's IR and call it a day?

Most definitely not. We'd want to own everything up to machine code generation to have as much flexibility as we can. Last thing we'd want is to implement X feature and realize it's not possible because of a dependency.

  • Will we generate header only files for easiest include and linking process?

Once we can generate the single .c file, splitting it into multiple units or a single header file shouldn't be too much trouble. These options can be behind a flag. In a default build, you wouldn't even see C files unless you actually wanted to.

  • How much backwards compatibility do we need? (The GameDev industry is moving to C++23 fairly rapidly (source).)

For C++ the idea is to reuse a lot of the same C code, and only generate the shims (classes) so that it's ergonomic for the C++ side to call into.

  • How will this work with the embed API?

AOT is not really meant to be used for the embed API, because the host is usually the AOT program. But... we can reuse the C backend as another JIT method. Definitely slower to compile than the copy-patch JIT but it can trade the slower compilation speed for max performance.

  • What practices will the generated code adhere to?

See the response about readability. The goal of the first iteration is to make sure it works and to get the tests to pass.

  • Will we have to write our own version of the cyber standard library in C/C++?

I don't think we will need to... it would be better if we write Cyber code to replace it (at least the non-cross-platform parts). In the meantime, we'll need to link with the static lib so that we can bootstrap (be able to run test asserts, etc.).

  • How will FFI be handled?

Currently FFI is done at runtime. For AOT, we'll need to offer a way to statically/dynamically link libraries too (emitting externs).
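As a rough illustration of "emitting externs", the generated C could declare the foreign symbol and leave resolution to the linker instead of looking it up at runtime. The wrapper name `main_cstrLen` is hypothetical, and libc's `strlen` is used only because it is always available to link against.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Emitted extern for a foreign function the script bound through FFI.
// The linker resolves it (here from libc) rather than a runtime lookup.
extern size_t strlen(const char* s);

// Hypothetical generated wrapper that the rest of the program calls.
int64_t main_cstrLen(const char* s) {
    assert(s != NULL);
    return (int64_t)strlen(s);
}
```

Whether the library ends up linked statically or dynamically would then be a build option rather than something the generated code needs to know about.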

  • Will we offer a C++ embed API b/c we will support C++ as a compile target?

I think this would just be a wrapper around the C API.

  • Should we keep the AOT backends limited to C/C++? Generating Rust, Zig, SPIR-V, or Odin would be pretty sweet.

Most systems languages offer similar semantics so it wouldn't be difficult to do (Rust would be harder). The hard part is making sure Cyber semantics can be accurately represented. After C/C++, I'd be more interested in LLVM or a Zig IR (if they become a competitor to LLVM). Cranelift is also an interesting choice being aligned with WASM. Of course, I wouldn't stop anyone from supporting more...

BTW, I've actually started on this because I wanted to see if it was possible and also because it's interesting... I will push the commits up once I've gotten a good feel for it.

fubark commented 6 months ago

This is roughly what the generated code looks like for a test case:

// Other headers..

i48 main_fib(Fiber* f, i48);
extern Value builtins_typesym(Fiber*, Value*, uint8_t);
extern Value test_eq(Fiber*, Value*, uint8_t);
// Other externs...

int main() {
    pm.syms = cy_syms;
    mainFiber.pm = &pm;
    Fiber* f = &mainFiber;
    i48 res = main_fib(f, 6);
    Value tmp1 = TRY_PANIC(test_eq(f, (Value[]){BOX_INT48(res), BOX_INT(8)}, 2));
    Value tmp2 = TRY_PANIC(test_eq(f, (Value[]){TRY_PANIC(builtins_typesym(f, (Value[]){BOX_INT48(res)}, 1)), BOX_SYM(10)}, 2));
    return 0;
}

i48 main_fib(Fiber* f, i48 n) {
    if (n < 2) {
        return n;
    }
    return main_fib(f, n - 1) + main_fib(f, n - 2);
}
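
The snippet above uses helpers such as `TRY_PANIC` and `BOX_INT48` whose definitions are not shown. The following is only a guess at how error-code-based handling of that shape could be written in portable C. The tag layout, `BOX_ERROR`, `IS_ERROR`, `unbox_int48`, `try_panic`, and `checked_div` are all hypothetical, and the real `Value` encoding may be entirely different.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef uint64_t Value;  // hypothetical boxed value: 16-bit tag, 48-bit payload
typedef int64_t i48;     // assumed: a 48-bit integer carried in a wider C type

#define TAG_SHIFT    48
#define TAG_INT      ((Value)1)
#define TAG_ERROR    ((Value)2)
#define PAYLOAD_MASK (((Value)1 << TAG_SHIFT) - 1)

#define BOX_INT48(n)    ((TAG_INT << TAG_SHIFT) | ((Value)(n) & PAYLOAD_MASK))
#define BOX_ERROR(code) ((TAG_ERROR << TAG_SHIFT) | ((Value)(code) & PAYLOAD_MASK))
#define IS_ERROR(v)     (((v) >> TAG_SHIFT) == TAG_ERROR)

// Recover the signed 48-bit payload by sign-extending bit 47.
static i48 unbox_int48(Value v) {
    uint64_t p = v & PAYLOAD_MASK;
    if (p & ((uint64_t)1 << 47)) {
        p |= ~PAYLOAD_MASK;
    }
    return (i48)p;  // relies on two's complement conversion
}

// Pass a successful result through; stop the program on an error value.
static Value try_panic(Value v) {
    if (IS_ERROR(v)) {
        fprintf(stderr, "panic: error code %llu\n",
                (unsigned long long)(v & PAYLOAD_MASK));
        exit(1);
    }
    return v;
}
#define TRY_PANIC(expr) try_panic(expr)

// Example of a fallible function: the error travels in the return value
// instead of unwinding the stack.
static Value checked_div(i48 a, i48 b) {
    if (b == 0) {
        return BOX_ERROR(1);
    }
    return BOX_INT48(a / b);
}
```

Wrapping the check in a function keeps `TRY_PANIC` usable as an expression, as in `Value tmp1 = TRY_PANIC(...)`, without relying on compiler extensions.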