Generate Typed Erlang/Elixir/Gleam Definitions from Rust

SichangHe commented 6 days ago

So, what is preventing this?

It seems that we do have the source of truth on the Rust side. nif, NifStruct, and other Rustler macros have access to the typed raw arguments/fields; they just do not store that information in the Nif struct yet. If we modify the macros to store that information, we would have it all in rustler::codegen_runtime::inventory::iter::<rustler::Nif>() when calling init!, right?
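To make the idea concrete, here is a minimal sketch of what "storing that information" could look like. Everything in it is hypothetical: `NifInfo`, its fields, and the static `NIFS` table are stand-ins written in plain Rust, not Rustler's actual `Nif` struct or its inventory mechanism.

```rust
/// Hypothetical descriptor: a NIF entry that also keeps the argument and
/// return types the macro saw, recorded as plain strings.
#[derive(Debug, Clone, PartialEq)]
pub struct NifInfo {
    pub name: &'static str,
    pub arity: usize,
    pub arg_types: &'static [&'static str],
    pub return_type: &'static str,
}

/// Stand-in for the collected registry. In this sketch the entries are
/// written by hand; the proposal is that the macros would emit them.
pub static NIFS: &[NifInfo] = &[
    NifInfo { name: "add", arity: 2, arg_types: &["i64", "i64"], return_type: "i64" },
    NifInfo { name: "greet", arity: 1, arg_types: &["String"], return_type: "String" },
];

/// Render one entry as a human-readable signature line.
pub fn describe(nif: &NifInfo) -> String {
    format!(
        "{}/{}: ({}) -> {}",
        nif.name,
        nif.arity,
        nif.arg_types.join(", "),
        nif.return_type
    )
}

/// Walk the whole registry, as a generator would at run time.
pub fn describe_all() -> Vec<String> {
    NIFS.iter().map(describe).collect()
}
```

Calling `describe_all()` walks the table and yields one line per NIF; a generator could emit BEAM-side stubs from the same data.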

Code generation on the BEAM side would be more complicated, though. A few problems I can think of:

Where do we put the generated file?
Where to load the dylib from?
How to inject user configuration?
How to convert Rust types to BEAM types or TypeSpecs?
How to handle external types?
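For the type-conversion question, one possible shape is a trait that lets each Rust type name its BEAM-side counterpart. This is only a sketch: the `BeamType` trait and `spec_line` helper are made up for illustration, and the chosen mappings (for example `String` to `binary()` and `None` to the atom `nil`) are assumptions about how the values end up encoded.

```rust
/// Hypothetical trait: each Rust type names the Erlang type it maps to.
pub trait BeamType {
    fn typespec() -> String;
}

impl BeamType for i64 {
    fn typespec() -> String { "integer()".to_string() }
}
impl BeamType for f64 {
    fn typespec() -> String { "float()".to_string() }
}
impl BeamType for bool {
    fn typespec() -> String { "boolean()".to_string() }
}
impl BeamType for String {
    // Assumption: strings cross the boundary as binaries.
    fn typespec() -> String { "binary()".to_string() }
}

/// Composite types build their spec from their element types.
impl<T: BeamType> BeamType for Vec<T> {
    fn typespec() -> String { format!("[{}]", T::typespec()) }
}
impl<T: BeamType> BeamType for Option<T> {
    // Assumption: `None` is represented by the atom `nil`.
    fn typespec() -> String { format!("{} | nil", T::typespec()) }
}
impl<A: BeamType, B: BeamType> BeamType for (A, B) {
    fn typespec() -> String { format!("{{{}, {}}}", A::typespec(), B::typespec()) }
}

/// Build an Erlang -spec line for a one-argument function.
pub fn spec_line<A: BeamType, R: BeamType>(name: &str) -> String {
    format!("-spec {}({}) -> {}.", name, A::typespec(), R::typespec())
}
```

With this, `spec_line::<Vec<f64>, f64>("sum")` produces `-spec sum([float()]) -> float().`. External types remain an open problem: a type from another crate would need its own `BeamType` impl or a manual override.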

In #85, there seemed to be interest from @tsloughter and @lpil, but the rebar3_run and rebar3_cargo mentioned there seem to have gone stale. I guess it would be much better to just integrate most of the code generation into Rustler itself.

filmor commented 6 days ago

These are really separate subjects. Check out https://github.com/rusterlium/rustler/pull/614, that's the avenue I'm currently exploring. The best option to make the signatures survive is probably to add another separate exported C function at compile time.

dvic commented 6 days ago

I was literally just now searching for a solution to this problem 😄

I found this project, which might be interesting: https://github.com/zefchain/serde-reflection/tree/main/serde-generate

filmor commented 6 days ago

One problem that we have that serde doesn't have to deal with is that NIF libraries are meant to be loaded within the context of the BEAM and thus need to have all of the enif_* symbols defined. We thus can't just build a binary from the NIF library and have it "just work". What we /can/ do is fake all of the symbols so that linking goes through and then call a selected subset of functions to inspect the library. This is what I am doing now with the linked PR. It works for all NIF libraries (not just Rustler-generated ones).

Generating suitable signatures is a bit more tricky. We can extend the Encoder and Decoder traits to provide type signature information. A simpler way for now would be to just add an attribute to define the signature and expose it on a new nif_signatures function that we then try to load in the generator.
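As a rough illustration of the "extra exported function" idea, the sketch below returns a static, NUL-terminated signature table through a C-ABI function and reads it back. The text format of the table is invented, and the function here is called directly rather than looked up in a loaded library. In a real cdylib the function would also need an unmangled symbol name (`#[no_mangle]`, or `#[unsafe(no_mangle)]` on edition 2024), which is left out so the snippet compiles on any edition.

```rust
use std::ffi::{c_char, CStr};

/// NUL-terminated signature table. The line format is made up for this
/// sketch; in the proposal, an attribute on each NIF would supply it.
static SIGNATURES: &[u8] =
    b"add/2 :: (integer(), integer()) -> integer()\ngreet/1 :: (binary()) -> binary()\n\0";

/// Hypothetical extra entry point next to the usual NIF init function.
/// A real export would additionally carry a no_mangle attribute.
pub extern "C" fn nif_signatures() -> *const c_char {
    SIGNATURES.as_ptr().cast::<c_char>()
}

/// What a generator might do once it has located the symbol: read the
/// C string and split it into one signature per line.
pub fn read_signatures() -> Vec<String> {
    let ptr = nif_signatures();
    // SAFETY: `ptr` points at `SIGNATURES`, which is static and ends in NUL.
    let text = unsafe { CStr::from_ptr(ptr) }.to_string_lossy().into_owned();
    text.lines().map(str::to_owned).collect()
}
```

Returning a plain C string keeps the contract language-neutral, which matters if the same entry point is meant to work for NIF libraries not generated by Rustler.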

@bjorng Would specifying such a function via an EEP be interesting? I guess extending ErlNifFunc is at the very least more cumbersome to do.

SichangHe commented 6 days ago

@filmor, thanks a lot for the quick response!

> One problem that we have that serde doesn't have to deal with is that NIF libraries are meant to be loaded within the context of the BEAM and thus need to have all of the enif_* symbols defined[…] This is what I am doing now with the linked PR. It works for all NIF libraries (not just Rustler-generated ones).

Sorry, I must admit that I am unfamiliar with this. So, you are saying you want to generate the BEAM-side definitions via inspecting the NIF dylibs, after compiling to them?

> Generating suitable signatures is a bit more tricky. We can extend the Encoder and Decoder traits to provide type signature information.

My understanding is that Rustler has that signature information during macro expansion, correct? Some thoughts I had for simpler ways to get that information include:

letting the init! macro produce some side effects, e.g. write the signatures to a JSON file; or
bundling the BEAM-side definition generation directly into init!, via file writing.

These ideas seem simpler to implement except for the problems I mentioned earlier, though I am not sure what larger problems there are with these approaches.

filmor commented 6 days ago

I'm sorry, but you can't simultaneously say that you aren't familiar with the details of NIF libraries and then claim you found a "simpler way". Of course I thought about just injecting data into the library file. But you cannot access this data without a PE/ELF/MachO binary reader (yes, Windows, Linux and macOS use different file formats), which I deem complete overkill for this exercise.

/edit: I also considered writing a file during build, but apart from build.rs shenanigans (and even for those I don't see a clear path) I don't think this is possible right now.

Also, no, Rustler doesn't (necessarily) have the full type information during macro expansion. The macro expansion (like in Elixir) runs on a "token stream", so just a little bit better than on bare text. We might be able to extract information at runtime(!) from this by extending Encoder and Decoder, which brings us back to having to load the library, which is what the tool in the referenced PR does.
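The token-stream limitation can be demonstrated with a small sketch. It uses a declarative `macro_rules!` macro as a stand-in for a proc macro (an assumption for brevity; both see tokens before type resolution) and a made-up alias `UserId`. The macro only sees the spelling of the type, while `std::any::type_name` reports what the compiler resolved it to.

```rust
use std::any::type_name;

/// A user-defined alias; a macro cannot see through it.
pub type UserId = i64;

/// Expands to a pair: the type as the macro sees it (raw tokens) and the
/// type as the compiler resolves it (available only after expansion).
macro_rules! type_views {
    ($ty:ty) => {
        (stringify!($ty), type_name::<$ty>())
    };
}

/// Returns (token view, resolved view) for the alias above.
pub fn views_of_alias() -> (&'static str, &'static str) {
    type_views!(UserId)
}
```

Here the first element is the literal spelling `UserId`, and the second names the underlying integer type. A generator working purely on tokens would have to emit a spec for an unknown `UserId`, which is why the resolved information has to come from code that runs after compilation.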

SichangHe commented 5 days ago

Okay, I was wrong. I apologize. We don't have rustler::codegen_runtime::inventory::iter::<rustler::Nif> at compile time, only at run time. Talking about writing to a file at compile time was silly, and no wonder you are trying to bake the functionality into the generated dylib instead.

And… yes, proc macro side effects are unintended and build scripts are the ones for them instead. So, the Rust-token-based approach would be to make another dedicated CLI tool to extract function signatures from the Rust source, which would involve parsing each Rust file using syn[^frb], reusing rustler_codegen to gather NIF information, and outputting the results. The problem, as you have mentioned, would be the shallow type information.

Then, it indeed seems to be less trouble to inject the data into the dylib and then load it back out with your helper CLI.

[^frb]: https://github.com/fzyzcjy/flutter_rust_bridge/ does code generation like that.

filmor commented 5 days ago

Just writing down some notes:

We should also generate -opaques for all resource types
Maybe we can build the type specs using NIF enums with some convention on how to handle composites, e.g. struct Elem { is_or, is_and, type, count, Elem[count] }
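One way to read these notes is as a small type-spec tree, where "or" nodes become unions and "and" nodes become tuples. The sketch below models that in plain Rust rather than as a C struct; the `Elem` variants, `render`, and `opaque_decl` are all hypothetical, and the `-opaque` helper assumes that resources appear as references on the BEAM side.

```rust
/// Hypothetical type-spec tree, loosely following the `Elem` idea above.
pub enum Elem {
    /// A built-in type such as `integer`, rendered as `integer()`.
    Leaf(&'static str),
    /// A literal atom such as `ok`.
    Atom(&'static str),
    /// "is_or": any one of the children.
    Union(Vec<Elem>),
    /// "is_and": all children together, rendered as a tuple.
    Tuple(Vec<Elem>),
}

/// Render a tree as Erlang typespec syntax.
pub fn render(elem: &Elem) -> String {
    match elem {
        Elem::Leaf(name) => format!("{}()", name),
        Elem::Atom(name) => name.to_string(),
        Elem::Union(items) => join(items, " | "),
        Elem::Tuple(items) => format!("{{{}}}", join(items, ", ")),
    }
}

/// Render every child and join the results with `sep`.
fn join(items: &[Elem], sep: &str) -> String {
    items.iter().map(render).collect::<Vec<_>>().join(sep)
}

/// Emit an -opaque declaration for a resource type.
/// Assumption: the resource is a reference from the BEAM's point of view.
pub fn opaque_decl(name: &str) -> String {
    format!("-opaque {}() :: reference().", name)
}
```

For example, a union of the tuple `{ok, integer()}` and the atom `error` renders as `{ok, integer()} | error`. A flat C layout with a `count` field and trailing children would carry the same information in a form that can cross the library boundary.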

SichangHe commented 5 days ago

> […] NIF libraries are meant to be loaded within the context of the BEAM and thus need to have all of the enif_* symbols defined. We thus can't just build a binary from the NIF library and have it "just work".

What if we can, @filmor? If we feature-gate rustler-sys, we can get a library that only has the NIF information but not the functions themselves; then we can manipulate the Nifs freely in Rust.

Also, would there be any advantages to manipulating the types on the BEAM instead of in Rust?