FractalFir / rustc_codegen_clr

This Rust compiler backend (module) emits valid CIL (.NET IR), enabling you to use Rust in .NET projects.
MIT License
1.39k stars, 30 forks

Support for SelfContainedDeployment [SCD] for executable projects #29

Open admalledd opened 8 months ago

admalledd commented 8 months ago

As part of the back-and-forth on a recent reddit thread, where you got cargo integration (sort of) working by wrapping the real assembly in a shell script that calls dotnet run %foo%.exe, I had pondered a future state where, if the Rust project is an executable, you by default run the Self Contained Deployment MSBuild tasks/target(s) to weave the final assembly into an actual ELF or PE.

I am not entirely sure at what step this would take place, nor, off-hand, what exact magical incantations are needed to re-weave an existing CLR assembly into, say, an ELF via MSBuild (or another dotnet API), but this would solve a few of the issues with integrating this project into normal cargo flows. Note that lib, dylib, etc. projects could remain plain CLR assemblies; this would only apply to anything executable (or testable, etc.).

I intended to poke this idea with a stick in a few months, but thought I should at least write it down in case you or someone else gets inspired to try it out. I may be able to provide guidance here and there; I am far more familiar with the CLR/MSBuild side of the tooling world than with rustc/cargo.

admalledd commented 8 months ago

Darn it, this was itching my brain too much: I have had to deal with the key MSBuild/CLR function to abuse/learn from before. It is basically the starting point for "you have an assembly from ILASM, now you want an executable, be it self-contained, R2R, or framework-dependent; here is where it is woven into a template". So I dug it back up, and it is roughly here: https://github.com/dotnet/runtime/blob/main/src/installer/managed/Microsoft.NET.HostModel/AppHost/HostWriter.cs#L36

In theory, if you can find the AppHost template for the self-contained runtime, you should be able to find-and-replace the magic string with the desired raw assembly binary content, re-copy all the resources, set the execute bits, etc., as that function call does, and there you go. In theory.
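
The find-and-replace step can be sketched in a few lines of Python. This is only an illustration of the general technique of patching a fixed-size slot in a binary template, not the HostWriter implementation: the placeholder bytes, the fake template, and the function name are all hypothetical. What goes into the slot is also left open; the linked HostWriter code appears to write the app binary's relative path there, which is what the example does.

```python
def patch_placeholder(template: bytes, placeholder: bytes, value: bytes) -> bytes:
    """Return a copy of `template` with `placeholder` overwritten by `value`.

    The value is padded with NUL bytes to the placeholder's length, so the
    file size and every other offset in the binary stay unchanged.
    """
    if len(value) > len(placeholder):
        raise ValueError("value does not fit in the placeholder slot")
    offset = template.find(placeholder)
    if offset < 0:
        raise ValueError("placeholder not found; is this really a host template?")
    padded = value + b"\x00" * (len(placeholder) - len(value))
    return template[:offset] + padded + template[offset + len(placeholder):]


# Hypothetical stand-ins: a real template is a native ELF/PE host binary,
# and the real placeholder is whatever magic string that host embeds.
PLACEHOLDER = b"<<APP-PATH-PLACEHOLDER>>"
fake_template = b"\x7fELF...header..." + PLACEHOLDER + b"...rest of host..."
patched = patch_placeholder(fake_template, PLACEHOLDER, b"foo_app.dll")
```

Padding instead of splicing matters because the host is already linked: changing its length would shift everything after the slot. Copying resources and setting the execute bit, which the real function also handles, are not covered here.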

Is it a goal to not require an installed CLR/dotnet SDK? Or can we depend on a valid (modern, i.e. .NET 6+) SDK on the developer's machine, CI, build agents, etc.? If we can depend on an existing dotnet SDK, whipping up deep calls into either the HostModel tools or, preferably, custom MSBuild target(s) to do the final weaving may not be too hard. It would also give us all the ReadyToRun/AOT tooling (even though that is quite silly for Rust!).

FractalFir commented 8 months ago

The project currently expects at least some parts of the .NET SDK (ilasm) to be present, but in the future, I would like to not have any external dependencies. Still, support for SCD will probably be disabled by default anyway (since it makes build times longer), so depending on a .NET SDK for SCD should not be an issue.

SCD could be done by the project's "linker". Besides joining intermediate files, it is also responsible for assembling the final executable using ilasm, so it could invoke MSBuild too. This way, SCD could be enabled by passing in a linker flag (something like --deploy-self-contained).
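
As a rough illustration of how such an opt-in flag could gate an extra step, here is a small Python sketch. It is not the project's linker (which is written in Rust); the argument layout, the step names, and both function names are hypothetical, and only the --deploy-self-contained spelling comes from the suggestion above.

```python
import argparse


def parse_linker_args(argv):
    """Parse a hypothetical linker command line."""
    parser = argparse.ArgumentParser(prog="linker-sketch")
    parser.add_argument("inputs", nargs="+", help="intermediate files to join")
    parser.add_argument("-o", "--output", required=True)
    # Off by default, since SCD is expected to lengthen build times.
    parser.add_argument("--deploy-self-contained", action="store_true")
    return parser.parse_args(argv)


def plan_steps(args):
    """List the steps the linker would run, in order."""
    steps = ["join intermediate files", "assemble with ilasm"]
    if args.deploy_self_contained:
        steps.append("produce self-contained executable")
    return steps


default_plan = plan_steps(parse_linker_args(["a.o", "-o", "foo_app"]))
scd_plan = plan_steps(
    parse_linker_args(["--deploy-self-contained", "a.o", "b.o", "-o", "foo_app"])
)
```

Keeping the SCD step last means the plain assembly is always produced first, so a failure in deployment does not cost you the regular build output.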

AOT could be beneficial. Currently, the project emits only pure-IL assemblies. This could be changed in the far future, but until then, enabling AOT could yield some serious performance improvements. AOT could also improve testing: due to the lazy nature of the JIT (methods are compiled as needed), a method with invalid CIL can slip through all the tests if it is never invoked (ilverify is out of the question too, since it can't check assemblies for correctness without also verifying their safety). AOT will compile all the methods, ensuring no invalid one is present in the final executable.
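
The testing gap can be shown with a toy model in Python. This is not how the CLR works internally; `ToyRuntime` and its methods are hypothetical and only mimic the difference between compiling on first call and compiling everything up front.

```python
class ToyRuntime:
    """Toy stand-in for a runtime: a method 'compiles' only if its body is valid."""

    def __init__(self, methods):
        self.methods = methods  # name -> (is_valid, return_value)
        self.compiled = {}

    def _compile(self, name):
        is_valid, result = self.methods[name]
        if not is_valid:
            raise RuntimeError(f"invalid body in method {name!r}")
        self.compiled[name] = result

    def call(self, name):
        # Lazy, JIT-like: a method is compiled only when first called.
        if name not in self.compiled:
            self._compile(name)
        return self.compiled[name]

    def compile_all(self):
        # Eager, AOT-like: every method is compiled, called or not.
        for name in self.methods:
            self._compile(name)


methods = {"add": (True, 3), "never_called": (False, None)}
runtime = ToyRuntime(methods)

# A "test suite" that only exercises `add` passes under lazy compilation.
lazy_result = runtime.call("add")

# Eager compilation visits `never_called` too and reports the bad body.
try:
    runtime.compile_all()
    eager_error = None
except RuntimeError as err:
    eager_error = str(err)
```

The lazy path returns normally even though a broken method ships in the same "assembly"; only the eager pass surfaces it.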

admalledd commented 8 months ago

On requiring the DotNet SDK: that is fair and makes sense, though wouldn't that mean a user could compile but then not run the code? Hrm, that would be more akin to cross-compilation/wasm/etc., so maybe that is fair. I was mostly thinking that relying on the SDK could mean far less custom code on your side, especially for linking/packaging/weaving/etc. the final assembly. Those steps, while not easy to do outside MSBuild targets, are certainly doable.

On SCD: My thought was to consider always doing so (for executable crates/targets), and having both foo_app and foo_app.exe exist. Cargo and similar tooling can run the SCD foo_app as they expect, and foo_app.exe still exists as the normal raw assembly. My understanding is that cargo generally expects an executable crate to be directly executable (hence your environment tricks) for cargo run foo_app scenarios. Cargo does more-or-less static linking by default, and we could do the same, removing the need for those environment variable tricks and the script that wraps dotnet run foo_app.exe. Or do you have a better/different longer-term plan on the rust/cargo side?

A challenge with any of these paths, though, is that we either have to auto-install/extract the few specific SDK tools (all the AppHost bundles to weave into, CrossGen2 and AoT tools, etc.) or simply require that a compatible SDK be installed.

On AoT/R2R: oh, that is a thought and a trick: using it to force the CLR to fully parse every single thing in your CIL and (kinda) validate it, at least far more than ilverify truly does. That might be an easier starting point for someone interested in the depths of manually writing an MSBuild+CrossGen+R2R target file to load and execute the required tasks. AoT is sadly (nearly) impossible to call/execute without the help of the MSBuild tasks. I can gather a few notes on that in the coming week or two. Should I dump those here or open a different issue for tracking this abuse of AoT? I approve of all this fun :)

kant2002 commented 8 months ago

Why do you think AOT is nearly impossible without MSBuild tasks? Currently it is mostly a matter of gathering the required dependencies and passing them to ILC, which does the magic. If you ever want to, you can call it yourself. If you take a look at the AOT targets in the SDK, there is no task for ILC itself, only for finding runtime packages, but that is required for all packaging options anyway.

My point is that AOT is approximately the same complexity as the other targets, provided the codegen is appropriate.

admalledd commented 8 months ago

AOT: Not impossible, just exceedingly annoying since in theory you will also likely want to pass it to R2R first, to build out all the arguments/parameters as required. My understanding is that the arguments are not stable SDK version to SDK version for either, and the only officially-unofficially not-supported but recommended-if-you-have-to method is building the arguments lists via MSBuild tasks+targets.

hez2010 commented 7 months ago

> My understanding is that the arguments are not stable SDK version to SDK version for either, and the only officially-unofficially not-supported but recommended-if-you-have-to method is building the arguments lists via MSBuild tasks+targets

The arguments are stable. We have been using R2R and NativeAOT on godbolt (Compiler Explorer) to generate binaries for disassembly, and it has never broken since the day we introduced it (.NET 6). And we are passing arguments to the compiler directly, without using MSBuild. All you need to do is call ilc --help (NativeAOT) or crossgen2 --help (R2R) to check which arguments are available, and then you can use them. Basically, you only need to pass the path of your entrypoint assembly and all the assemblies you reference; it will then produce obj files that you can link into an executable. You can save time by passing all the assemblies in the SDK (runtime package) as references, so you don't need to figure out which ones are needed yourself, because the AOT toolchain will only generate code for assemblies that are actually referenced in code.
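
A minimal Python sketch of that approach: build the argument list from the entrypoint assembly plus every DLL in a runtime-pack directory. The function name and the file names are hypothetical, and the -r (reference) and -o (output) flags are assumed here; confirm them with crossgen2 --help or ilc --help for your SDK before relying on them.

```python
import tempfile
from pathlib import Path


def build_aot_command(tool, entry_assembly, reference_dir, output):
    """Build an argument list: entry assembly, one -r per reference DLL, then -o."""
    # Pass every DLL in the directory; unreferenced ones are simply not compiled.
    references = sorted(Path(reference_dir).glob("*.dll"))
    command = [str(tool), str(entry_assembly)]
    for dll in references:
        command += ["-r", str(dll)]  # assumed flag name, check --help
    command += ["-o", str(output)]   # assumed flag name, check --help
    return command


# Demo against a throwaway directory standing in for a runtime pack.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("System.Runtime.dll", "System.Console.dll", "notes.txt"):
        (Path(tmp) / name).touch()
    demo = build_aot_command("crossgen2", "foo_app.dll", tmp, "foo_app.r2r.dll")
```

The resulting list can be handed to `subprocess.run(demo, check=True)` once the tool path is real. For NativeAOT, the output is an object file that still has to go through a native linker, as described above.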