EnzymeAD / oxide-enzyme

Enzyme integration into Rust. Experimental, do not use.
Apache License 2.0

Add documentation #6

Open ZuseZ4 opened 2 years ago

ZuseZ4 commented 2 years ago

To manage expectations, we should document what is expected to work already, what we will possibly fix, and what is probably not going to work until we have finished a better Enzyme integration. All issues should be solved in the next iteration. This overview should be moved into real documentation.

Not Working, unlikely to be fixed:

Likely to be fixed

Fixed

Workarounds:

strasdat commented 2 years ago

Using oxide-enzyme in a dependency, rather than in your current main project.

@ZuseZ4 - this does indeed sound like a significant limitation. Can you share a little background about this? What would it take to get this enabled? I'd assume some substantial changes to cargo, right?

ZuseZ4 commented 2 years ago

Sure @strasdat

So our main issue is that the cargo team discussed post-build.rs scripts, which would run after compilation. They rejected them because they wanted to keep cargo focused and not turn it into a full cmake alternative. It seemed very unlikely to me that they are going to reconsider that just for this project. So all we have are build.rs files, which run before compilation. That's unfortunate for us, since Enzyme requires llvm-bc or llvm-ir files, which are only generated towards the end of the compilation process. There is no official way to register a function that runs after that. There are a few solutions out there from people with related issues, but all have their drawbacks. The dependency limitation is the drawback of my solution. So the cargo enzyme command currently doesn't do much except call

RUSTFLAGS="--emit=llvm-bc" cargo +enzyme -Z build-std rustc --target x86_64-unknown-linux-gnu -- --emit=llvm-bc -g -C opt-level=3 -Zno-link 

followed by

RUSTFLAGS="--emit=llvm-bc" cargo +enzyme -Z build-std rustc --target x86_64-unknown-linux-gnu -- --emit=llvm-bc -g -C opt-level=3

Notice the -Zno-link in the first run. This is necessary, since Enzyme hasn't had a chance to create the functions yet. Omitting -Zno-link would result in a compilation failure, since cargo would be missing the definitions of those functions.
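To make the difference between the two runs concrete, here is a minimal Rust sketch of how a wrapper could assemble and run them. This is not the actual cargo enzyme source; the names Pass, cargo_args and run_both_passes are hypothetical, and only the flags themselves are taken from the two commands quoted above.

```rust
use std::io;
use std::process::Command;

/// Which of the two compilation runs is being prepared (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Pass {
    /// First run: emit bitcode but skip linking.
    EmitOnly,
    /// Second run: emit bitcode and link as usual.
    Link,
}

/// Builds the cargo argument list for one run, mirroring the quoted commands.
/// The only difference is the trailing `-Zno-link` on the first run.
pub fn cargo_args(pass: Pass) -> Vec<&'static str> {
    let mut args = vec![
        "+enzyme", "-Z", "build-std", "rustc",
        "--target", "x86_64-unknown-linux-gnu",
        "--", "--emit=llvm-bc", "-g", "-C", "opt-level=3",
    ];
    if pass == Pass::EmitOnly {
        args.push("-Zno-link");
    }
    args
}

/// Runs both passes in order and stops at the first failing one.
/// Returns Ok(true) only if both cargo invocations succeeded.
pub fn run_both_passes() -> io::Result<bool> {
    for pass in [Pass::EmitOnly, Pass::Link] {
        let status = Command::new("cargo")
            .args(cargo_args(pass))
            .env("RUSTFLAGS", "--emit=llvm-bc")
            .status()?;
        if !status.success() {
            return Ok(false);
        }
    }
    Ok(true)
}
```

Calling run_both_passes() assumes an "enzyme" toolchain is installed, as in the commands above; cargo_args can be inspected on its own without running anything.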

Inside my library I do some simple checks to see whether this is the first or the second compilation run. If it's the first run, I just return from my build script and let cargo do its compilation. If it is the second compilation run, I look for all *.bc files, run llvm-link on them, read the merged.bc file and run Enzyme on it. After some symbol magic I create an archive which contains just the functions generated by Enzyme. Afterwards I hand the compilation process back to cargo and just ask it to link the new archive.

When compiling your crate, cargo will first download and compile all of your dependencies. You can tell cargo to compile your dependencies with some extra flags. There is, however, no way to tell enzyme to compile some (or any) of your dependencies twice.
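The bitcode-collection part of the second run could look roughly like the following Rust sketch. It is an illustration under assumptions, not the code in oxide-enzyme: the helper names are hypothetical, and the first-run check shown here ("no *.bc files exist yet") is only one guess at what the "simple checks" might be. The Enzyme step, the symbol handling and the archive creation are left out.

```rust
use std::ffi::OsString;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Collects all `*.bc` files directly inside `dir`, sorted for determinism.
/// Note: if merged.bc is written into the same directory, a later call
/// would pick it up as well.
pub fn collect_bitcode(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_file() && path.extension().map_or(false, |ext| ext == "bc") {
            files.push(path);
        }
    }
    files.sort();
    Ok(files)
}

/// Hypothetical heuristic (an assumption, not the real check):
/// treat the build as the first run while no bitcode has been emitted.
pub fn is_first_pass(bitcode: &[PathBuf]) -> bool {
    bitcode.is_empty()
}

/// Builds the arguments for `llvm-link <inputs...> -o <output>`.
pub fn llvm_link_args(inputs: &[PathBuf], output: &Path) -> Vec<OsString> {
    let mut args: Vec<OsString> = inputs
        .iter()
        .map(|p| p.clone().into_os_string())
        .collect();
    args.push(OsString::from("-o"));
    args.push(output.as_os_str().to_os_string());
    args
}
```

A build script would return early when is_first_pass is true, and otherwise pass llvm_link_args to std::process::Command::new("llvm-link") before handing merged.bc to Enzyme.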

Some alternatives we considered (and dropped) include:

1) Calling cargo inside our build script to manually take over the compilation of dependencies. Cargo places a lock, so you can't run cargo in the same location because it would deadlock. Don't ask me how I know.
2) Automatically copying dependencies that use Enzyme into a tmp dir, spawning a cargo process to compile them there (twice), and moving the relevant artifacts back, hoping that the main cargo process will pick them up.
3) Calling rustc from our build script (because it doesn't create a lock, it won't interfere with cargo) and, along the way, re-implementing cargo to resolve dependency chains and such things.

If you happen to know a better alternative to this setup, I'd be happy to switch. I should however note that the issues around the C ABI are at least similarly severe (in my opinion), and we also use incomplete debug output (the -g flag) to estimate what the memory layout of Rust types looks like. With the current implementation we have no clear path to solving any of these three issues, so we are currently working on a pre-RFC to discuss an alternative implementation.