EnzymeAD / rust

A rust fork to work towards Enzyme integration
https://www.rust-lang.org
Other
63 stars 9 forks source link

autodiff and rsmpi #137

Open jedbrown opened 1 month ago

jedbrown commented 1 month ago

This is based on this rsmpi example. After checking out that branch, please run

$ cargo mpirun -n 2 --example=dot_enzyme --release
[...]
source_id: DefId(0:13 ~ dot_enzyme[209b]::dot_parallel)
num_fnc_args: 5
input_activity.len(): 5
error: <unknown>:0:0: in function preprocess__ZN10dot_enzyme12dot_parallel17h48b7e665abd76ccfE double (ptr, ptr, i64, ptr, i64): Enzyme: cannot compute with global variable that doesn't have marked shadow global
@RSMPI_DOUBLE = external local_unnamed_addr global ptr

Note that RSMPI_DOUBLE is defined in mpi-sys/src/rsmpi.c as

const MPI_Datatype RSMPI_DOUBLE = MPI_DOUBLE;

so that the value can be used by Rust FFI bindings.

Debug profile

I have a different error with the debug profile.

$ cargo mpirun -n 2 --example=dot_enzyme
[...]
  %23 = call i32 @MPI_Allreduce(ptr %10, ptr %12, i32 %14, ptr %17, ptr %19, ptr %21) #95, !dbg !302974
  br label %25, !dbg !302975

24:                                               ; No predecessors!

25:                                               ; preds = %7
}

 call:   %14 = call i32 @MPI_Allreduce(ptr %7, ptr %8, i32 %9, ptr %11, ptr %12, ptr %13) #95, !dbg !2527
 unhandled mpi_allreduce op:   %12 = call ptr @"_ZN68_$LT$mpi..collective..SystemOperation$u20$as$u20$mpi..raw..AsRaw$GT$6as_raw17h490677fc1f8d0a6fE"(ptr align 8 %6) #95, !dbg !2525

Meta

rustc --version --verbose:

rustc 1.77.0-nightly (63fa87211 2024-07-16)
binary: rustc
commit-hash: 63fa87211c25d89e585d76e2a15724a600eff903
commit-date: 2024-07-16
host: x86_64-unknown-linux-gnu
release: 1.77.0-nightly
LLVM version: 17.0.6
ZuseZ4 commented 1 month ago

Is this something Enzyme core should understand, or what would be the fix here?

wsmoses commented 1 month ago

Yeah I think we should be able to just mark Enzyme internally as understanding this as equivalent to MPI double everywhere.

Is there a spec sheet for the ABI rust uses for MPI somewhere (probably @jedbrown)?

wsmoses commented 1 month ago

Ah reading Jed's original comment slightly more closely, I suppose the defns are all in https://github.com/rsmpi/rsmpi/blob/1cee1f9d8ad06b8085b98463b82a25494d98a513/mpi-sys/src/rsmpi.c#L3.

Yeah I think the resolution would be finding all uses of say "ompi_mpi_double" in the Enzyme code base and also adding RSMPI_DOUBLE and the like.

This particular issue will be resolved by adding it to https://github.com/EnzymeAD/Enzyme/blob/2f34f0b0d740980a0997a4871d8a8495cfed37c3/enzyme/Enzyme/ActivityAnalysis.cpp#L112 [or alternatively we could also give rust a similar attribute for inactive globals like in C++, but if this is the standard way of MPI in rust, we might as well handle this internally in Enzyme proper]

Also helpful to add to: https://github.com/EnzymeAD/Enzyme/blob/2f34f0b0d740980a0997a4871d8a8495cfed37c3/enzyme/Enzyme/TypeAnalysis/TypeAnalysis.cpp#L4783 https://github.com/EnzymeAD/Enzyme/blob/2f34f0b0d740980a0997a4871d8a8495cfed37c3/enzyme/Enzyme/AdjointGenerator.h#L148 https://github.com/EnzymeAD/Enzyme/blob/2f34f0b0d740980a0997a4871d8a8495cfed37c3/enzyme/Enzyme/Utils.cpp#L2346 https://github.com/EnzymeAD/Enzyme/blob/2f34f0b0d740980a0997a4871d8a8495cfed37c3/enzyme/Enzyme/MLIR/Analysis/ActivityAnalysis.cpp#L36

jedbrown commented 1 month ago

The symbols like RSMPI_DOUBLE exist because MPI_DOUBLE is a macro so we can't otherwise use it from Rust. With a user-defined operation, it would not map to a static value. I suppose Enzyme may not support user-defined MPI_Ops, but if that's desired in the roadmap, it would be necessary to intercept/augment MPI_Op_create and keep a run-time table to the derivative of the user's operation. When I pass the op in as Const to the differentiated function, I'm still getting the error pointing back to RSMPI_DOUBLE, which I don't really understand except for inlining.

wsmoses commented 1 month ago

The double issue is that Enzyme is being conservative and thinking the global variable itself is active, somehwere. Rather than risk wrong answers, throwing an error.

You can tell Enzyme to assume all globals are inactive: https://enzyme.mit.edu/getting_started/UsingEnzyme/#assume-inactivity-of-unmarked-globals but we should just mark this in the places listed above.

ZuseZ4 commented 1 month ago

bug2.ll.txt Unfortunately the reduction operator is passed in as pointer. Outside of the function being differentiated, this pointer got created by

%210 = load ptr, ptr @RSMPI_SUM, align 8, !noundef !801
%236 = invoke noundef double @_ZN10dot_enzyme12dot_parallel17h7dfcd86d9e8c176bE(ptr noalias noundef nonnull readonly align 8 dereferenceable(16) %25, ptr noalias noundef nonnull readonly align 8 %218, i64 noundef %219, ptr noalias noundef nonnull readonly align 8 %227, i64 noundef %228, ptr noundef %210)

The issue here is that we can't always see this when differentiating Enzyme, proof:

  307207 declare double @__enzyme_autodiff(...)
       1 
       2 define double @enzyme_opt_helper_0(ptr %0, ptr %1, i64 %2, ptr %3, i64 %4, ptr %5) {
       3   %7 = call double (...) @__enzyme_autodiff(ptr @_ZN10dot_enzyme12dot_parallel17h7dfcd86d9e8c176bE, metadata !"enzyme_const", ptr %0, metadata !"enzyme_dup", ptr %1, ptr          %1, metadata !"enzyme_const", i64 %2, metadata !"enzyme_dup", ptr %3, ptr %3, metadata !"enzyme_const", i64 %4, metadata !"enzyme_const", ptr %5)
       4   ret double %7
       5 }

       ; Function Attrs: noinline nonlazybind sanitize_hwaddress uwtable
  8117   define internal noundef "enzyme_type"="{[-1]:Float@double}" double @_ZN10dot_enzyme12dot_parallel17h7dfcd86d9e8c176bE(ptr noalias nocapture noundef readonly align 8 dere         ferenceable(16) "enzyme_type"="{[-1]:Pointer}" %0, ptr noalias nocapture noundef nonnull readonly align 8 "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" %1, i64 no         undef "enzyme_type"="{[-1]:Integer}" %2, ptr noalias nocapture noundef nonnull readonly align 8 "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" %3, i64 noundef "enz         yme_type"="{[-1]:Integer}" %4, ptr noundef "enzyme_type"="{[0]:Pointer}" %5) unnamed_addr #0 personality ptr @rust_eh_personality {
  ;; some lines
  %23 = call noundef i32 @MPI_Allreduce(ptr noundef nonnull %8, ptr noundef nonnull %7, i32 noundef 1, ptr noundef %10, ptr noundef %5, ptr noundef %22), !noalias !990
  ;;;

which gives the following error

error: <unknown>:0:0: in function preprocess__ZN10dot_enzyme12dot_parallel17h7dfcd86d9e8c176bE double (ptr, ptr, i64, ptr, i64, ptr): Enzyme:  call:   %23 = call noundef i32 @MPI_Allreduce(ptr noundef nonnull %8, ptr noundef nonnull %7, i32 noundef 1, ptr noundef %10, ptr noundef %5, ptr noundef %22) #38, !noalias !65
 unhandled mpi_allreduce op: ptr %5

Not sure how we can fix that in general? However, if we ignore my artificial autodiff testcase and focus on the real world one where we read from MPI_SUM, that reminds me of reading from an enzyme_const global, which enzyme already recognizes, so it should be doable to copy that in case that a load from a recognized ptr name dominates the call sites?

wsmoses commented 1 month ago

You may be able to do something for MPI reductions which is similar to how we handle generic indirect function calls. Specifically we define the shadow of a function to be the corresponding data [e.g. a pointer to augmented forward and reverse functions], and then if it is an indirect function use the shadow for the relevant derivative code.

This also would enable generic user defined ops which would be nice.

Alternatively you could look at ways to inline/pass the info from caller to callee like we do already for aliasing/etc

ZuseZ4 commented 1 month ago

The first one is too far in the Enzyme core weeds, I have no clue how to write that code, so I'll leave it to someone else.

If you have a link for the last option on where we do that already I might be able to copy that over for this case:

Alternatively you could look at ways to inline/pass the info from caller to callee like we do already for aliasing/etc

That should be good enough for this specific case.