EnzymeAD / Enzyme

High-performance automatic differentiation of LLVM and MLIR.
https://enzyme.mit.edu
Other
1.29k stars 109 forks source link

Failing to erase fictitious PHI nodes #2069

Open dime10 opened 2 months ago

dime10 commented 2 months ago

Context: I'm currently investigating some issues differentiating MLIR-generated LLVMIR with Enzyme. Our pipeline roughly consists of *Dialects -> LLVMDialect -> LLVMIR -> O2 -> Enzyme. While investigating, I decided to disable inlining of all function definitions (using the noinline attribute) during O2 in order to better trace what is happening where. However, this resulted in a new compilation error from Enzyme.

Issue:

 pp:   %_replacementA = phi ptr  of   %3 = tail call ptr @_mlir_memref_to_llvm_alloc() #0
Assertion failed: (pp->getNumUses() == 0), function eraseFictiousPHIs, file GradientUtils.cpp, line 8242.

I believe this error stems from Enzyme replacing allocation calls in the example IR with placeholder PHI instructions, but failing to delete them afterwards because of remaining uses of instruction result. I don't have a good understanding of the surrounding mechanism in Enzyme however.

Reproducer: I've used the llvm-reduce tool to generate a minimal example of the failure (ran both before and after the O2 pass), so it's possible the IR doesn't make much "sense" anymore, but hopefully it still demonstrates the problem in the same way. The original was much longer but I can provide that too if necessary.

@enzyme_dupnoneed = external constant i8

declare void @__enzyme_autodiff0(...)

declare ptr @_mlir_memref_to_llvm_alloc()

define void @my_model.cloned(ptr %0, ptr %1, i64 %2, i64 %3, i64 %4, i64 %5, i64 %6, ptr %7, ptr %8, i64 %9, i64 %10, i64 %11, i64 %12, i64 %13, ptr %14, ptr %15, i64 %16, ptr %17, ptr %18, i64 %19, i64 %20, i64 %21) {
  %23 = call { ptr, ptr, i64, [1 x i64], [1 x i64] } @_take(ptr null, i1 false)
  store i32 0, ptr %0, align 4
  ret void
}

define { ptr, ptr, i64, [3 x i64], [3 x i64] } @my_model.fullgrad1() {
  tail call void (...) @__enzyme_autodiff0(ptr nonnull @my_model.cloned, ptr null, ptr null, ptr null, ptr null, i64 0, i64 0, i64 0, i64 0, i64 0, ptr null, ptr null, ptr null, ptr null, i64 0, i64 0, i64 0, i64 0, i64 0, ptr null, ptr null, ptr null, ptr null, i64 0, ptr null, ptr null, ptr nonnull @enzyme_dupnoneed, ptr null, ptr null, i64 0, i64 0, i64 0)
  ret { ptr, ptr, i64, [3 x i64], [3 x i64] } zeroinitializer
}

define { ptr, ptr, i64, [1 x i64], [1 x i64] } @_take(ptr nocapture %0, i1 %1) {
  %3 = tail call ptr @_mlir_memref_to_llvm_alloc()
  %4 = tail call ptr @_mlir_memref_to_llvm_alloc()
  br i1 %1, label %.lr.ph, label %.lr.ph1.peel.next

.lr.ph1.peel.next:                                ; preds = %2
  %5 = ptrtoint ptr %4 to i64
  %6 = or i64 %5, 1
  %7 = inttoptr i64 %6 to ptr
  %8 = load double, ptr %7, align 8
  store double %8, ptr %0, align 8
  %.pre = load double, ptr %4, align 8
  store double %.pre, ptr %0, align 8
  ret { ptr, ptr, i64, [1 x i64], [1 x i64] } zeroinitializer

.lr.ph:                                           ; preds = %.lr.ph, %2
  %9 = load double, ptr %3, align 4
  store double %9, ptr %4, align 8
  br label %.lr.ph
}

Versions: llvm=3a8316216807d64a586b971f51695e23883331f7 enzyme=v0.0.130

wsmoses commented 1 month ago

Reducing a bit https://fwd.gymni.ch/tviqGy