EnzymeAD / Enzyme

High-performance automatic differentiation of LLVM and MLIR.
https://enzyme.mit.edu

Bug in Enzyme gsl branch #1884

Closed davidedelvento closed 1 week ago

davidedelvento commented 2 months ago

@wsmoses, this is on the origin/gsl branch, which I am attempting to verify with the following test case.

Note: this is likely to be "operator error" in that I am using Enzyme incorrectly, but I am still reporting it as a "bug" in that it should refuse to compile rather than cause a crash at runtime.

$ cat try_gsl.c
#include <gsl/gsl_sf_legendre.h>
#include <stdio.h>
#include <stdlib.h>  /* malloc */

void __enzyme_autodiff(void*, ...);
extern int enzyme_const, enzyme_dup, enzyme_out;

void compute(int d, double* costheta, double* LegendreBuf) {
        gsl_sf_legendre_array_e( GSL_SF_LEGENDRE_SPHARM, d, *costheta, 1, LegendreBuf );
}

int main(int argc, char** argv) {
        int d = 20;
        int size = gsl_sf_legendre_array_n(d);
        double * LegendreBuf =(double *) malloc( size * sizeof(double) );
        double costheta = .7;
        compute(d, &costheta, LegendreBuf);
        for(int i=0; i<5; i++) {
                printf("%d: %f\n", i, LegendreBuf[i]);
        }
        printf("...\n");
        for(int i=size-5; i<size; i++) {
                printf("%d: %f\n", i, LegendreBuf[i]);
        }
        double *gradient = (double*)malloc(size * sizeof(double));
        for (int i=0; i<size; i++) {
                LegendreBuf[i] = 0;
                gradient[i] = 0;
        }
        gradient[d/2+1]=1;
        double gradtheta = 0;

        __enzyme_autodiff((void*)compute,
                          enzyme_const, d,
                          enzyme_dup, &costheta, &gradtheta,
                          enzyme_dup, LegendreBuf, gradient
                          );

        for(int i=0; i<5; i++) {
                printf("%d: %f %f\n", i, LegendreBuf[i], gradient[i]);
        }
        printf("...\n");
        for(int i=size-5; i<size; i++) {
                printf("%d: %f %f\n", i, LegendreBuf[i], gradient[i]);
        }
        printf("%f\n", gradtheta);

        return 0;
}

This follows the approach in https://github.com/EnzymeAD/Enzyme/issues/792#issuecomment-1224701354

To compile, I use the following Makefile:

ENZLLD   = /path/to/LLDEnzyme-18.so
ENZLLVM  =  /path/to/  /LLVMEnzyme-18.so
ENZCLANG =  /path/to/  /Enzyme/ClangEnzyme-18.so

try_gsl.exe: try_gsl.o
        clang -fuse-ld=lld try_gsl.o -o try_gsl.exe -lgsl -Wl,--load-pass-plugin=$(ENZLLD) #-g -Wl,--fplugin-arg-enzyme-enzyme-print

try_gsl.o: try_gsl.c
        clang -c try_gsl.c -g -flto -O3 -fpass-plugin=$(ENZCLANG) -Xclang -load -Xclang $(ENZCLANG) # -fplugin-arg-enzyme-enzyme-print-activity -fsanitize=address

debug: try_gsl.ll
        opt try_gsl.ll -load-pass-plugin=$(ENZLLVM) -passes=enzyme -enzyme-print -disable-output

try_gsl.ll: try_gsl.c
        clang try_gsl.c -S -emit-llvm -o try_gsl.ll -O2 -g

clean:
        -rm *.o
        -rm *.exe
        -rm *.ll

The test program prints all of its output and then segfaults on exit (hence exit code 139, despite the "return 0"):

$ make clean && make && ./try_gsl.exe
rm *.o
rm *.exe
rm *.ll
rm: cannot remove '*.ll': No such file or directory
make: [Makefile:22: clean] Error 1 (ignored)
clang -c try_gsl.c -g -flto -O3 -fpass-plugin=/home/davide/repositories/ENZYME/Enzyme-v0.0.103/enzyme/build-gsl/Enzyme/ClangEnzyme-18.so -Xclang -load -Xclang /home/davide/repositories/ENZYME/Enzyme-v0.0.103/enzyme/build-gsl/Enzyme/ClangEnzyme-18.so # -fplugin-arg-enzyme-enzyme-print-activity -fsanitize=address
clang: warning: -Wl,--disable-new-dtags: 'linker' input unused [-Wunused-command-line-argument]
clang: warning: -Wl,-rpath,/home/sw/spack/spack-v0.22-inpreparation-b/opt/spack/linux-rhel8-icelake/gcc-8.5.0/llvm-18.1.1-q5ml52qgpewslx2e26irjh4g26eimfed/lib: 'linker' input unused [-Wunused-command-line-argument]
clang: warning: -Wl,-rpath,/home/sw/spack/spack-v0.22-inpreparation/opt/spack/linux-rhel8-icelake/clang-17.0.6/gsl-2.7.1-b3wxnooxvgcrmypmvj6zkbgclv3gc2ml/lib: 'linker' input unused [-Wunused-command-line-argument]
clang: warning: -Wl,--as-needed: 'linker' input unused [-Wunused-command-line-argument]
clang: warning: argument unused during compilation: '-L/home/sw/spack/spack-v0.22-inpreparation-b/opt/spack/linux-rhel8-icelake/gcc-8.5.0/llvm-18.1.1-q5ml52qgpewslx2e26irjh4g26eimfed/lib' [-Wunused-command-line-argument]
clang: warning: argument unused during compilation: '-L/home/sw/spack/spack-v0.22-inpreparation/opt/spack/linux-rhel8-icelake/clang-17.0.6/gsl-2.7.1-b3wxnooxvgcrmypmvj6zkbgclv3gc2ml/lib' [-Wunused-command-line-argument]
clang -fuse-ld=lld try_gsl.o -o try_gsl.exe -lgsl -Wl,--load-pass-plugin=/home/davide/repositories/ENZYME/Enzyme-v0.0.103/enzyme/build-gsl/Enzyme/LLDEnzyme-18.so #-g -Wl,--fplugin-arg-enzyme-enzyme-print
0: 0.282095
1: 0.342022
2: 0.246732
3: 0.148234
4: 0.386197
...
268: 6.082763
269: 6.164414
270: 6.244998
271: 6.324555
272: 6.403124
0: 0.282095 0.000000
1: 0.342022 0.000000
2: 0.246732 0.000000
3: 0.148234 0.000000
4: 0.386197 0.000000
...
268: 6.082763 0.000000
269: 6.164414 0.000000
270: 6.244998 0.000000
271: 6.324555 0.000000
272: 6.403124 0.000000
2.323361
Segmentation fault (core dumped)

Removing the __enzyme_autodiff call makes the code work as expected (obviously without the wanted results :rofl: )

Running with the print option via make debug produces the following output, which I am not yet fluent enough in LLVM IR to diagnose.

$ make debug
clang try_gsl.c -S -emit-llvm -o try_gsl.ll -O2 -g
clang: warning: -Wl,--disable-new-dtags: 'linker' input unused [-Wunused-command-line-argument]
clang: warning: -Wl,-rpath,/home/sw/spack/spack-v0.22-inpreparation-b/opt/spack/linux-rhel8-icelake/gcc-8.5.0/llvm-18.1.1-q5ml52qgpewslx2e26irjh4g26eimfed/lib: 'linker' input unused [-Wunused-command-line-argument]
clang: warning: -Wl,-rpath,/home/sw/spack/spack-v0.22-inpreparation/opt/spack/linux-rhel8-icelake/clang-17.0.6/gsl-2.7.1-b3wxnooxvgcrmypmvj6zkbgclv3gc2ml/lib: 'linker' input unused [-Wunused-command-line-argument]
clang: warning: -Wl,--as-needed: 'linker' input unused [-Wunused-command-line-argument]
clang: warning: argument unused during compilation: '-L/home/sw/spack/spack-v0.22-inpreparation-b/opt/spack/linux-rhel8-icelake/gcc-8.5.0/llvm-18.1.1-q5ml52qgpewslx2e26irjh4g26eimfed/lib' [-Wunused-command-line-argument]
clang: warning: argument unused during compilation: '-L/home/sw/spack/spack-v0.22-inpreparation/opt/spack/linux-rhel8-icelake/clang-17.0.6/gsl-2.7.1-b3wxnooxvgcrmypmvj6zkbgclv3gc2ml/lib' [-Wunused-command-line-argument]
opt try_gsl.ll -load-pass-plugin=/home/davide/repositories/ENZYME/Enzyme-v0.0.103/enzyme/build-gsl/Enzyme/LLVMEnzyme-18.so -passes=enzyme -enzyme-print -disable-output
prefn:
; Function Attrs: nounwind uwtable
define dso_local void @compute(i32 noundef %0, ptr nocapture noundef readonly %1, ptr noundef %2) #0 !dbg !45 {
  tail call void @llvm.dbg.value(metadata i32 %0, metadata !50, metadata !DIExpression()), !dbg !53
  tail call void @llvm.dbg.value(metadata ptr %1, metadata !51, metadata !DIExpression()), !dbg !53
  tail call void @llvm.dbg.value(metadata ptr %2, metadata !52, metadata !DIExpression()), !dbg !53
  %4 = sext i32 %0 to i64, !dbg !54
  %5 = load double, ptr %1, align 8, !dbg !55, !tbaa !56
  %6 = tail call i32 @gsl_sf_legendre_array_e(i32 noundef 1, i64 noundef %4, double noundef %5, double noundef 1.000000e+00, ptr noundef %2) #8, !dbg !60
  ret void, !dbg !61
}

after simplification :
; Function Attrs: mustprogress nounwind willreturn uwtable
define dso_local void @preprocess_compute(i32 noundef %0, ptr nocapture noundef readonly %1, ptr noundef %2) #8 !dbg !183 {
  tail call void @llvm.dbg.value(metadata i32 %0, metadata !185, metadata !DIExpression()) #9, !dbg !188
  tail call void @llvm.dbg.value(metadata ptr %1, metadata !186, metadata !DIExpression()) #9, !dbg !188
  tail call void @llvm.dbg.value(metadata ptr %2, metadata !187, metadata !DIExpression()) #9, !dbg !188
  %4 = sext i32 %0 to i64, !dbg !189
  %5 = load double, ptr %1, align 8, !dbg !190, !tbaa !56
  %6 = tail call i32 @gsl_sf_legendre_array_e(i32 noundef 1, i64 noundef %4, double noundef %5, double noundef 1.000000e+00, ptr noundef %2) #10, !dbg !191
  ret void, !dbg !192
}

; Function Attrs: mustprogress nounwind willreturn memory(readwrite) uwtable
define internal void @diffecompute(i32 noundef %0, ptr nocapture noundef readonly %1, ptr nocapture %"'", ptr noundef %2, ptr %"'1") #9 !dbg !193 {
  %"'de" = alloca double, align 8
  store double 0.000000e+00, ptr %"'de", align 8
  %4 = sext i32 %0 to i64, !dbg !198
  %5 = load double, ptr %1, align 8, !dbg !199, !tbaa !56, !alias.scope !200, !noalias !203
  %6 = tail call i32 @gsl_sf_legendre_array_e(i32 noundef 1, i64 noundef %4, double noundef %5, double noundef 1.000000e+00, ptr noundef %2) #11, !dbg !205
  br label %invert, !dbg !206

invert:                                           ; preds = %3
  %7 = call ptr @llvm.stacksave.p0(), !dbg !205
  %8 = alloca double, i64 %4, align 8, !dbg !205
  %9 = alloca double, i64 %4, align 8, !dbg !205
  call void @llvm.lifetime.start.p0(i64 -1, ptr %8), !dbg !205
  call void @llvm.lifetime.start.p0(i64 -1, ptr %9), !dbg !205
  %10 = call i32 @gsl_sf_legendre_deriv_array_e(i32 1, i64 %4, double %5, double 1.000000e+00, ptr %8, ptr %9), !dbg !205
  call void @llvm.lifetime.end.p0(i64 -1, ptr %8), !dbg !205
  %11 = icmp eq i64 %4, 0, !dbg !205
  br i1 %11, label %invert_end, label %invert_loop, !dbg !205

invert_loop:                                      ; preds = %invert_loop, %invert
  %12 = phi i64 [ 0, %invert ], [ %14, %invert_loop ], !dbg !205
  %13 = phi fast double [ 0.000000e+00, %invert ], [ %20, %invert_loop ], !dbg !205
  %14 = add nuw nsw i64 %12, 1, !dbg !205
  %15 = getelementptr inbounds double, ptr %9, i64 %12, !dbg !205
  %16 = getelementptr inbounds double, ptr %"'1", i64 %12, !dbg !205
  %17 = load double, ptr %15, align 8, !dbg !205
  %18 = load double, ptr %16, align 8, !dbg !205
  %19 = fmul fast double %17, %18, !dbg !205
  %20 = fadd fast double %13, %19, !dbg !205
  store double 0.000000e+00, ptr %16, align 8, !dbg !205
  %21 = icmp eq i64 %14, %4, !dbg !205
  br i1 %21, label %invert_end, label %invert_loop, !dbg !205

invert_end:                                       ; preds = %invert_loop, %invert
  %22 = phi fast double [ 0.000000e+00, %invert ], [ %20, %invert_loop ], !dbg !205
  call void @llvm.lifetime.end.p0(i64 -1, ptr %9), !dbg !205
  call void @llvm.stackrestore.p0(ptr %7), !dbg !205
  %23 = load double, ptr %"'de", align 8, !dbg !205
  %24 = fadd fast double %23, %22, !dbg !205
  store double %24, ptr %"'de", align 8, !dbg !205
  %25 = load double, ptr %"'de", align 8, !dbg !199
  store double 0.000000e+00, ptr %"'de", align 8, !dbg !199
  %26 = load double, ptr %"'", align 8, !dbg !199, !tbaa !56, !alias.scope !203, !noalias !200
  %27 = fadd fast double %26, %25, !dbg !199
  store double %27, ptr %"'", align 8, !dbg !199, !tbaa !56, !alias.scope !203, !noalias !200
  ret void
}

postfn:
; Function Attrs: mustprogress nounwind willreturn memory(readwrite) uwtable
define internal void @diffecompute(i32 noundef %0, ptr nocapture noundef readonly %1, ptr nocapture %"'", ptr noundef %2, ptr %"'1") #9 !dbg !193 {
  %"'de" = alloca double, align 8
  store double 0.000000e+00, ptr %"'de", align 8
  %4 = sext i32 %0 to i64, !dbg !198
  %5 = load double, ptr %1, align 8, !dbg !199, !tbaa !56, !alias.scope !200, !noalias !203
  %6 = tail call i32 @gsl_sf_legendre_array_e(i32 noundef 1, i64 noundef %4, double noundef %5, double noundef 1.000000e+00, ptr noundef %2) #11, !dbg !205
  br label %invert, !dbg !206

invert:                                           ; preds = %3
  %7 = call ptr @llvm.stacksave.p0(), !dbg !205
  %8 = alloca double, i64 %4, align 8, !dbg !205
  %9 = alloca double, i64 %4, align 8, !dbg !205
  call void @llvm.lifetime.start.p0(i64 -1, ptr %8), !dbg !205
  call void @llvm.lifetime.start.p0(i64 -1, ptr %9), !dbg !205
  %10 = call i32 @gsl_sf_legendre_deriv_array_e(i32 1, i64 %4, double %5, double 1.000000e+00, ptr %8, ptr %9), !dbg !205
  call void @llvm.lifetime.end.p0(i64 -1, ptr %8), !dbg !205
  %11 = icmp eq i64 %4, 0, !dbg !205
  br i1 %11, label %invert_end, label %invert_loop, !dbg !205

invert_loop:                                      ; preds = %invert_loop, %invert
  %12 = phi i64 [ 0, %invert ], [ %14, %invert_loop ], !dbg !205
  %13 = phi fast double [ 0.000000e+00, %invert ], [ %20, %invert_loop ], !dbg !205
  %14 = add nuw nsw i64 %12, 1, !dbg !205
  %15 = getelementptr inbounds double, ptr %9, i64 %12, !dbg !205
  %16 = getelementptr inbounds double, ptr %"'1", i64 %12, !dbg !205
  %17 = load double, ptr %15, align 8, !dbg !205
  %18 = load double, ptr %16, align 8, !dbg !205
  %19 = fmul fast double %17, %18, !dbg !205
  %20 = fadd fast double %13, %19, !dbg !205
  store double 0.000000e+00, ptr %16, align 8, !dbg !205
  %21 = icmp eq i64 %14, %4, !dbg !205
  br i1 %21, label %invert_end, label %invert_loop, !dbg !205

invert_end:                                       ; preds = %invert_loop, %invert
  %22 = phi fast double [ 0.000000e+00, %invert ], [ %20, %invert_loop ], !dbg !205
  call void @llvm.lifetime.end.p0(i64 -1, ptr %9), !dbg !205
  call void @llvm.stackrestore.p0(ptr %7), !dbg !205
  %23 = load double, ptr %"'de", align 8, !dbg !205
  %24 = fadd fast double %23, %22, !dbg !205
  store double %24, ptr %"'de", align 8, !dbg !205
  %25 = load double, ptr %"'de", align 8, !dbg !199
  store double 0.000000e+00, ptr %"'de", align 8, !dbg !199
  %26 = load double, ptr %"'", align 8, !dbg !199, !tbaa !56, !alias.scope !203, !noalias !200
  %27 = fadd fast double %26, %25, !dbg !199
  store double %27, ptr %"'", align 8, !dbg !199, !tbaa !56, !alias.scope !203, !noalias !200
  ret void
}
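
Read as plain C, the invert_loop block in the dump above appears to do the following: multiply each entry of the derivative buffer (%9) by the matching shadow entry (%"'1"), add the product to a running sum, and zero the shadow entry. The sketch below is a hand-written approximation for illustration, not Enzyme's generated code; the function name accumulate_and_clear and its parameter names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper approximating invert_loop from the dump:
 * returns sum_i deriv[i] * shadow[i] and zeroes shadow[i] as it goes. */
double accumulate_and_clear(const double *deriv, double *shadow, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += deriv[i] * shadow[i]; /* fmul + fadd in the IR */
        shadow[i] = 0.0;             /* store 0.0 back into the shadow */
    }
    return sum;
}
```

In the dump, the trip count n is %4 (the sign-extended d), and the resulting sum is added to the shadow of costheta.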

Replacing -enzyme-print with -enzyme-print-activity yields:

opt try_gsl.ll -load-pass-plugin=/home/davide/repositories/ENZYME/Enzyme-v0.0.103/enzyme/build-gsl/Enzyme/LLVMEnzyme-18.so -passes=enzyme -enzyme-print-activity -disable-output
in new function diffecompute constant arg i32 %0
in new function diffecompute nonconstant arg ptr %1
in new function diffecompute nonconstant arg ptr %2
known inactive instruction from call   tail call void @llvm.dbg.value(metadata i32 %0, metadata !185, metadata !DIExpression()) #10, !dbg !188
  tail call void @llvm.dbg.value(metadata i32 %0, metadata !185, metadata !DIExpression()) #10, !dbg !188 cv=1 ci=1
known inactive instruction from call   tail call void @llvm.dbg.value(metadata ptr %1, metadata !186, metadata !DIExpression()) #10, !dbg !188
  tail call void @llvm.dbg.value(metadata ptr %1, metadata !186, metadata !DIExpression()) #10, !dbg !188 cv=1 ci=1
known inactive instruction from call   tail call void @llvm.dbg.value(metadata ptr %2, metadata !187, metadata !DIExpression()) #10, !dbg !188
  tail call void @llvm.dbg.value(metadata ptr %2, metadata !187, metadata !DIExpression()) #10, !dbg !188 cv=1 ci=1
checking if is constant[3]   %4 = sext i32 %0 to i64, !dbg !54
 constant instruction from known non-float non-writing instruction   %4 = sext i32 %0 to i64, !dbg !54
 Value const as integral 3   %4 = sext i32 %0 to i64, !dbg !54 Integer
  %4 = sext i32 %0 to i64, !dbg !54 cv=1 ci=1
checking if is constant[3]   %5 = load double, ptr %1, align 8, !dbg !55, !tbaa !56
 < UPSEARCH1>  %5 = load double, ptr %1, align 8, !dbg !55, !tbaa !56
nonconstant(1)  up-inst   %5 = load double, ptr %1, align 8, !dbg !55, !tbaa !56 op ptr %1
 <Value USESEARCH2>  %5 = load double, ptr %1, align 8, !dbg !55, !tbaa !56 UA=None
      considering use of   %5 = load double, ptr %1, align 8, !dbg !55, !tbaa !56 -   %6 = tail call i32 @gsl_sf_legendre_array_e(i32 noundef 1, i64 noundef %4, double noundef %5, double noundef 1.000000e+00, ptr noundef %2) #11, !dbg !60
Value nonconstant inst (uses):  %5 = load double, ptr %1, align 8, !dbg !55, !tbaa !56 user   %6 = tail call i32 @gsl_sf_legendre_array_e(i32 noundef 1, i64 noundef %4, double noundef %5, double noundef 1.000000e+00, ptr noundef %2) #11, !dbg !60
 </Value USESEARCH2 const=0>  %5 = load double, ptr %1, align 8, !dbg !55, !tbaa !56
 Value nonconstant (couldn't disprove)[3]  %5 = load double, ptr %1, align 8, !dbg !55, !tbaa !56
 <Value USESEARCH2>  %5 = load double, ptr %1, align 8, !dbg !55, !tbaa !56 UA=None
      considering use of   %5 = load double, ptr %1, align 8, !dbg !55, !tbaa !56 -   %6 = tail call i32 @gsl_sf_legendre_array_e(i32 noundef 1, i64 noundef %4, double noundef %5, double noundef 1.000000e+00, ptr noundef %2) #11, !dbg !60
Value nonconstant inst (uses):  %5 = load double, ptr %1, align 8, !dbg !55, !tbaa !56 user   %6 = tail call i32 @gsl_sf_legendre_array_e(i32 noundef 1, i64 noundef %4, double noundef %5, double noundef 1.000000e+00, ptr noundef %2) #11, !dbg !60
 </Value USESEARCH2 const=0>  %5 = load double, ptr %1, align 8, !dbg !55, !tbaa !56
 < UPSEARCH1>  %5 = load double, ptr %1, align 8, !dbg !55, !tbaa !56
nonconstant(1)  up-inst   %5 = load double, ptr %1, align 8, !dbg !55, !tbaa !56 op ptr %1
couldnt decide fallback as nonconstant instruction(3):  %5 = load double, ptr %1, align 8, !dbg !55, !tbaa !56
  %5 = load double, ptr %1, align 8, !dbg !55, !tbaa !56 cv=0 ci=0
checking if is constant[3]   %6 = tail call i32 @gsl_sf_legendre_array_e(i32 noundef 1, i64 noundef %4, double noundef %5, double noundef 1.000000e+00, ptr noundef %2) #11, !dbg !60
 < UPSEARCH1>  %6 = tail call i32 @gsl_sf_legendre_array_e(i32 noundef 1, i64 noundef %4, double noundef %5, double noundef 1.000000e+00, ptr noundef %2) #11, !dbg !60
nonconstant(1)  up-call   %6 = tail call i32 @gsl_sf_legendre_array_e(i32 noundef 1, i64 noundef %4, double noundef %5, double noundef 1.000000e+00, ptr noundef %2) #11, !dbg !60 op   %5 = load double, ptr %1, align 8, !dbg !55, !tbaa !56
couldnt decide fallback as nonconstant instruction(3):  %6 = tail call i32 @gsl_sf_legendre_array_e(i32 noundef 1, i64 noundef %4, double noundef %5, double noundef 1.000000e+00, ptr noundef %2) #11, !dbg !60
 Value const as integral 3   %6 = tail call i32 @gsl_sf_legendre_array_e(i32 noundef 1, i64 noundef %4, double noundef %5, double noundef 1.000000e+00, ptr noundef %2) #11, !dbg !60 Integer
  %6 = tail call i32 @gsl_sf_legendre_array_e(i32 noundef 1, i64 noundef %4, double noundef %5, double noundef 1.000000e+00, ptr noundef %2) #11, !dbg !60 cv=1 ci=0
  ret void, !dbg !61 cv=1 ci=1

davidedelvento commented 2 months ago

@wsmoses maybe this problem could be due to infinite allocation in the stack?
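
For reference, in the dump above both reverse-pass buffers are created with "alloca double, i64 %4", i.e. d entries each, whereas main sizes LegendreBuf with gsl_sf_legendre_array_n(d), which is 273 for d = 20 judging by the printed indices 0 to 272. The sketch below compares the two counts. The formula in legendre_array_len is an assumption about GSL 2.x that is not confirmed anywhere in this thread, and both helper names are hypothetical.

```c
#include <stddef.h>

/* Assumed GSL 2.x layout (not confirmed in this thread):
 * (lmax+1)(lmax+2)/2 function values plus 2*lmax+2 cached factors. */
size_t legendre_array_len(size_t lmax) {
    size_t nlm = (lmax + 1) * (lmax + 2) / 2;
    size_t nsqrt = 2 * lmax + 2;
    return nlm + nsqrt;
}

/* Number of doubles by which a buffer of `allocated` entries falls short. */
size_t shortfall(size_t lmax, size_t allocated) {
    size_t needed = legendre_array_len(lmax);
    return needed > allocated ? needed - allocated : 0;
}
```

With d = 20 this gives 273 required entries against 20 allocated. If gsl_sf_legendre_deriv_array_e writes full-length arrays, it would run past both stack buffers, which would make this an undersized allocation rather than an unbounded one.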

davidedelvento commented 3 weeks ago

@wsmoses did you have any time to look at this issue?

wsmoses commented 3 weeks ago

I have not, unfortunately. Do you want to find a time later this week or next to look at it together? It should be a fast fix, but we might as well confirm together that it works in your integration tests.

wsmoses commented 2 weeks ago

@davidedelvento pardon the delay, but does this fix it? https://github.com/EnzymeAD/Enzyme/pull/1977

wsmoses commented 2 weeks ago

Closing per presumed fix; reopen if it persists.

davidedelvento commented 1 week ago

https://github.com/EnzymeAD/Enzyme/pull/1983 fixes it. Thanks a lot!