FluxML / NNlib.jl

Neural Network primitives with multiple backends

Enzyme when computing gradients for conv from NNlib fails compiling #557

Closed mashu closed 10 months ago

mashu commented 11 months ago
(mwe) pkg> st
Project mwe v0.1.0
Status `~/mwe/Project.toml`
  [052768ef] CUDA v5.1.1
  [7da242da] Enzyme v0.11.11 `https://github.com/EnzymeAD/Enzyme.jl.git#main`
  [587475ba] Flux v0.14.7
  [872c559c] NNlib v0.9.9
  [02a925ec] cuDNN v1.2.1

julia> using Enzyme
       using CUDA
       using cuDNN
       using NNlib
       using Flux
       w = randn(Float32, 3, 3, 5, 7) |> gpu
       dw = zero(w) |> gpu
       loss(w, x) = sum(conv(x, w))
       x = randn(Float32, (3, 3, 5, 8)) |> gpu

       Enzyme.autodiff(Reverse, loss, Duplicated(w, dw), Const(x));

This throws the error below. Sadly, I can see only the end of it, because Julia seems to print everything and the output is too long for my terminal to show the beginning.

!57637 = distinct !{!57637, !"addr13"}
!57638 = !DILocation(line: 51, scope: !57639, inlinedAt: !57640)
!57639 = distinct !DISubprogram(name: "reverse", linkageName: "julia_reverse_21601", scope: null, file: !36942, line: 51, type: !936, scopeLine: 51, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !841, retainedNodes: !937)
!57640 = distinct !DILocation(line: 88, scope: !57626)
!57641 = distinct !DISubprogram(name: "unsafe_copyto!", linkageName: "julia_unsafe_copyto!_20043", scope: null, file: !977, line: 394, type: !936, scopeLine: 394, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !18, retainedNodes: !937)
!57642 = !DILocation(line: 13, scope: !57643, inlinedAt: !57644)
!57643 = distinct !DISubprogram(name: "getindex;", linkageName: "getindex", scope: !1129, file: !1129, type: !936, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !18, retainedNodes: !937)
!57644 = !DILocation(line: 379, scope: !57645, inlinedAt: !57646)
!57645 = distinct !DISubprogram(name: "stream;", linkageName: "stream", scope: !1001, file: !1001, type: !936, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !18, retainedNodes: !937)
!57646 = !DILocation(line: 378, scope: !57645, inlinedAt: !57647)
!57647 = !DILocation(line: 394, scope: !57641)
!57648 = !DILocation(line: 380, scope: !57645, inlinedAt: !57646)
!57649 = !DILocation(line: 1021, scope: !57650, inlinedAt: !57648)
!57650 = distinct !DISubprogram(name: "setindex!;", linkageName: "setindex!", scope: !1209, file: !1209, type: !936, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !18, retainedNodes: !937)
!57651 = !DILocation(line: 648, scope: !57652, inlinedAt: !57653)
!57652 = distinct !DISubprogram(name: "check_top_bit;", linkageName: "check_top_bit", scope: !980, file: !980, type: !936, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !18, retainedNodes: !937)
!57653 = !DILocation(line: 759, scope: !57654, inlinedAt: !57655)
!57654 = distinct !DISubprogram(name: "toUInt64;", linkageName: "toUInt64", scope: !980, file: !980, type: !936, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !18, retainedNodes: !937)
!57655 = !DILocation(line: 789, scope: !57656, inlinedAt: !57657)
!57656 = distinct !DISubprogram(name: "UInt64;", linkageName: "UInt64", scope: !980, file: !980, type: !936, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !18, retainedNodes: !937)
!57657 = !DILocation(line: 7, scope: !57658, inlinedAt: !57659)
!57658 = distinct !DISubprogram(name: "convert;", linkageName: "convert", scope: !1126, file: !1126, type: !936, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !18, retainedNodes: !937)
!57659 = !DILocation(line: 538, scope: !57660, inlinedAt: !57661)
!57660 = distinct !DISubprogram(name: "cconvert;", linkageName: "cconvert", scope: !1129, file: !1129, type: !936, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !18, retainedNodes: !937)
!57661 = !DILocation(line: 356, scope: !57662, inlinedAt: !57663)
!57662 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !965, file: !965, type: !936, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !18, retainedNodes: !937)
!57663 = !DILocation(line: 27, scope: !57664, inlinedAt: !57665)
!57664 = distinct !DISubprogram(name: "#49;", linkageName: "#49", scope: !970, file: !970, type: !936, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !18, retainedNodes: !937)
!57665 = !DILocation(line: 32, scope: !57666, inlinedAt: !57667)
!57666 = distinct !DISubprogram(name: "check;", linkageName: "check", scope: !965, file: !965, type: !936, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !18, retainedNodes: !937)
!57667 = !DILocation(line: 26, scope: !57668, inlinedAt: !57669)
!57668 = distinct !DISubprogram(name: "cuMemcpyDtoHAsync_v2;", linkageName: "cuMemcpyDtoHAsync_v2", scope: !970, file: !970, type: !936, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !18, retainedNodes: !937)
!57669 = !DILocation(line: 397, scope: !57670, inlinedAt: !57647)
!57670 = distinct !DISubprogram(name: "#unsafe_copyto!#8;", linkageName: "#unsafe_copyto!#8", scope: !977, file: !977, type: !936, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !18, retainedNodes: !937)
!57671 = !DILocation(line: 33, scope: !57666, inlinedAt: !57667)
!57672 = !DILocation(line: 382, scope: !57645, inlinedAt: !57646)
!57673 = distinct !DISubprogram(name: "#1057", linkageName: "julia_#1057_20033", scope: null, file: !18609, line: 592, type: !936, scopeLine: 592, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !19, retainedNodes: !937)
!57674 = !{!57675}
!57675 = distinct !{!57675, !57676, !"primal"}
!57676 = distinct !{!57676, !" diff: %"}
!57677 = !{!57678}
!57678 = distinct !{!57678, !57676, !"shadow_0"}
!57679 = !{!57680}
!57680 = distinct !{!57680, !57681, !"primal"}
!57681 = distinct !{!57681, !" diff: %ptls_load7172"}
!57682 = !{!57683}
!57683 = distinct !{!57683, !57681, !"shadow_0"}
!57684 = !DILocation(line: 592, scope: !57673)
!57685 = !DILocation(line: 18, scope: !57686, inlinedAt: !57687)
!57686 = distinct !DISubprogram(name: "initialize_context;", linkageName: "initialize_context", scope: !965, file: !965, type: !936, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !19, retainedNodes: !937)
!57687 = !DILocation(line: 3849, scope: !21045, inlinedAt: !57688)
!57688 = !DILocation(line: 858, scope: !21047, inlinedAt: !57689)
!57689 = !DILocation(line: 593, scope: !57673)
!57690 = !DILocation(line: 3850, scope: !21045, inlinedAt: !57688)
!57691 = distinct !DISubprogram(name: "reshape", linkageName: "julia_reshape_19964", scope: null, file: !19129, line: 149, type: !936, scopeLine: 149, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !340, retainedNodes: !937)
!57692 = !DILocation(line: 150, scope: !57691)
!57693 = !DILocation(line: 158, scope: !57691)
!57694 = !DILocation(line: 87, scope: !19213, inlinedAt: !57695)
!57695 = !DILocation(line: 84, scope: !19201, inlinedAt: !57696)
!57696 = !DILocation(line: 80, scope: !19203, inlinedAt: !57697)
!57697 = !DILocation(line: 829, scope: !19189, inlinedAt: !57693)
!57698 = distinct !DISubprogram(name: "reshape", linkageName: "julia_reshape_19852", scope: null, file: !19129, line: 149, type: !936, scopeLine: 149, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !294, retainedNodes: !937)
!57699 = !DILocation(line: 150, scope: !57698)
!57700 = !DILocation(line: 158, scope: !57698)
!57701 = !DILocation(line: 87, scope: !19303, inlinedAt: !57702)
!57702 = !DILocation(line: 84, scope: !19291, inlinedAt: !57703)
!57703 = !DILocation(line: 80, scope: !19293, inlinedAt: !57704)
!57704 = !DILocation(line: 829, scope: !19279, inlinedAt: !57700)

@wsmoses as requested. Sorry, I couldn't paste the whole log.

The older log, from before updating to Enzyme#master, was:

ERROR: Enzyme compilation failed.
Current scope: 
define internal fastcc void @julia__sort__38524({} addrspace(10)* noundef nonnull align 16 dereferenceable(40) %0, { i64, i64 } addrspace(11)* nocapture noundef nonnull readonly align 8 dereferenceable(16) %1) unnamed_addr #148 !dbg !7431 {
top:
  %2 = alloca { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }, align 8
  %3 = alloca { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }, align 8
  %4 = call {}*** @julia.get_pgcstack()
  %ptls_field41 = getelementptr inbounds {}**, {}*** %4, i64 2
  %5 = bitcast {}*** %ptls_field41 to i64***
  %ptls_load4243 = load i64**, i64*** %5, align 8, !tbaa !284
  %6 = getelementptr inbounds i64*, i64** %ptls_load4243, i64 2
  %safepoint = load i64*, i64** %6, align 8, !tbaa !288
  fence syncscope("singlethread") seq_cst
  call void @julia.safepoint(i64* %safepoint), !dbg !7432
  fence syncscope("singlethread") seq_cst
  %7 = getelementptr inbounds { i64, i64 }, { i64, i64 } addrspace(11)* %1, i64 0, i32 0, !dbg !7433
  %8 = getelementptr inbounds { i64, i64 }, { i64, i64 } addrspace(11)* %1, i64 0, i32 1, !dbg !7438
  %unbox = load i64, i64 addrspace(11)* %7, align 8, !dbg !7441, !tbaa !288, !alias.scope !291, !noalias !294
  %9 = add i64 %unbox, 1, !dbg !7441
  %unbox2 = load i64, i64 addrspace(11)* %8, align 8, !dbg !7444, !tbaa !288, !alias.scope !291, !noalias !294
  %.not = icmp sgt i64 %9, %unbox2, !dbg !7444
  %unbox.unbox2 = select i1 %.not, i64 %unbox, i64 %unbox2, !dbg !7448
  %.not44 = icmp slt i64 %unbox.unbox2, %9, !dbg !7455
  br i1 %.not44, label %L54, label %L18.L27_crit_edge, !dbg !7454

L18.L27_crit_edge:                                ; preds = %top
  %.phi.trans.insert37 = addrspacecast {} addrspace(10)* %0 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)*
  %arrayptr_ptr.phi.trans.insert = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %.phi.trans.insert37, i64 0, i32 0
  %arrayptr.pre = load i8 addrspace(13)*, i8 addrspace(13)* addrspace(11)* %arrayptr_ptr.phi.trans.insert, align 16, !dbg !7463, !tbaa !1337, !alias.scope !1339, !noalias !1326
  %arrayref.fca.0.gep = getelementptr inbounds { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }, { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }* %2, i64 0, i32 0
  %arrayref.fca.1.gep = getelementptr inbounds { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }, { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }* %2, i64 0, i32 1
  %arrayref.fca.2.gep = getelementptr inbounds { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }, { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }* %2, i64 0, i32 2
  %arrayref.fca.3.gep = getelementptr inbounds { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }, { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }* %2, i64 0, i32 3
  %arrayref.fca.4.gep = getelementptr inbounds { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }, { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }* %2, i64 0, i32 4
  %10 = addrspacecast { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }* %2 to { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } addrspace(11)*
  %arrayref14.fca.0.gep = getelementptr inbounds { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }, { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }* %3, i64 0, i32 0
  %arrayref14.fca.1.gep = getelementptr inbounds { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }, { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }* %3, i64 0, i32 1
  %arrayref14.fca.2.gep = getelementptr inbounds { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }, { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }* %3, i64 0, i32 2
  %arrayref14.fca.3.gep = getelementptr inbounds { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }, { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }* %3, i64 0, i32 3
  %arrayref14.fca.4.gep = getelementptr inbounds { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }, { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }* %3, i64 0, i32 4
  %11 = addrspacecast { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }* %3 to { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } addrspace(11)*
  %arrayflags_ptr20 = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %.phi.trans.insert37, i64 0, i32 2
  %12 = addrspacecast {} addrspace(10)* %0 to {} addrspace(10)* addrspace(11)*
  %13 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %12, i64 5
  br label %L27, !dbg !7454

L27:                                              ; preds = %L18.L27_crit_edge, %merge_own
  %nodecayed.arrayptr = phi {} addrspace(10)* , !dbg !7463
  %nodecayedoff.arrayptr = phi i64 , !dbg !7463
  %arrayptr = phi i8 addrspace(13)* [ %arrayptr.pre, %L18.L27_crit_edge ], [ %arrayptr16, %merge_own ], !dbg !7463
  %value_phi6 = phi i64 [ %9, %L18.L27_crit_edge ], [ %30, %merge_own ]
  %14 = add i64 %value_phi6, -1, !dbg !7463
  %15 = bitcast i8 addrspace(13)* %arrayptr to { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } addrspace(13)*, !dbg !7463
  %16 = getelementptr inbounds { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }, { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } addrspace(13)* %15, i64 %14, !dbg !7463
  %arrayref = load { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }, { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } addrspace(13)* %16, align 8, !dbg !7463, !tbaa !1872, !alias.scope !303, !noalias !343
  %17 = extractvalue { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } %arrayref, 3, !dbg !7463
  %.not45 = icmp eq {} addrspace(10)* %17, null, !dbg !7463
  br i1 %.not45, label %fail, label %L30.preheader, !dbg !7463

L30.preheader:                                    ; preds = %L27
  %unbox953 = load i64, i64 addrspace(11)* %7, align 8, !dbg !7466, !tbaa !288, !alias.scope !291, !noalias !294
  %.not4654 = icmp slt i64 %unbox953, %value_phi6, !dbg !7466
  br i1 %.not4654, label %L33.lr.ph, label %L42, !dbg !7468

L33.lr.ph:                                        ; preds = %L30.preheader
  %arrayref.fca.0.extract = extractvalue { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } %arrayref, 0
  %arrayref.fca.1.extract = extractvalue { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } %arrayref, 1
  %arrayref.fca.2.extract = extractvalue { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } %arrayref, 2
  %arrayref.fca.4.extract = extractvalue { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } %arrayref, 4
  br label %L33, !dbg !7468

L33:                                              ; preds = %L33.lr.ph, %merge_own24
  %value_phi856 = phi i64 [ %value_phi6, %L33.lr.ph ], [ %18, %merge_own24 ]
  %nodecayed.arrayptr1155 = phi {} addrspace(10)* 
  %nodecayedoff.arrayptr1155 = phi i64 
  %arrayptr1155 = phi i8 addrspace(13)* [ %arrayptr, %L33.lr.ph ], [ %arrayptr28, %merge_own24 ]
  %18 = add nsw i64 %value_phi856, -1, !dbg !7469
  %19 = add i64 %value_phi856, -2, !dbg !7472
  %20 = bitcast i8 addrspace(13)* %arrayptr1155 to { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } addrspace(13)*, !dbg !7472
  %21 = getelementptr inbounds { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }, { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } addrspace(13)* %20, i64 %19, !dbg !7472
  %arrayref14 = load { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }, { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } addrspace(13)* %21, align 8, !dbg !7472, !tbaa !1872, !alias.scope !303, !noalias !343
  %22 = extractvalue { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } %arrayref14, 3, !dbg !7472
  %.not47 = icmp eq {} addrspace(10)* %22, null, !dbg !7472
  br i1 %.not47, label %fail12, label %pass13, !dbg !7472

L39:                                              ; preds = %pass13
  %arrayflags21 = load i16, i16 addrspace(11)* %arrayflags_ptr20, align 2, !dbg !7473, !tbaa !1322, !alias.scope !1325, !noalias !1326
  %23 = and i16 %arrayflags21, 3, !dbg !7473
  %has_owner22 = icmp eq i16 %23, 3, !dbg !7473
  br i1 %has_owner22, label %array_owned23, label %merge_own24, !dbg !7473

L42:                                              ; preds = %merge_own24, %pass13, %L30.preheader
  %.pre-phi = phi i64 [ %14, %L30.preheader ], [ %18, %pass13 ], [ %19, %merge_own24 ], !dbg !7476
  %arrayflags = load i16, i16 addrspace(11)* %arrayflags_ptr20, align 2, !dbg !7476, !tbaa !1322, !alias.scope !1325, !noalias !1326
  %24 = and i16 %arrayflags, 3, !dbg !7476
  %has_owner = icmp eq i16 %24, 3, !dbg !7476
  br i1 %has_owner, label %array_owned, label %merge_own, !dbg !7476

L54:                                              ; preds = %merge_own, %top
  ret void, !dbg !7478

fail:                                             ; preds = %L27
  call void @ijl_throw({} addrspace(12)* addrspacecast ({}* inttoptr (i64 140176175087584 to {}*) to {} addrspace(12)*)), !dbg !7463
  unreachable, !dbg !7463

fail12:                                           ; preds = %L33
  call void @ijl_throw({} addrspace(12)* addrspacecast ({}* inttoptr (i64 140176175087584 to {}*) to {} addrspace(12)*)), !dbg !7472
  unreachable, !dbg !7472

pass13:                                           ; preds = %L33
  store i32 %arrayref.fca.0.extract, i32* %arrayref.fca.0.gep, align 8, !dbg !7479, !noalias !380
  store i32 %arrayref.fca.1.extract, i32* %arrayref.fca.1.gep, align 4, !dbg !7479, !noalias !380
  store i32 %arrayref.fca.2.extract, i32* %arrayref.fca.2.gep, align 8, !dbg !7479, !noalias !380
  store {} addrspace(10)* %17, {} addrspace(10)** %arrayref.fca.3.gep, align 8, !dbg !7479, !noalias !380
  store {} addrspace(10)* %arrayref.fca.4.extract, {} addrspace(10)** %arrayref.fca.4.gep, align 8, !dbg !7479, !noalias !380
  %arrayref14.fca.0.extract = extractvalue { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } %arrayref14, 0, !dbg !7479
  store i32 %arrayref14.fca.0.extract, i32* %arrayref14.fca.0.gep, align 8, !dbg !7479, !noalias !380
  %arrayref14.fca.1.extract = extractvalue { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } %arrayref14, 1, !dbg !7479
  store i32 %arrayref14.fca.1.extract, i32* %arrayref14.fca.1.gep, align 4, !dbg !7479, !noalias !380
  %arrayref14.fca.2.extract = extractvalue { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } %arrayref14, 2, !dbg !7479
  store i32 %arrayref14.fca.2.extract, i32* %arrayref14.fca.2.gep, align 8, !dbg !7479, !noalias !380
  store {} addrspace(10)* %22, {} addrspace(10)** %arrayref14.fca.3.gep, align 8, !dbg !7479, !noalias !380
  %arrayref14.fca.4.extract = extractvalue { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } %arrayref14, 4, !dbg !7479
  store {} addrspace(10)* %arrayref14.fca.4.extract, {} addrspace(10)** %arrayref14.fca.4.gep, align 8, !dbg !7479, !noalias !380
  %25 = call fastcc i8 @julia_isless_38466({ i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } addrspace(11)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(32) %10, { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } addrspace(11)* nocapture noundef nonnull readonly align 8 dereferenceable(32) %11), !dbg !7479
  %26 = and i8 %25, 1, !dbg !7482
  %.not48 = icmp eq i8 %26, 0, !dbg !7482
  br i1 %.not48, label %L42, label %L39, !dbg !7481

array_owned:                                      ; preds = %L42
  %external_owner = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %13, align 8, !dbg !7476, !tbaa !288, !alias.scope !291, !noalias !294, !nonnull !283, !dereferenceable !1568, !align !406
  br label %merge_own, !dbg !7476

merge_own:                                        ; preds = %array_owned, %L42
  %data_owner = phi {} addrspace(10)* [ %0, %L42 ], [ %external_owner, %array_owned ], !dbg !7476
  %arrayptr16 = load i8 addrspace(13)*, i8 addrspace(13)* addrspace(11)* %arrayptr_ptr.phi.trans.insert, align 16, !dbg !7476, !tbaa !1337, !alias.scope !1339, !noalias !1326, !nonnull !283
  %27 = bitcast i8 addrspace(13)* %arrayptr16 to { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } addrspace(13)*, !dbg !7476
  %28 = getelementptr inbounds { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }, { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } addrspace(13)* %27, i64 %.pre-phi, !dbg !7476
  store { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } %arrayref, { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } addrspace(13)* %28, align 8, !dbg !7476, !tbaa !1872, !alias.scope !303, !noalias !304
  %29 = extractvalue { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } %arrayref, 4, !dbg !7476
  call void ({} addrspace(10)*, ...) @julia.write_barrier({} addrspace(10)* nonnull %data_owner, {} addrspace(10)* nonnull %17, {} addrspace(10)* %29) #285, !dbg !7476
  %.not49 = icmp eq i64 %value_phi6, %unbox.unbox2, !dbg !7484
  %30 = add i64 %value_phi6, 1, !dbg !7486
  br i1 %.not49, label %L54, label %L27, !dbg !7487

array_owned23:                                    ; preds = %L39
  %external_owner25 = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %13, align 8, !dbg !7473, !tbaa !288, !alias.scope !291, !noalias !294, !nonnull !283, !dereferenceable !1568, !align !406
  br label %merge_own24, !dbg !7473

merge_own24:                                      ; preds = %array_owned23, %L39
  %data_owner26 = phi {} addrspace(10)* [ %0, %L39 ], [ %external_owner25, %array_owned23 ], !dbg !7473
  %arrayptr28 = load i8 addrspace(13)*, i8 addrspace(13)* addrspace(11)* %arrayptr_ptr.phi.trans.insert, align 16, !dbg !7473, !tbaa !1337, !alias.scope !1339, !noalias !1326, !nonnull !283
  %31 = bitcast i8 addrspace(13)* %arrayptr28 to { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } addrspace(13)*, !dbg !7473
  %32 = getelementptr inbounds { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* }, { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } addrspace(13)* %31, i64 %18, !dbg !7473
  store { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } %arrayref14, { i32, i32, i32, {} addrspace(10)*, {} addrspace(10)* } addrspace(13)* %32, align 8, !dbg !7473, !tbaa !1872, !alias.scope !303, !noalias !304
  call void ({} addrspace(10)*, ...) @julia.write_barrier({} addrspace(10)* nonnull %data_owner26, {} addrspace(10)* nonnull %22, {} addrspace(10)* %arrayref14.fca.4.extract) #285, !dbg !7473
  %unbox9 = load i64, i64 addrspace(11)* %7, align 8, !dbg !7466, !tbaa !288, !alias.scope !291, !noalias !294
  %.not46 = icmp slt i64 %unbox9, %18, !dbg !7466
  br i1 %.not46, label %L33, label %L42, !dbg !7468
}

Could not analyze garbage collection behavior of
 v0:   %arrayptr.pre = load i8 addrspace(13)*, i8 addrspace(13)* addrspace(11)* %arrayptr_ptr.phi.trans.insert, align 16, !dbg !333, !tbaa !337, !alias.scope !340, !noalias !343
 v:   %arrayptr_ptr.phi.trans.insert = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %.phi.trans.insert37, i64 0, i32 0
 offset: i64 0
 hasload: true

Stacktrace:
 [1] getindex
   @ ./essentials.jl:13
 [2] _sort!
   @ ./sort.jl:782
 [3] multiple call sites
   @ unknown:0

Stacktrace:
  [1] (::Enzyme.Compiler.var"#getparent#350"{…})(v::LLVM.GetElementPtrInst, offset::LLVM.ConstantInt, hasload::Bool)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/R8qo4/src/compiler/optimize.jl:304
  [2] (::Enzyme.Compiler.var"#getparent#350"{…})(v::LLVM.LoadInst, offset::LLVM.ConstantInt, hasload::Bool)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/R8qo4/src/compiler/optimize.jl:239
  [3] nodecayed_phis!(mod::LLVM.Module)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/R8qo4/src/compiler/optimize.jl:307

wsmoses commented 11 months ago

I need to see the whole log to be helpful. Can you pipe it to a file?

E.g., in a terminal, run:

julia myfile.jl &> out.txt
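
When the failing call is made from the REPL rather than from a script, the same idea can be sketched in Julia itself. The helper below is hypothetical (`run_and_log` is not part of Enzyme or NNlib); it relies only on `showerror` and `catch_backtrace` from Base and writes the thrown error, with its backtrace, to a file. It assumes the output of interest is carried by the exception; anything a library prints directly to the terminal would not be captured this way.

```julia
# Hypothetical helper: run `f`; if it throws, write the error and its
# backtrace to `path`, then rethrow so the caller still sees the failure.
function run_and_log(f, path::AbstractString)
    try
        return f()
    catch err
        bt = catch_backtrace()          # backtrace of the exception being handled
        open(path, "w") do io
            showerror(io, err, bt)      # same text the REPL would print
            println(io)
        end
        rethrow()
    end
end
```

For the call in this issue, that would look like `run_and_log(() -> Enzyme.autodiff(Reverse, loss, Duplicated(w, dw), Const(x)), "out.txt")`.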

mashu commented 11 months ago

error.log.gz

ketgg commented 11 months ago

Hey, I had a similar issue with Enzyme compilation related to LLVM. If you installed Julia from a package manager (pacman on Arch Linux, in my case), try installing it from the website instead. That worked for me.

mashu commented 11 months ago

@re1san I am using the official tarball.

I have provided everything needed (code, versions, etc.) to reproduce the error. I assumed the developer was asking for the error log for convenience, not because they can't reproduce it?

wsmoses commented 11 months ago

@mashu what Julia version and OS are you on? I cannot reproduce this on my 1.10/Ubuntu setup with Enzyme main.

mashu commented 11 months ago

Debian GNU/Linux (sid), Julia 1.10 installed via https://github.com/JuliaLang/juliaup. If it helps, I'll try to set up a Docker image later today, which should reproduce the problem.

wsmoses commented 11 months ago

Yeah, that would help, since I cannot reproduce your error on current main. The error you found on the latest release is a distinct one and has already been fixed on main, so you only need to consider Enzyme#main.

mashu commented 10 months ago

Actually, after the latest Enzyme version bump, the problem is gone. I cloned git #master before, but maybe it didn't get the right commit; I am not sure. Now things seem to be working fine. Thanks. Closing this.
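
For anyone unsure which commit of a package the active environment actually resolved, a minimal sketch using `Pkg.dependencies()` is shown below. The helper name `describe_dep` is hypothetical; the fields it reads (`version`, `is_tracking_repo`, `git_revision`, `tree_hash`) are those of Pkg's `PackageInfo`. This is only one way to check, not necessarily what was done in this thread.

```julia
using Pkg

# Hypothetical helper: report how a named dependency is resolved in the
# active environment, or `nothing` if the package is not present.
function describe_dep(name::AbstractString)
    for (_, info) in Pkg.dependencies()
        if info.name == name
            return (version       = info.version,
                    tracking_repo = info.is_tracking_repo,  # true when added by URL/rev
                    rev           = info.git_revision,      # e.g. "main" when tracking a branch
                    tree_hash     = info.tree_hash)
        end
    end
    return nothing
end

# To re-track a branch explicitly (needs network access), one could run:
#   Pkg.add(url="https://github.com/EnzymeAD/Enzyme.jl.git", rev="main")
```

Calling `describe_dep("Enzyme")` before and after an update makes it easy to see whether the tree hash changed.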