EnzymeAD / Enzyme.jl

Julia bindings for the Enzyme automatic differentiator
https://enzyme.mit.edu
MIT License

Julia Crashes from Assertion #1429

Closed avik-pal closed 6 months ago

avik-pal commented 6 months ago
using Lux, Enzyme, Random

model = Dense(10 => 10, gelu; use_bias=false)  # use_bias = false produces the segfault
ps, st = Lux.setup(Xoshiro(1234), model)
x = randn(Float32, 10)

loss_function(model, x, ps, st) = sum(abs2, first(model(x, ps, st)))

loss_function(model, x, ps, st)

begin
    dps = Enzyme.make_zero(ps)
    dx = Enzyme.make_zero(x)

    Enzyme.autodiff(Enzyme.Reverse, loss_function, Active, Const(model),
        Duplicated(x, dx), Duplicated(ps, dps), Const(st))

    dx, dps
end
julia: /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:3791: bool GradientUtils::legalRecompute(const llvm::Value*, const ValueToValueMapTy&, llvm::IRBuilder<>*, bool, bool) const: Assertion `phi->getNumIncomingValues() != 0' failed.

If use_bias = true, Julia doesn't crash, but the autodiff call still errors.

If we don't specify an activation function, it works fine.
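For comparison, here is a sketch of the variant without an activation function, which is the case reported above as working. It reuses the names from the reproducer. The closed-form check at the end is an addition for illustration and not part of the original report: with no activation and no bias the loss is sum(abs2, W * x), so the gradients are 2 W' W x with respect to x and 2 (W x) x' with respect to W. It assumes Lux stores the layer weights under ps.weight, and that the installed Lux and Enzyme versions behave as described in this report.

```julia
using Lux, Enzyme, Random

# Same setup as the reproducer, but with the default (identity) activation.
model = Dense(10 => 10; use_bias=false)
ps, st = Lux.setup(Xoshiro(1234), model)
x = randn(Float32, 10)

loss_function(model, x, ps, st) = sum(abs2, first(model(x, ps, st)))

# Shadow buffers that Enzyme accumulates the gradients into.
dps = Enzyme.make_zero(ps)
dx = Enzyme.make_zero(x)

Enzyme.autodiff(Enzyme.Reverse, loss_function, Active, Const(model),
    Duplicated(x, dx), Duplicated(ps, dps), Const(st))

# Assumption: the Dense parameters are a NamedTuple with a `weight` field.
W = ps.weight

# Closed-form gradients of sum(abs2, W * x), used only as a sanity check.
expected_dx = 2 .* (W' * (W * x))
expected_dW = 2 .* ((W * x) * x')

@show dx ≈ expected_dx
@show dps.weight ≈ expected_dW
```

Swapping the first line of the setup back to Dense(10 => 10, gelu; use_bias=false) gives the crashing configuration again, which makes this a convenient baseline when bisecting.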

Crash Log ``` ; Function Attrs: mustprogress willreturn define internal fastcc void @preprocess_julia_fast_materialize_threaded__2755({} addrspace(10)* noundef nonnull align 16 dereferenceable(40) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@float, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" "enzymejl_parmtype"="125797725781712" "enzymejl_parmtype_ref"="2" %0, { [1 x {} addrspace(10)*] } addrspace(11)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(8) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,0]:Pointer, [-1,0,0,-1]:Float@float, [-1,0,8]:Integer, [-1,0,9]:Integer, [-1,0,10]:Integer, [-1,0,11]:Integer, [-1,0,12]:Integer, [-1,0,13]:Integer, [-1,0,14]:Integer, [-1,0,15]:Integer, [-1,0,16]:Integer, [-1,0,17]:Integer, [-1,0,18]:Integer, [-1,0,19]:Integer, [-1,0,20]:Integer, [-1,0,21]:Integer, [-1,0,22]:Integer, [-1,0,23]:Integer, [-1,0,24]:Integer, [-1,0,25]:Integer, [-1,0,26]:Integer, [-1,0,27]:Integer, [-1,0,28]:Integer, [-1,0,29]:Integer, [-1,0,30]:Integer, [-1,0,31]:Integer, [-1,0,32]:Integer, [-1,0,33]:Integer, [-1,0,34]:Integer, [-1,0,35]:Integer, [-1,0,36]:Integer, [-1,0,37]:Integer, [-1,0,38]:Integer, [-1,0,39]:Integer}" "enzymejl_parmtype"="125797729165136" "enzymejl_parmtype_ref"="1" %1, [2 x [1 x i64]] addrspace(11)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(16) "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,0]:Integer, [-1,1]:Integer, [-1,2]:Integer, [-1,3]:Integer, [-1,4]:Integer, 
[-1,5]:Integer, [-1,6]:Integer, [-1,7]:Integer, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer}" "enzymejl_parmtype"="125797567845648" "enzymejl_parmtype_ref"="1" %2) unnamed_addr #88 !dbg !4085 { top: %3 = call noalias nonnull dereferenceable(56) dereferenceable_or_null(56) i8* @malloc(i64 56), !enzyme_fromstack !316 %4 = bitcast i8* %3 to { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }*, !enzyme_caststack !90 %.sub = bitcast { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }* %4 to i8* %5 = call noalias nonnull dereferenceable(24) dereferenceable_or_null(24) i8* @malloc(i64 24), !enzyme_fromstack !968 %newstruct13 = bitcast i8* %5 to { [1 x [1 x i64]], [2 x i64] }*, !enzyme_caststack !90 %6 = call noalias nonnull dereferenceable(24) dereferenceable_or_null(24) i8* @malloc(i64 24), !enzyme_fromstack !968 %newstruct30 = bitcast i8* %6 to { [1 x [1 x i64]], [2 x i64] }*, !enzyme_caststack !90 %7 = call {}*** @julia.get_pgcstack() #91 %ptls_field170 = getelementptr inbounds {}**, {}*** %7, i64 2 %8 = bitcast {}*** %ptls_field170 to i64*** %ptls_load171172 = load i64**, i64*** %8, align 8, !tbaa !91 %9 = getelementptr inbounds i64*, i64** %ptls_load171172, i64 2 %safepoint = load i64*, i64** %9, align 8, !tbaa !95 fence syncscope("singlethread") seq_cst call void @julia.safepoint(i64* %safepoint) #91, !dbg !4086 fence syncscope("singlethread") seq_cst %10 = getelementptr inbounds [2 x [1 x i64]], [2 x [1 x i64]] addrspace(11)* %2, i64 0, i64 1, i64 0, !dbg !4087 %11 = call i64 @julia_nthreads_2932() #92, !dbg !4089 %unbox = load i64, i64 addrspace(11)* %10, align 8, !dbg !4090, !tbaa !95, !alias.scope !313, !noalias !314 %12 = icmp slt i64 %unbox, 1, !dbg !4090 br i1 %12, label %L616, label %L6, !dbg !4092 L6: ; preds = %top %13 = call i64 @llvm.smin.i64(i64 %unbox, i64 %11) #91, !dbg !4094 %.not = icmp eq i64 %13, 0, !dbg !4095 br i1 %.not, label %L393, 
label %L14, !dbg !4096 L14: ; preds = %L6 %14 = trunc i64 %13 to i32, !dbg !4097 %15 = add i32 %14, -1, !dbg !4097 %16 = call nonnull "enzyme_inactive" {}* @julia.pointer_from_objref({} addrspace(11)* noundef addrspacecast ({}* inttoptr (i64 125797279085328 to {}*) to {} addrspace(11)*)) #93, !dbg !4101 %17 = icmp sgt i32 %15, 0, !dbg !4103 br i1 %17, label %L24, label %L393, !dbg !4104 L24: ; preds = %L14 %p.i = bitcast {}* %16 to i64*, !dbg !4106 %v.i = atomicrmw xchg i64* %p.i, i64 0 acq_rel, align 8, !dbg !4106 %18 = call i64 @llvm.ctpop.i64(i64 %v.i) #91, !dbg !4109, !range !1713 %19 = trunc i64 %18 to i32, !dbg !4111 %20 = sub nsw i32 %15, %19, !dbg !4112 %21 = icmp slt i32 %20, 0, !dbg !4114 br i1 %21, label %L37, label %L72, !dbg !4117 L37: ; preds = %L24 %22 = call i64 @llvm.ctlz.i64(i64 %v.i, i1 noundef false) #91, !dbg !4118, !range !1713 %23 = trunc i64 %22 to i32, !dbg !4120 br label %L40, !dbg !4120 L40: ; preds = %L40, %L37 %iv = phi i64 [ %iv.next, %L40 ], [ 0, %L37 ] %value_phi119 = phi i32 [ %23, %L37 ], [ %24, %L40 ] %value_phi120 = phi i32 [ %20, %L37 ], [ %33, %L40 ] %value_phi121 = phi i64 [ %v.i, %L37 ], [ %29, %L40 ] %iv.next = add nuw nsw i64 %iv, 1, !dbg !4121 %24 = sub i32 %value_phi119, %value_phi120, !dbg !4121 %25 = sub i32 64, %24, !dbg !4123 %26 = zext i32 %25 to i64, !dbg !4125 %27 = icmp ugt i32 %25, 63, !dbg !4125 %notmask = shl nsw i64 -1, %26, !dbg !4123 %.op = xor i64 %notmask, -1, !dbg !4123 %28 = select i1 %27, i64 -1, i64 %.op, !dbg !4123 %29 = and i64 %28, %value_phi121, !dbg !4126 %30 = xor i64 %29, %value_phi121, !dbg !4128 %31 = call i64 @llvm.ctpop.i64(i64 %30) #91, !dbg !4129, !range !1713 %32 = trunc i64 %31 to i32, !dbg !4131 %33 = add i32 %value_phi120, %32, !dbg !4132 %.not185 = icmp eq i32 %33, 0, !dbg !4133 br i1 %.not185, label %L61, label %L40, !dbg !4134 L61: ; preds = %L40 %34 = xor i64 %29, -1, !dbg !4135 %35 = and i64 %v.i, %34, !dbg !4137 store atomic i64 %35, i64* %p.i release, align 16, !dbg !4138, 
!noalias !4139 br label %L72, !dbg !4142 L72: ; preds = %L61, %L24 %value_phi60 = phi i32 [ %15, %L61 ], [ %19, %L24 ] %value_phi61 = phi i64 [ %29, %L61 ], [ %v.i, %L24 ] %36 = icmp sgt i32 %value_phi60, 0, !dbg !4143 br i1 %36, label %L133.lr.ph, label %L393, !dbg !4144 L133.lr.ph: ; preds = %L72 %37 = zext i32 %value_phi60 to i64, !dbg !4145 %38 = add nuw nsw i64 %37, 1, !dbg !4162 %39 = udiv i64 %unbox, %38, !dbg !4164 %40 = mul i64 %39, %38, !dbg !4165 %41 = sub i64 %unbox, %40, !dbg !4167 %42 = addrspacecast {} addrspace(10)* %0 to {} addrspace(11)*, !dbg !4168 %43 = call nonnull {}* @julia.pointer_from_objref({} addrspace(11)* noundef %42) #93, !dbg !4168 %44 = bitcast {}* %43 to i8**, !dbg !4168 %arrayptr64 = load i8*, i8** %44, align 8, !dbg !4168, !tbaa !95, !alias.scope !313, !noalias !314, !nonnull !90 %45 = ptrtoint i8* %arrayptr64 to i64, !dbg !4168 %46 = addrspacecast {} addrspace(10)* %0 to {} addrspace(10)* addrspace(11)*, !dbg !4178 %arraysize_ptr65 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %46, i64 3, !dbg !4178 %47 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr65 to i64 addrspace(11)*, !dbg !4178 %arraysize66 = load i64, i64 addrspace(11)* %47, align 8, !dbg !4178, !tbaa !95, !range !131, !alias.scope !313, !noalias !314 %arraysize_ptr67 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %46, i64 4, !dbg !4178 %48 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr67 to i64 addrspace(11)*, !dbg !4178 %arraysize68 = load i64, i64 addrspace(11)* %48, align 16, !dbg !4178, !tbaa !95, !range !131, !alias.scope !313, !noalias !314 %getfield_addr73 = getelementptr inbounds { [1 x {} addrspace(10)*] }, { [1 x {} addrspace(10)*] } addrspace(11)* %1, i64 0, i32 0, i64 0, !dbg !4184 %getfield74 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %getfield_addr73 unordered, align 8, !dbg !4184, !tbaa !95, !alias.scope !313, !noalias !314, !nonnull !90, 
!dereferenceable !315, !align !316 %49 = addrspacecast {} addrspace(10)* %getfield74 to {} addrspace(11)*, !dbg !4188 %50 = call nonnull {}* @julia.pointer_from_objref({} addrspace(11)* noundef %49) #93, !dbg !4188 %51 = bitcast {}* %50 to i8**, !dbg !4188 %arrayptr76 = load i8*, i8** %51, align 8, !dbg !4188, !tbaa !95, !alias.scope !313, !noalias !314, !nonnull !90 %52 = ptrtoint i8* %arrayptr76 to i64, !dbg !4188 %53 = addrspacecast {} addrspace(10)* %getfield74 to {} addrspace(10)* addrspace(11)*, !dbg !4195 %arraysize_ptr77 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %53, i64 3, !dbg !4195 %54 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr77 to i64 addrspace(11)*, !dbg !4195 %arraysize78 = load i64, i64 addrspace(11)* %54, align 8, !dbg !4195, !tbaa !95, !range !131, !alias.scope !313, !noalias !314 %arraysize_ptr79 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %53, i64 4, !dbg !4195 %55 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr79 to i64 addrspace(11)*, !dbg !4195 %arraysize80 = load i64, i64 addrspace(11)* %55, align 16, !dbg !4195, !tbaa !95, !range !131, !alias.scope !313, !noalias !314 %56 = insertvalue [1 x {} addrspace(10)*] zeroinitializer, {} addrspace(10)* %getfield74, 0, !dbg !4201 %57 = load i64, i64 addrspace(11)* %10, align 8, !dbg !4202, !tbaa !95, !alias.scope !313, !noalias !314 %newstruct87.sroa.0.0..sroa_idx = getelementptr inbounds { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }, { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }* %4, i64 0, i32 0, i32 0, !dbg !4203 store i64 %45, i64* %newstruct87.sroa.0.0..sroa_idx, align 16, !dbg !4203, !tbaa !340, !alias.scope !1043, !noalias !4204 %newstruct87.sroa.2.0..sroa_idx134 = getelementptr inbounds { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }, { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }* %4, i64 0, i32 0, i32 1, i64 0, !dbg !4203 store i64 
%arraysize66, i64* %newstruct87.sroa.2.0..sroa_idx134, align 8, !dbg !4203, !tbaa !340, !alias.scope !1043, !noalias !4204 %newstruct87.sroa.3.0..sroa_idx135 = getelementptr inbounds { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }, { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }* %4, i64 0, i32 0, i32 1, i64 1, !dbg !4203 store i64 %arraysize68, i64* %newstruct87.sroa.3.0..sroa_idx135, align 16, !dbg !4203, !tbaa !340, !alias.scope !1043, !noalias !4204 %newstruct87.sroa.4.0..sroa_idx136 = getelementptr inbounds { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }, { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }* %4, i64 0, i32 1, i64 0, !dbg !4203 store i64 %57, i64* %newstruct87.sroa.4.0..sroa_idx136, align 8, !dbg !4203, !tbaa !340, !alias.scope !1043, !noalias !4204 %newstruct87.sroa.5.0..sroa_idx137 = getelementptr inbounds { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }, { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }* %4, i64 0, i32 2, i32 0, i64 0, i32 0, !dbg !4203 store i64 %52, i64* %newstruct87.sroa.5.0..sroa_idx137, align 16, !dbg !4203, !tbaa !340, !alias.scope !1043, !noalias !4204 %newstruct87.sroa.6.0..sroa_idx138 = getelementptr inbounds { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }, { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }* %4, i64 0, i32 2, i32 0, i64 0, i32 1, i64 0, !dbg !4203 store i64 %arraysize78, i64* %newstruct87.sroa.6.0..sroa_idx138, align 8, !dbg !4203, !tbaa !340, !alias.scope !1043, !noalias !4204 %newstruct87.sroa.7.0..sroa_idx139 = getelementptr inbounds { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }, { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }* %4, i64 0, i32 2, i32 0, i64 0, i32 1, i64 1, !dbg !4203 store i64 %arraysize80, i64* %newstruct87.sroa.7.0..sroa_idx139, align 16, !dbg !4203, !tbaa !340, !alias.scope !1043, !noalias !4204 %58 = call token (...) 
@llvm.julia.gc_preserve_begin({} addrspace(10)* nonnull %0, [1 x {} addrspace(10)*] %56) #91, !dbg !4175 %59 = icmp sgt i64 %41, -1 br label %L133, !dbg !4205 L133: ; preds = %L187, %L133.lr.ph %iv1 = phi i64 [ %iv.next2, %L187 ], [ 0, %L133.lr.ph ] %value_phi95200 = phi i64 [ %value_phi61, %L133.lr.ph ], [ %72, %L187 ] %value_phi93198 = phi i64 [ 0, %L133.lr.ph ], [ %66, %L187 ] %value_phi92197 = phi i32 [ 0, %L133.lr.ph ], [ %68, %L187 ] %iv.next2 = add nuw nsw i64 %iv1, 1, !dbg !4206 %60 = icmp ne i64 %value_phi95200, 0, !dbg !4206 call void @llvm.assume(i1 noundef %60) #91, !dbg !4209 %61 = call i64 @llvm.cttz.i64(i64 %value_phi95200, i1 noundef true) #91, !dbg !4210, !range !1713 %62 = trunc i64 %61 to i32, !dbg !4212 %63 = icmp ugt i64 %41, %iv1, !dbg !4213 %not.ifelse_cond96 = and i1 %59, %63, !dbg !4217 %64 = zext i1 %not.ifelse_cond96 to i64, !dbg !4217 %65 = add i64 %value_phi93198, %39, !dbg !4217 %66 = add i64 %65, %64, !dbg !4218 %67 = add nuw nsw i32 %62, 1, !dbg !4219 %68 = add i32 %67, %value_phi92197, !dbg !4221 %69 = zext i32 %67 to i64, !dbg !4223 %70 = lshr i64 %value_phi95200, %69, !dbg !4223 %71 = icmp eq i32 %62, 63, !dbg !4223 %72 = select i1 %71, i64 0, i64 %70, !dbg !4223 %73 = load i64, i64* inttoptr (i64 125797243527104 to i64*), align 64, !dbg !4225, !tbaa !247, !alias.scope !117, !noalias !120 %74 = shl i32 %68, 9, !dbg !4231 %75 = zext i32 %74 to i64, !dbg !4232 %76 = inttoptr i64 %73 to i8*, !dbg !4236 %77 = getelementptr i8, i8* %76, i64 %75, !dbg !4236 %78 = getelementptr i8, i8* %77, i64 8, !dbg !4237 %coercion = bitcast i8* %78 to i64*, !dbg !4243 store i64 ptrtoint (void (i64)* @jlcapi_BatchClosure_2763 to i64), i64* %coercion, align 1, !dbg !4243, !tbaa !331, !alias.scope !117, !noalias !4247 %79 = getelementptr i8, i8* %77, i64 16, !dbg !4248 %80 = bitcast i8* %79 to { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }**, !dbg !4252 store { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }* %4, { { i64, 
[2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }** %80, align 1, !dbg !4252, !tbaa !331, !alias.scope !117, !noalias !4247 %81 = getelementptr i8, i8* %77, i64 24, !dbg !4256 %coercion98 = bitcast i8* %81 to i64*, !dbg !4260 store i64 %value_phi93198, i64* %coercion98, align 1, !dbg !4260, !tbaa !331, !alias.scope !117, !noalias !4247 %82 = getelementptr i8, i8* %77, i64 32, !dbg !4264 %coercion99 = bitcast i8* %82 to i64*, !dbg !4268 store i64 %66, i64* %coercion99, align 1, !dbg !4268, !tbaa !331, !alias.scope !117, !noalias !4247 %p.i128 = bitcast i8* %77 to i32*, !dbg !4272 %v.i129 = atomicrmw xchg i32* %p.i128, i32 0 acq_rel, align 4, !dbg !4272 %.not178 = icmp eq i32 %v.i129, 1, !dbg !4275 br i1 %.not178, label %L184, label %L187, !dbg !4276 L184: ; preds = %L133 call fastcc void @julia_wake_thread__2921(i32 zeroext %68) #91, !dbg !4276 br label %L187, !dbg !4276 L187: ; preds = %L184, %L133 %83 = icmp eq i64 %iv.next2, %37, !dbg !4277 br i1 %83, label %L189, label %L133, !dbg !4205 L189: ; preds = %L187 %84 = add i64 %66, 1, !dbg !4279 %.not179 = icmp sgt i64 %84, %unbox, !dbg !4281 %value_phi101 = select i1 %.not179, i64 %66, i64 %unbox, !dbg !4283 %.not180 = icmp sgt i64 %84, %value_phi101, !dbg !4287 %85 = shl i64 %arraysize66, 2, !dbg !4297 %86 = mul i64 %85, %66, !dbg !4307 %87 = getelementptr i8, i8* %arrayptr64, i64 %86, !dbg !4309 %88 = sub i64 %value_phi101, %66, !dbg !4310 %89 = select i1 %.not180, i64 0, i64 %88, !dbg !4310 %90 = shl i64 %arraysize78, 2, !dbg !4318 %91 = mul i64 %90, %66, !dbg !4329 %92 = getelementptr i8, i8* %arrayptr76, i64 %91, !dbg !4331 %93 = mul i64 %89, %arraysize66, !dbg !4332 %94 = call i64 @llvm.smax.i64(i64 %93, i64 noundef 0) #91, !dbg !4341 %.not181 = icmp slt i64 %93, 1, !dbg !4346 br i1 %.not181, label %L349, label %L301.preheader, !dbg !4347 L301.preheader: ; preds = %L189 br label %L301, !dbg !4348 L301: ; preds = %L301.preheader, %L301 %iv3 = phi i64 [ 0, %L301.preheader ], [ %iv.next4, %L301 ] %iv.next4 = 
add nuw nsw i64 %iv3, 1, !dbg !4349 %95 = shl i64 %iv3, 2, !dbg !4352 %96 = getelementptr i8, i8* %92, i64 %95, !dbg !4357 %coercion111 = bitcast i8* %96 to float*, !dbg !4358 %pointerref = load float, float* %coercion111, align 1, !dbg !4358, !tbaa !331, !alias.scope !117, !noalias !120 call void @llvm.lifetime.end.p0i8(i64 noundef 56, i8* noundef nonnull %.sub) #91 %97 = call fastcc float @julia_gelu_2739(float %pointerref) #91, !dbg !4355 %98 = getelementptr i8, i8* %87, i64 %95, !dbg !4362 %coercion112 = bitcast i8* %98 to float*, !dbg !4364 store float %97, float* %coercion112, align 1, !dbg !4364, !tbaa !331, !alias.scope !117, !noalias !4247 %exitcond202.not = icmp eq i64 %iv.next4, %94, !dbg !4368 br i1 %exitcond202.not, label %L349.loopexit, label %L301, !dbg !4348, !llvm.loop !4369 L349.loopexit: ; preds = %L301 br label %L349, !dbg !4370 L349: ; preds = %L349.loopexit, %L189 %99 = icmp eq i64 %value_phi61, 0, !dbg !4370 br i1 %99, label %L387, label %L355.preheader, !dbg !4372 L355.preheader: ; preds = %L349 br label %L355, !dbg !4373 L355: ; preds = %L355.preheader, %L385 %iv5 = phi i64 [ 0, %L355.preheader ], [ %iv.next6, %L385 ] %value_phi115194 = phi i64 [ %104, %L385 ], [ %value_phi61, %L355.preheader ] %value_phi114193 = phi i32 [ %106, %L385 ], [ 0, %L355.preheader ] %iv.next6 = add nuw nsw i64 %iv5, 1, !dbg !4376 %100 = call i64 @llvm.cttz.i64(i64 %value_phi115194, i1 noundef true) #91, !dbg !4376, !range !1713 %101 = trunc i64 %100 to i32, !dbg !4378 %102 = add nuw nsw i32 %101, 1, !dbg !4379 %103 = zext i32 %102 to i64, !dbg !4381 %104 = lshr i64 %value_phi115194, %103, !dbg !4381 %105 = icmp eq i32 %101, 63, !dbg !4381 %106 = add i32 %102, %value_phi114193, !dbg !4383 %107 = load i64, i64* inttoptr (i64 125797243527104 to i64*), align 64, !dbg !4385, !tbaa !247, !alias.scope !117, !noalias !120 %108 = shl i32 %106, 9, !dbg !4388 %109 = zext i32 %108 to i64, !dbg !4389 %110 = inttoptr i64 %107 to i8*, !dbg !4393 %111 = getelementptr i8, i8* 
%110, i64 %109, !dbg !4393 %p.i130 = bitcast i8* %111 to i32*, !dbg !4394 %v.i131190 = load atomic i32, i32* %p.i130 acquire, align 16, !dbg !4394 %.not183191 = icmp eq i32 %v.i131190, 0, !dbg !4396 br i1 %.not183191, label %L375.preheader, label %L385, !dbg !4373 L375.preheader: ; preds = %L355 br label %L375, !dbg !4397 L375: ; preds = %L375.preheader, %L382 %iv7 = phi i64 [ 0, %L375.preheader ], [ %iv.next8, %L382 ] %112 = trunc i64 %iv7 to i32 %iv.next8 = add nuw nsw i64 %iv7, 1 call void @llvm.lifetime.end.p0i8(i64 noundef 56, i8* noundef nonnull %.sub) #91 call void asm sideeffect "pause", "~{memory}"() #94, !dbg !4398 %113 = add i32 %112, 1, !dbg !4400 %114 = icmp ult i32 %113, 65537, !dbg !4401 br i1 %114, label %L382, label %L379, !dbg !4397 L379: ; preds = %L375 %115 = call fastcc i8 @julia_checktask_2772(i32 zeroext %106) #91, !dbg !4403 %116 = and i8 %115, 1, !dbg !4403 %.not184 = icmp eq i8 %116, 0, !dbg !4403 br i1 %.not184, label %L382, label %L385.loopexit, !dbg !4403 L382: ; preds = %L379, %L375 %v.i131 = load atomic i32, i32* %p.i130 acquire, align 16, !dbg !4394 %.not183 = icmp eq i32 %v.i131, 0, !dbg !4396 br i1 %.not183, label %L375, label %L385.loopexit, !dbg !4373 L385.loopexit: ; preds = %L379, %L382 br label %L385, !dbg !4370 L385: ; preds = %L385.loopexit, %L355 %117 = icmp eq i64 %104, 0, !dbg !4370 %118 = select i1 %105, i1 true, i1 %117, !dbg !4370 br i1 %118, label %L387.loopexit, label %L355, !dbg !4372 L387.loopexit: ; preds = %L385 br label %L387, !dbg !4404 L387: ; preds = %L387.loopexit, %L349 %v.i133 = atomicrmw or i64* %p.i, i64 %value_phi61 acq_rel, align 8, !dbg !4404 br label %L616, !dbg !4407 L393: ; preds = %L72, %L14, %L6 %119 = call i64 @llvm.smax.i64(i64 %unbox, i64 noundef 0) #91, !dbg !4408 %.not173.inv = icmp sgt i64 %unbox, 0, !dbg !4411 %value_phi7 = select i1 %.not173.inv, i64 %119, i64 0, !dbg !4411 %120 = addrspacecast {} addrspace(10)* %0 to {} addrspace(10)* addrspace(11)*, !dbg !4419 %arraysize_ptr = 
getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %120, i64 3, !dbg !4419 %121 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr to i64 addrspace(11)*, !dbg !4419 %arraysize = load i64, i64 addrspace(11)* %121, align 8, !dbg !4419, !tbaa !95, !range !131, !alias.scope !313, !noalias !314 %memcpy_refined_dst14 = getelementptr inbounds { [1 x [1 x i64]], [2 x i64] }, { [1 x [1 x i64]], [2 x i64] }* %newstruct13, i64 0, i32 0, i64 0, i64 0, !dbg !4425 store i64 %arraysize, i64* %memcpy_refined_dst14, align 8, !dbg !4425, !tbaa !397, !alias.scope !399, !noalias !4427 %newstruct8.sroa.0.0..sroa_idx = getelementptr inbounds { [1 x [1 x i64]], [2 x i64] }, { [1 x [1 x i64]], [2 x i64] }* %newstruct13, i64 0, i32 1, i64 0, !dbg !4425 store i64 1, i64* %newstruct8.sroa.0.0..sroa_idx, align 8, !dbg !4425, !tbaa !397, !alias.scope !399, !noalias !4427 %newstruct8.sroa.5.0..sroa_idx146 = getelementptr inbounds { [1 x [1 x i64]], [2 x i64] }, { [1 x [1 x i64]], [2 x i64] }* %newstruct13, i64 0, i32 1, i64 1, !dbg !4425 store i64 %value_phi7, i64* %newstruct8.sroa.5.0..sroa_idx146, align 8, !dbg !4425, !tbaa !397, !alias.scope !399, !noalias !4427 %arraysize_ptr15 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %120, i64 4, !dbg !4428 %122 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr15 to i64 addrspace(11)*, !dbg !4428 %arraysize16 = load i64, i64 addrspace(11)* %122, align 16, !dbg !4428, !tbaa !95, !range !131, !alias.scope !313, !noalias !314 %123 = icmp eq i64 %value_phi7, 0, !dbg !4432 %124 = add nsw i64 %value_phi7, -1, !dbg !4438 %125 = icmp ult i64 %124, %arraysize16, !dbg !4440 %126 = or i1 %123, %125, !dbg !4441 br i1 %126, label %L464, label %L461, !dbg !4431 L461: ; preds = %L393 %127 = addrspacecast { [1 x [1 x i64]], [2 x i64] }* %newstruct13 to { [1 x [1 x i64]], [2 x i64] } addrspace(11)*, !dbg !4431 call fastcc void @julia_throw_boundserror_2928({} addrspace(10)* nofree noundef nonnull 
align 16 dereferenceable(40) %0, { [1 x [1 x i64]], [2 x i64] } addrspace(11)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(24) %127) #95, !dbg !4431 unreachable, !dbg !4431 L464: ; preds = %L393 %getfield_addr = getelementptr inbounds { [1 x {} addrspace(10)*] }, { [1 x {} addrspace(10)*] } addrspace(11)* %1, i64 0, i32 0, i64 0, !dbg !4442 %getfield = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %getfield_addr unordered, align 8, !dbg !4442, !tbaa !95, !alias.scope !313, !noalias !314, !nonnull !90, !dereferenceable !315, !align !316 %128 = addrspacecast {} addrspace(10)* %getfield to {} addrspace(10)* addrspace(11)*, !dbg !4446 %arraysize_ptr25 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %128, i64 3, !dbg !4446 %129 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr25 to i64 addrspace(11)*, !dbg !4446 %arraysize26 = load i64, i64 addrspace(11)* %129, align 8, !dbg !4446, !tbaa !95, !range !131, !alias.scope !313, !noalias !314 %memcpy_refined_dst32 = getelementptr inbounds { [1 x [1 x i64]], [2 x i64] }, { [1 x [1 x i64]], [2 x i64] }* %newstruct30, i64 0, i32 0, i64 0, i64 0, !dbg !4451 store i64 %arraysize26, i64* %memcpy_refined_dst32, align 8, !dbg !4451, !tbaa !397, !alias.scope !399, !noalias !4427 %newstruct8.sroa.0.0..sroa_idx142 = getelementptr inbounds { [1 x [1 x i64]], [2 x i64] }, { [1 x [1 x i64]], [2 x i64] }* %newstruct30, i64 0, i32 1, i64 0, !dbg !4451 store i64 1, i64* %newstruct8.sroa.0.0..sroa_idx142, align 8, !dbg !4451, !tbaa !397, !alias.scope !399, !noalias !4427 %newstruct8.sroa.5.0..sroa_idx147 = getelementptr inbounds { [1 x [1 x i64]], [2 x i64] }, { [1 x [1 x i64]], [2 x i64] }* %newstruct30, i64 0, i32 1, i64 1, !dbg !4451 store i64 %value_phi7, i64* %newstruct8.sroa.5.0..sroa_idx147, align 8, !dbg !4451, !tbaa !397, !alias.scope !399, !noalias !4427 %arraysize_ptr33 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %128, i64 
4, !dbg !4453 %130 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr33 to i64 addrspace(11)*, !dbg !4453 %arraysize34 = load i64, i64 addrspace(11)* %130, align 16, !dbg !4453, !tbaa !95, !range !131, !alias.scope !313, !noalias !314 %131 = icmp ult i64 %124, %arraysize34, !dbg !4457 %132 = or i1 %123, %131, !dbg !4462 br i1 %132, label %L503, label %L500, !dbg !4456 L500: ; preds = %L464 %133 = addrspacecast { [1 x [1 x i64]], [2 x i64] }* %newstruct30 to { [1 x [1 x i64]], [2 x i64] } addrspace(11)*, !dbg !4456 call fastcc void @julia_throw_boundserror_2928({} addrspace(10)* nofree noundef nonnull align 16 dereferenceable(40) %getfield, { [1 x [1 x i64]], [2 x i64] } addrspace(11)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(24) %133) #95, !dbg !4456 unreachable, !dbg !4456 L503: ; preds = %L464 %134 = mul i64 %arraysize, %value_phi7, !dbg !4463 %135 = call i64 @llvm.smax.i64(i64 %134, i64 noundef 0) #91, !dbg !4472 %.not174 = icmp slt i64 %134, 1, !dbg !4477 br i1 %.not174, label %L616, label %L552.lr.ph, !dbg !4478 L552.lr.ph: ; preds = %L503 %136 = addrspacecast {} addrspace(10)* %getfield to float addrspace(13)* addrspace(11)* %137 = addrspacecast {} addrspace(10)* %0 to float addrspace(13)* addrspace(11)* br label %L552, !dbg !4479 L552: ; preds = %L552, %L552.lr.ph %iv9 = phi i64 [ %iv.next10, %L552 ], [ 0, %L552.lr.ph ] %iv.next10 = add nuw nsw i64 %iv9, 1, !dbg !4480 %arrayptr176 = load float addrspace(13)*, float addrspace(13)* addrspace(11)* %136, align 16, !dbg !4483, !tbaa !95, !alias.scope !4487, !noalias !314, !llvm.mem.parallel_loop_access !4488, !nonnull !90 %138 = getelementptr inbounds float, float addrspace(13)* %arrayptr176, i64 %iv9, !dbg !4483 %arrayref = load float, float addrspace(13)* %138, align 4, !dbg !4483, !tbaa !177, !alias.scope !117, !noalias !120, !llvm.mem.parallel_loop_access !4488 %139 = call fastcc float @julia_gelu_2739(float %arrayref) #91, !dbg !4485, !llvm.mem.parallel_loop_access !4488 
%arrayptr54177 = load float addrspace(13)*, float addrspace(13)* addrspace(11)* %137, align 16, !dbg !4490, !tbaa !95, !alias.scope !4487, !noalias !314, !llvm.mem.parallel_loop_access !4488, !nonnull !90 %140 = getelementptr inbounds float, float addrspace(13)* %arrayptr54177, i64 %iv9, !dbg !4490 store float %139, float addrspace(13)* %140, align 4, !dbg !4490, !tbaa !177, !alias.scope !117, !noalias !4247, !llvm.mem.parallel_loop_access !4488 %exitcond.not = icmp eq i64 %iv.next10, %135, !dbg !4492 br i1 %exitcond.not, label %L616.loopexit, label %L552, !dbg !4479, !llvm.loop !4489 L616.loopexit: ; preds = %L552 br label %L616 L616: ; preds = %L616.loopexit, %L503, %L387, %top call void @llvm.lifetime.end.p0i8(i64 noundef 56, i8* noundef nonnull %.sub) #91 ret void, !dbg !4493 } ; Function Attrs: mustprogress willreturn define internal fastcc void @diffejulia_fast_materialize_threaded__2755({} addrspace(10)* align 16 dereferenceable(40) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@float, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" "enzymejl_parmtype"="125797725781712" "enzymejl_parmtype_ref"="2" %0, {} addrspace(10)* align 16 "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@float, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, 
[-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" "enzymejl_parmtype"="125797725781712" "enzymejl_parmtype_ref"="2" %"'", { [1 x {} addrspace(10)*] } addrspace(11)* nocapture nofree readonly align 8 dereferenceable(8) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,0]:Pointer, [-1,0,0,-1]:Float@float, [-1,0,8]:Integer, [-1,0,9]:Integer, [-1,0,10]:Integer, [-1,0,11]:Integer, [-1,0,12]:Integer, [-1,0,13]:Integer, [-1,0,14]:Integer, [-1,0,15]:Integer, [-1,0,16]:Integer, [-1,0,17]:Integer, [-1,0,18]:Integer, [-1,0,19]:Integer, [-1,0,20]:Integer, [-1,0,21]:Integer, [-1,0,22]:Integer, [-1,0,23]:Integer, [-1,0,24]:Integer, [-1,0,25]:Integer, [-1,0,26]:Integer, [-1,0,27]:Integer, [-1,0,28]:Integer, [-1,0,29]:Integer, [-1,0,30]:Integer, [-1,0,31]:Integer, [-1,0,32]:Integer, [-1,0,33]:Integer, [-1,0,34]:Integer, [-1,0,35]:Integer, [-1,0,36]:Integer, [-1,0,37]:Integer, [-1,0,38]:Integer, [-1,0,39]:Integer}" "enzymejl_parmtype"="125797729165136" "enzymejl_parmtype_ref"="1" %1, { [1 x {} addrspace(10)*] } addrspace(11)* nocapture nofree align 8 "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,0]:Pointer, [-1,0,0,-1]:Float@float, [-1,0,8]:Integer, [-1,0,9]:Integer, [-1,0,10]:Integer, [-1,0,11]:Integer, [-1,0,12]:Integer, [-1,0,13]:Integer, [-1,0,14]:Integer, [-1,0,15]:Integer, [-1,0,16]:Integer, [-1,0,17]:Integer, [-1,0,18]:Integer, [-1,0,19]:Integer, [-1,0,20]:Integer, [-1,0,21]:Integer, [-1,0,22]:Integer, [-1,0,23]:Integer, [-1,0,24]:Integer, [-1,0,25]:Integer, [-1,0,26]:Integer, [-1,0,27]:Integer, [-1,0,28]:Integer, [-1,0,29]:Integer, [-1,0,30]:Integer, [-1,0,31]:Integer, [-1,0,32]:Integer, [-1,0,33]:Integer, [-1,0,34]:Integer, [-1,0,35]:Integer, [-1,0,36]:Integer, [-1,0,37]:Integer, [-1,0,38]:Integer, [-1,0,39]:Integer}" 
"enzymejl_parmtype"="125797729165136" "enzymejl_parmtype_ref"="1" %"'1", [2 x [1 x i64]] addrspace(11)* nocapture nofree readonly align 8 dereferenceable(16) "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,0]:Integer, [-1,1]:Integer, [-1,2]:Integer, [-1,3]:Integer, [-1,4]:Integer, [-1,5]:Integer, [-1,6]:Integer, [-1,7]:Integer, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer}" "enzymejl_parmtype"="125797567845648" "enzymejl_parmtype_ref"="1" %2, { i8*, {} addrspace(10)*, i8*, {} addrspace(10)*, i64, i64, i32*, i64, i64, {} addrspace(10)*, i64*, i1*, i64, float*, i64*, i1*, i1**, i1**, i64, float* } %tapeArg) unnamed_addr #88 !dbg !5064 { top: %_replacementA9 = phi i8* %_replacementA8 = phi { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }* %.sub_replacementA = phi i8* %_replacementA7 = phi i8* %newstruct13_replacementA = phi { [1 x [1 x i64]], [2 x i64] }* %_replacementA6 = phi i8* %newstruct30_replacementA = phi { [1 x [1 x i64]], [2 x i64] }* %3 = call {}*** @julia.get_pgcstack() #91 %ptls_field170_replacementA = phi {}*** %_replacementA5 = phi i64*** %ptls_load171172_replacementA = phi i64** %_replacementA4 = phi i64** %safepoint_replacementA = phi i64* %_replacementA = phi i64 addrspace(11)* , !dbg !5065 %4 = call i64 @julia_nthreads_2932() #92, !dbg !5067 %unbox = load i64, i64 addrspace(11)* %_replacementA, align 8, !dbg !5068, !tbaa !95, !alias.scope !5072, !noalias !5075 %5 = icmp slt i64 %unbox, 1, !dbg !5068 br i1 %5, label %L616, label %L6, !dbg !5070 L6: ; preds = %top %6 = call i64 @llvm.smin.i64(i64 %unbox, i64 %4) #91, !dbg !5077 %.not = icmp eq i64 %6, 0, !dbg !5078 br i1 %.not, label %L393, label %L14, !dbg !5079 L14: ; preds = %L6 %7 = trunc i64 %6 to i32, !dbg !5080 %8 = add i32 %7, -1, !dbg !5080 %_replacementA10 = phi {}* , !dbg !5084 %9 = icmp sgt i32 %8, 0, !dbg !5086 br i1 %9, label %L24, label %L393, !dbg !5087 L24: ; preds = %L14 
%p.i_replacementA = phi i64* , !dbg !5089 %v.i_replacementA = phi i64 , !dbg !5089 %10 = call i64 @llvm.ctpop.i64(i64 %v.i_replacementA) #91, !dbg !5092, !range !1713 %11 = trunc i64 %10 to i32, !dbg !5094 %12 = sub nsw i32 %8, %11, !dbg !5095 %13 = icmp slt i32 %12, 0, !dbg !5097 br i1 %13, label %L37, label %L72, !dbg !5100 L37: ; preds = %L24 %_replacementA12 = phi i64 , !dbg !5101 %_replacementA11 = phi i32 , !dbg !5103 br label %L40, !dbg !5103 L40: ; preds = %L40, %L37 %iv = phi i64 [ %iv.next, %L40 ], [ 0, %L37 ] %value_phi119_replacementA = phi i32 %value_phi120_replacementA = phi i32 %value_phi121_replacementA = phi i64 %iv.next = add nuw nsw i64 %iv, 1, !dbg !5104 %_replacementA21 = phi i32 , !dbg !5104 %_replacementA20 = phi i32 , !dbg !5106 %_replacementA19 = phi i64 , !dbg !5108 %_replacementA18 = phi i1 , !dbg !5108 %notmask_replacementA = phi i64 , !dbg !5106 %.op_replacementA = phi i64 , !dbg !5106 %_replacementA17 = phi i64 , !dbg !5106 %_replacementA16 = phi i64 , !dbg !5109 %_replacementA15 = phi i64 , !dbg !5111 %_replacementA14 = phi i64 , !dbg !5112 %_replacementA13 = phi i32 , !dbg !5114 %14 = add i32 %value_phi120_replacementA, %_replacementA13, !dbg !5115 %.not185 = icmp eq i32 %14, 0, !dbg !5116 br i1 %.not185, label %L61, label %L40, !dbg !5117 L61: ; preds = %L40 %_replacementA23 = phi i64 , !dbg !5118 %_replacementA22 = phi i64 , !dbg !5120 br label %L72, !dbg !5121 L72: ; preds = %L61, %L24 %value_phi60 = phi i32 [ %8, %L61 ], [ %11, %L24 ] %value_phi61 = phi i64 [ %_replacementA16, %L61 ], [ %v.i_replacementA, %L24 ] %15 = icmp sgt i32 %value_phi60, 0, !dbg !5122 br i1 %15, label %L133.lr.ph, label %L393, !dbg !5123 L133.lr.ph: ; preds = %L72 %16 = zext i32 %value_phi60 to i64, !dbg !5124 %17 = add nuw nsw i64 %16, 1, !dbg !5141 %18 = udiv i64 %unbox, %17, !dbg !5143 %19 = mul i64 %18, %17, !dbg !5144 %20 = sub i64 %unbox, %19, !dbg !5146 %21 = addrspacecast {} addrspace(10)* %0 to {} addrspace(11)*, !dbg !5147 %22 = call nonnull {}* 
@julia.pointer_from_objref({} addrspace(11)* noundef %21) #93, !dbg !5147 %"'ip_phi" = phi {}* , !dbg !5147 %23 = bitcast {}* %22 to i8**, !dbg !5147 %arrayptr64 = load i8*, i8** %23, align 8, !dbg !5147, !tbaa !95, !alias.scope !313, !noalias !314, !nonnull !90 %"arrayptr64'il_phi" = phi i8* , !dbg !5147 %24 = ptrtoint i8* %arrayptr64 to i64, !dbg !5147 %25 = addrspacecast {} addrspace(10)* %0 to {} addrspace(10)* addrspace(11)*, !dbg !5157 %arraysize_ptr65 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %25, i64 3, !dbg !5157 %26 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr65 to i64 addrspace(11)*, !dbg !5157 %arraysize66 = load i64, i64 addrspace(11)* %26, align 8, !dbg !5157, !tbaa !95, !range !131, !alias.scope !313, !noalias !314 %arraysize_ptr67 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %25, i64 4, !dbg !5157 %27 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr67 to i64 addrspace(11)*, !dbg !5157 %arraysize68 = load i64, i64 addrspace(11)* %27, align 16, !dbg !5157, !tbaa !95, !range !131, !alias.scope !313, !noalias !314 %getfield_addr73 = getelementptr inbounds { [1 x {} addrspace(10)*] }, { [1 x {} addrspace(10)*] } addrspace(11)* %1, i64 0, i32 0, i64 0, !dbg !5163 %getfield74 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %getfield_addr73 unordered, align 8, !dbg !5163, !tbaa !95, !alias.scope !313, !noalias !314, !nonnull !90, !dereferenceable !315, !align !316 %"getfield74'il_phi" = phi {} addrspace(10)* , !dbg !5167 %28 = addrspacecast {} addrspace(10)* %getfield74 to {} addrspace(11)*, !dbg !5167 %29 = call nonnull {}* @julia.pointer_from_objref({} addrspace(11)* noundef %28) #93, !dbg !5167 %"'ip_phi2" = phi {}* , !dbg !5167 %30 = bitcast {}* %29 to i8**, !dbg !5167 %arrayptr76 = load i8*, i8** %30, align 8, !dbg !5167, !tbaa !95, !alias.scope !313, !noalias !314, !nonnull !90 %"arrayptr76'il_phi" = phi i8* , !dbg !5167 %31 = ptrtoint i8* 
%arrayptr76 to i64, !dbg !5167 %32 = addrspacecast {} addrspace(10)* %getfield74 to {} addrspace(10)* addrspace(11)*, !dbg !5174 %arraysize_ptr77 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %32, i64 3, !dbg !5174 %33 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr77 to i64 addrspace(11)*, !dbg !5174 %arraysize78 = load i64, i64 addrspace(11)* %33, align 8, !dbg !5174, !tbaa !95, !range !131, !alias.scope !313, !noalias !314 %arraysize_ptr79 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %32, i64 4, !dbg !5174 %34 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr79 to i64 addrspace(11)*, !dbg !5174 %arraysize80 = load i64, i64 addrspace(11)* %34, align 16, !dbg !5174, !tbaa !95, !range !131, !alias.scope !313, !noalias !314 %35 = insertvalue [1 x {} addrspace(10)*] zeroinitializer, {} addrspace(10)* %getfield74, 0, !dbg !5180 %36 = load i64, i64 addrspace(11)* %_replacementA, align 8, !dbg !5181, !tbaa !95, !alias.scope !313, !noalias !314 %newstruct87.sroa.0.0..sroa_idx = getelementptr inbounds { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }, { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }* %_replacementA8, i64 0, i32 0, i32 0, !dbg !5182 store i64 %24, i64* %newstruct87.sroa.0.0..sroa_idx, align 16, !dbg !5182, !tbaa !340, !alias.scope !1043, !noalias !5183 %newstruct87.sroa.2.0..sroa_idx134 = getelementptr inbounds { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }, { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }* %_replacementA8, i64 0, i32 0, i32 1, i64 0, !dbg !5182 store i64 %arraysize66, i64* %newstruct87.sroa.2.0..sroa_idx134, align 8, !dbg !5182, !tbaa !340, !alias.scope !1043, !noalias !5183 %newstruct87.sroa.3.0..sroa_idx135 = getelementptr inbounds { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }, { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }* %_replacementA8, i64 0, i32 0, i32 1, 
i64 1, !dbg !5182 store i64 %arraysize68, i64* %newstruct87.sroa.3.0..sroa_idx135, align 16, !dbg !5182, !tbaa !340, !alias.scope !1043, !noalias !5183 %newstruct87.sroa.4.0..sroa_idx136 = getelementptr inbounds { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }, { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }* %_replacementA8, i64 0, i32 1, i64 0, !dbg !5182 store i64 %36, i64* %newstruct87.sroa.4.0..sroa_idx136, align 8, !dbg !5182, !tbaa !340, !alias.scope !1043, !noalias !5183 %newstruct87.sroa.5.0..sroa_idx137 = getelementptr inbounds { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }, { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }* %_replacementA8, i64 0, i32 2, i32 0, i64 0, i32 0, !dbg !5182 store i64 %31, i64* %newstruct87.sroa.5.0..sroa_idx137, align 16, !dbg !5182, !tbaa !340, !alias.scope !1043, !noalias !5183 %newstruct87.sroa.6.0..sroa_idx138 = getelementptr inbounds { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }, { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }* %_replacementA8, i64 0, i32 2, i32 0, i64 0, i32 1, i64 0, !dbg !5182 store i64 %arraysize78, i64* %newstruct87.sroa.6.0..sroa_idx138, align 8, !dbg !5182, !tbaa !340, !alias.scope !1043, !noalias !5183 %newstruct87.sroa.7.0..sroa_idx139 = getelementptr inbounds { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }, { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }* %_replacementA8, i64 0, i32 2, i32 0, i64 0, i32 1, i64 1, !dbg !5182 store i64 %arraysize80, i64* %newstruct87.sroa.7.0..sroa_idx139, align 16, !dbg !5182, !tbaa !340, !alias.scope !1043, !noalias !5183 %37 = call token (...) @llvm.julia.gc_preserve_begin({} addrspace(10)* nonnull %0, [1 x {} addrspace(10)*] %35) #91, !dbg !5154 %"'ip" = call token (...) 
@llvm.julia.gc_preserve_begin(), !dbg !5154 %38 = icmp sgt i64 %20, -1 %39 = add nsw i64 %16, -1, !dbg !5186 br label %L133, !dbg !5186 L133: ; preds = %L187, %L133.lr.ph %iv1 = phi i64 [ %iv.next2, %L187 ], [ 0, %L133.lr.ph ] %value_phi95200 = phi i64 [ %value_phi61, %L133.lr.ph ], [ %52, %L187 ] %value_phi93198 = phi i64 [ 0, %L133.lr.ph ], [ %46, %L187 ] %value_phi92197 = phi i32 [ 0, %L133.lr.ph ], [ %48, %L187 ] %iv.next2 = add nuw nsw i64 %iv1, 1, !dbg !5187 %40 = icmp ne i64 %value_phi95200, 0, !dbg !5187 call void @llvm.assume(i1 noundef %40) #91, !dbg !5190 %41 = call i64 @llvm.cttz.i64(i64 %value_phi95200, i1 noundef true) #91, !dbg !5191, !range !1713 %42 = trunc i64 %41 to i32, !dbg !5193 %43 = icmp ugt i64 %20, %iv1, !dbg !5194 %not.ifelse_cond96 = and i1 %38, %43, !dbg !5198 %44 = zext i1 %not.ifelse_cond96 to i64, !dbg !5198 %45 = add i64 %value_phi93198, %18, !dbg !5198 %46 = add i64 %45, %44, !dbg !5199 %47 = add nuw nsw i32 %42, 1, !dbg !5200 %48 = add i32 %47, %value_phi92197, !dbg !5202 %49 = zext i32 %47 to i64, !dbg !5204 %50 = lshr i64 %value_phi95200, %49, !dbg !5204 %51 = icmp eq i32 %42, 63, !dbg !5204 %52 = select i1 %51, i64 0, i64 %50, !dbg !5204 %53 = load i64, i64* inttoptr (i64 125797243527104 to i64*), align 64, !dbg !5206, !tbaa !247, !alias.scope !117, !noalias !120 %"'il_phi" = phi i64 , !dbg !5212 %54 = shl i32 %48, 9, !dbg !5212 %55 = zext i32 %54 to i64, !dbg !5213 %56 = inttoptr i64 %53 to i8*, !dbg !5217 %57 = getelementptr i8, i8* %56, i64 %55, !dbg !5217 %58 = getelementptr i8, i8* %57, i64 8, !dbg !5218 %coercion = bitcast i8* %58 to i64*, !dbg !5224 store i64 ptrtoint (void (i64)* @jlcapi_BatchClosure_2763 to i64), i64* %coercion, align 1, !dbg !5224, !tbaa !331, !alias.scope !117, !noalias !5228 %59 = getelementptr i8, i8* %57, i64 16, !dbg !5229 %60 = bitcast i8* %59 to { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }**, !dbg !5233 store { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }* 
%_replacementA8, { { i64, [2 x i64] }, [1 x i64], { [1 x { i64, [2 x i64] }] } }** %60, align 1, !dbg !5233, !tbaa !331, !alias.scope !117, !noalias !5228 %61 = getelementptr i8, i8* %57, i64 24, !dbg !5237 %coercion98 = bitcast i8* %61 to i64*, !dbg !5241 store i64 %value_phi93198, i64* %coercion98, align 1, !dbg !5241, !tbaa !331, !alias.scope !117, !noalias !5228 %62 = getelementptr i8, i8* %57, i64 32, !dbg !5245 %coercion99 = bitcast i8* %62 to i64*, !dbg !5249 store i64 %46, i64* %coercion99, align 1, !dbg !5249, !tbaa !331, !alias.scope !117, !noalias !5228 %p.i128 = bitcast i8* %57 to i32*, !dbg !5253 %v.i129 = atomicrmw xchg i32* %p.i128, i32 0 acq_rel, align 4, !dbg !5253 %.not178 = icmp eq i32 %v.i129, 1, !dbg !5256 br i1 %.not178, label %L184, label %L187, !dbg !5257 L184: ; preds = %L133 call fastcc void @julia_wake_thread__2921(i32 zeroext %48) #91, !dbg !5257 br label %L187, !dbg !5257 L187: ; preds = %L184, %L133 %63 = icmp eq i64 %iv.next2, %16, !dbg !5258 br i1 %63, label %L189, label %L133, !dbg !5186 L189: ; preds = %L187 %64 = add i64 %46, 1, !dbg !5260 %.not179 = icmp sgt i64 %64, %unbox, !dbg !5262 %value_phi101 = select i1 %.not179, i64 %46, i64 %unbox, !dbg !5264 %.not180 = icmp sgt i64 %64, %value_phi101, !dbg !5268 %65 = shl i64 %arraysize66, 2, !dbg !5278 %66 = mul i64 %65, %46, !dbg !5288 %67 = getelementptr i8, i8* %arrayptr64, i64 %66, !dbg !5290 %68 = sub i64 %value_phi101, %46, !dbg !5291 %69 = select i1 %.not180, i64 0, i64 %68, !dbg !5291 %70 = shl i64 %arraysize78, 2, !dbg !5299 %71 = mul i64 %70, %46, !dbg !5310 %72 = getelementptr i8, i8* %arrayptr76, i64 %71, !dbg !5312 %73 = mul i64 %69, %arraysize66, !dbg !5313 %74 = call i64 @llvm.smax.i64(i64 %73, i64 noundef 0) #91, !dbg !5322 %.not181 = icmp slt i64 %73, 1, !dbg !5327 br i1 %.not181, label %L349, label %L301.preheader, !dbg !5328 L301.preheader: ; preds = %L189 %75 = add nsw i64 %74, -1, !dbg !5329 br label %L301, !dbg !5329 L301: ; preds = %L301, %L301.preheader %iv3 = 
phi i64 [ 0, %L301.preheader ], [ %iv.next4, %L301 ] %iv.next4 = add nuw nsw i64 %iv3, 1, !dbg !5330 %76 = shl i64 %iv3, 2, !dbg !5333 %77 = getelementptr i8, i8* %72, i64 %76, !dbg !5338 %coercion111 = bitcast i8* %77 to float*, !dbg !5339 %pointerref = load float, float* %coercion111, align 1, !dbg !5339, !tbaa !331, !alias.scope !117, !noalias !120 call void @llvm.lifetime.end.p0i8(i64 noundef 56, i8* noundef nonnull %.sub_replacementA) #91 %78 = call fastcc float @julia_gelu_2739(float %pointerref) #91, !dbg !5336 %79 = getelementptr i8, i8* %67, i64 %76, !dbg !5343 %coercion112 = bitcast i8* %79 to float*, !dbg !5345 store float %78, float* %coercion112, align 1, !dbg !5345, !tbaa !331, !alias.scope !117, !noalias !5228 %exitcond202.not = icmp eq i64 %iv.next4, %74, !dbg !5349 br i1 %exitcond202.not, label %L349.loopexit, label %L301, !dbg !5329, !llvm.loop !5350 L349.loopexit: ; preds = %L301 br label %L349, !dbg !5351 L349: ; preds = %L349.loopexit, %L189 %80 = icmp eq i64 %value_phi61, 0, !dbg !5351 br i1 %80, label %L387, label %L355.preheader, !dbg !5353 L355.preheader: ; preds = %L349 br label %L355, !dbg !5354 L355: ; preds = %L385, %L355.preheader %iv5 = phi i64 [ 0, %L355.preheader ], [ %iv.next6, %L385 ] %value_phi115194 = phi i64 [ %85, %L385 ], [ %value_phi61, %L355.preheader ] %value_phi114193 = phi i32 [ %87, %L385 ], [ 0, %L355.preheader ] %iv.next6 = add nuw nsw i64 %iv5, 1, !dbg !5357 %81 = call i64 @llvm.cttz.i64(i64 %value_phi115194, i1 noundef true) #91, !dbg !5357, !range !1713 %82 = trunc i64 %81 to i32, !dbg !5359 %83 = add nuw nsw i32 %82, 1, !dbg !5360 %84 = zext i32 %83 to i64, !dbg !5362 %85 = lshr i64 %value_phi115194, %84, !dbg !5362 %86 = icmp eq i32 %82, 63, !dbg !5362 %87 = add i32 %83, %value_phi114193, !dbg !5364 %88 = load i64, i64* inttoptr (i64 125797243527104 to i64*), align 64, !dbg !5366, !tbaa !247, !alias.scope !117, !noalias !120 %"'il_phi3" = phi i64 , !dbg !5369 %89 = shl i32 %87, 9, !dbg !5369 %90 = zext i32 %89 to 
i64, !dbg !5370 %91 = inttoptr i64 %88 to i8*, !dbg !5374 %92 = getelementptr i8, i8* %91, i64 %90, !dbg !5374 %p.i130 = bitcast i8* %92 to i32*, !dbg !5375 %v.i131190 = load atomic i32, i32* %p.i130 acquire, align 16, !dbg !5375 %"v.i131190'il_phi" = phi i32 , !dbg !5377 %.not183191 = icmp eq i32 %v.i131190, 0, !dbg !5377 br i1 %.not183191, label %L375.preheader, label %L385, !dbg !5354 L375.preheader: ; preds = %L355 br label %L375, !dbg !5378 L375: ; preds = %L382, %L375.preheader %iv7 = phi i64 [ 0, %L375.preheader ], [ %iv.next8, %L382 ] %iv.next8 = add nuw nsw i64 %iv7, 1 %93 = trunc i64 %iv7 to i32 call void @llvm.lifetime.end.p0i8(i64 noundef 56, i8* noundef nonnull %.sub_replacementA) #91 call void asm sideeffect "pause", "~{memory}"() #94, !dbg !5379 %94 = add i32 %93, 1, !dbg !5381 %95 = icmp ult i32 %94, 65537, !dbg !5382 br i1 %95, label %L382, label %L379, !dbg !5378 L379: ; preds = %L375 %96 = call fastcc i8 @julia_checktask_2772(i32 zeroext %87) #91, !dbg !5384 %97 = and i8 %96, 1, !dbg !5384 %.not184 = icmp eq i8 %97, 0, !dbg !5384 br i1 %.not184, label %L382, label %L385.loopexit, !dbg !5384 L382: ; preds = %L379, %L375 %v.i131 = load atomic i32, i32* %p.i130 acquire, align 16, !dbg !5375 %"v.i131'il_phi" = phi i32 , !dbg !5377 %.not183 = icmp eq i32 %v.i131, 0, !dbg !5377 br i1 %.not183, label %L375, label %L385.loopexit, !dbg !5354 L385.loopexit: ; preds = %L382, %L379 br label %L385, !dbg !5351 L385: ; preds = %L385.loopexit, %L355 %98 = icmp eq i64 %85, 0, !dbg !5351 %99 = select i1 %86, i1 true, i1 %98, !dbg !5351 br i1 %99, label %L387.loopexit, label %L355, !dbg !5353 L387.loopexit: ; preds = %L385 br label %L387, !dbg !5385 L387: ; preds = %L387.loopexit, %L349 %v.i133 = atomicrmw or i64* %p.i_replacementA, i64 %value_phi61 acq_rel, align 8, !dbg !5385 br label %L616, !dbg !5388 L393: ; preds = %L72, %L14, %L6 %100 = call i64 @llvm.smax.i64(i64 %unbox, i64 noundef 0) #91, !dbg !5389 %.not173.inv = icmp sgt i64 %unbox, 0, !dbg !5392 
%value_phi7 = select i1 %.not173.inv, i64 %100, i64 0, !dbg !5392 %101 = addrspacecast {} addrspace(10)* %0 to {} addrspace(10)* addrspace(11)*, !dbg !5400 %arraysize_ptr = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %101, i64 3, !dbg !5400 %102 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr to i64 addrspace(11)*, !dbg !5400 %arraysize = load i64, i64 addrspace(11)* %102, align 8, !dbg !5400, !tbaa !95, !range !131, !alias.scope !313, !noalias !314 %memcpy_refined_dst14 = getelementptr inbounds { [1 x [1 x i64]], [2 x i64] }, { [1 x [1 x i64]], [2 x i64] }* %newstruct13_replacementA, i64 0, i32 0, i64 0, i64 0, !dbg !5406 store i64 %arraysize, i64* %memcpy_refined_dst14, align 8, !dbg !5406, !tbaa !397, !alias.scope !399, !noalias !5408 %newstruct8.sroa.0.0..sroa_idx = getelementptr inbounds { [1 x [1 x i64]], [2 x i64] }, { [1 x [1 x i64]], [2 x i64] }* %newstruct13_replacementA, i64 0, i32 1, i64 0, !dbg !5406 store i64 1, i64* %newstruct8.sroa.0.0..sroa_idx, align 8, !dbg !5406, !tbaa !397, !alias.scope !399, !noalias !5408 %newstruct8.sroa.5.0..sroa_idx146 = getelementptr inbounds { [1 x [1 x i64]], [2 x i64] }, { [1 x [1 x i64]], [2 x i64] }* %newstruct13_replacementA, i64 0, i32 1, i64 1, !dbg !5406 store i64 %value_phi7, i64* %newstruct8.sroa.5.0..sroa_idx146, align 8, !dbg !5406, !tbaa !397, !alias.scope !399, !noalias !5408 %arraysize_ptr15 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %101, i64 4, !dbg !5409 %103 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr15 to i64 addrspace(11)*, !dbg !5409 %arraysize16 = load i64, i64 addrspace(11)* %103, align 16, !dbg !5409, !tbaa !95, !range !131, !alias.scope !313, !noalias !314 %104 = icmp eq i64 %value_phi7, 0, !dbg !5413 %105 = add nsw i64 %value_phi7, -1, !dbg !5419 %106 = icmp ult i64 %105, %arraysize16, !dbg !5421 %107 = or i1 %104, %106, !dbg !5422 br i1 %107, label %L464, label %L461, !dbg !5412 L461: ; preds = %L393 %108 
= addrspacecast { [1 x [1 x i64]], [2 x i64] }* %newstruct13_replacementA to { [1 x [1 x i64]], [2 x i64] } addrspace(11)*, !dbg !5412 call fastcc void @julia_throw_boundserror_2928({} addrspace(10)* nofree noundef nonnull align 16 dereferenceable(40) %0, { [1 x [1 x i64]], [2 x i64] } addrspace(11)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(24) %108) #95, !dbg !5412 unreachable, !dbg !5412 L464: ; preds = %L393 %getfield_addr = getelementptr inbounds { [1 x {} addrspace(10)*] }, { [1 x {} addrspace(10)*] } addrspace(11)* %1, i64 0, i32 0, i64 0, !dbg !5423 %getfield = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %getfield_addr unordered, align 8, !dbg !5423, !tbaa !95, !alias.scope !313, !noalias !314, !nonnull !90, !dereferenceable !315, !align !316 %"getfield'il_phi" = phi {} addrspace(10)* , !dbg !5427 %109 = addrspacecast {} addrspace(10)* %getfield to {} addrspace(10)* addrspace(11)*, !dbg !5427 %arraysize_ptr25 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %109, i64 3, !dbg !5427 %110 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr25 to i64 addrspace(11)*, !dbg !5427 %arraysize26 = load i64, i64 addrspace(11)* %110, align 8, !dbg !5427, !tbaa !95, !range !131, !alias.scope !313, !noalias !314 %memcpy_refined_dst32 = getelementptr inbounds { [1 x [1 x i64]], [2 x i64] }, { [1 x [1 x i64]], [2 x i64] }* %newstruct30_replacementA, i64 0, i32 0, i64 0, i64 0, !dbg !5432 store i64 %arraysize26, i64* %memcpy_refined_dst32, align 8, !dbg !5432, !tbaa !397, !alias.scope !399, !noalias !5408 %newstruct8.sroa.0.0..sroa_idx142 = getelementptr inbounds { [1 x [1 x i64]], [2 x i64] }, { [1 x [1 x i64]], [2 x i64] }* %newstruct30_replacementA, i64 0, i32 1, i64 0, !dbg !5432 store i64 1, i64* %newstruct8.sroa.0.0..sroa_idx142, align 8, !dbg !5432, !tbaa !397, !alias.scope !399, !noalias !5408 %newstruct8.sroa.5.0..sroa_idx147 = getelementptr inbounds { [1 x [1 x i64]], [2 x i64] }, { [1 x 
[1 x i64]], [2 x i64] }* %newstruct30_replacementA, i64 0, i32 1, i64 1, !dbg !5432 store i64 %value_phi7, i64* %newstruct8.sroa.5.0..sroa_idx147, align 8, !dbg !5432, !tbaa !397, !alias.scope !399, !noalias !5408 %arraysize_ptr33 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %109, i64 4, !dbg !5434 %111 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr33 to i64 addrspace(11)*, !dbg !5434 %arraysize34 = load i64, i64 addrspace(11)* %111, align 16, !dbg !5434, !tbaa !95, !range !131, !alias.scope !313, !noalias !314 %112 = icmp ult i64 %105, %arraysize34, !dbg !5438 %113 = or i1 %104, %112, !dbg !5443 br i1 %113, label %L503, label %L500, !dbg !5437 L500: ; preds = %L464 %114 = addrspacecast { [1 x [1 x i64]], [2 x i64] }* %newstruct30_replacementA to { [1 x [1 x i64]], [2 x i64] } addrspace(11)*, !dbg !5437 call fastcc void @julia_throw_boundserror_2928({} addrspace(10)* nofree noundef nonnull align 16 dereferenceable(40) %getfield, { [1 x [1 x i64]], [2 x i64] } addrspace(11)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(24) %114) #95, !dbg !5437 unreachable, !dbg !5437 L503: ; preds = %L464 %115 = mul i64 %arraysize, %value_phi7, !dbg !5444 %116 = call i64 @llvm.smax.i64(i64 %115, i64 noundef 0) #91, !dbg !5453 %.not174 = icmp slt i64 %115, 1, !dbg !5458 br i1 %.not174, label %L616, label %L552.lr.ph, !dbg !5459 L552.lr.ph: ; preds = %L503 %117 = addrspacecast {} addrspace(10)* %getfield to float addrspace(13)* addrspace(11)* %118 = addrspacecast {} addrspace(10)* %0 to float addrspace(13)* addrspace(11)* %119 = add nsw i64 %116, -1, !dbg !5460 br label %L552, !dbg !5460 L552: ; preds = %L552, %L552.lr.ph %iv9 = phi i64 [ %iv.next10, %L552 ], [ 0, %L552.lr.ph ] %iv.next10 = add nuw nsw i64 %iv9, 1, !dbg !5461 %arrayptr176 = load float addrspace(13)*, float addrspace(13)* addrspace(11)* %117, align 16, !dbg !5464, !tbaa !95, !alias.scope !5468, !noalias !314, !llvm.mem.parallel_loop_access !5469, !nonnull 
!90 %"arrayptr176'il_phi" = phi float addrspace(13)* , !dbg !5464 %120 = getelementptr inbounds float, float addrspace(13)* %arrayptr176, i64 %iv9, !dbg !5464 %arrayref = load float, float addrspace(13)* %120, align 4, !dbg !5464, !tbaa !177, !alias.scope !117, !noalias !120, !llvm.mem.parallel_loop_access !5469 %121 = call fastcc float @julia_gelu_2739(float %arrayref) #91, !dbg !5466, !llvm.mem.parallel_loop_access !5469 %arrayptr54177 = load float addrspace(13)*, float addrspace(13)* addrspace(11)* %118, align 16, !dbg !5471, !tbaa !95, !alias.scope !5468, !noalias !314, !llvm.mem.parallel_loop_access !5469, !nonnull !90 %"arrayptr54177'il_phi" = phi float addrspace(13)* , !dbg !5471 %122 = getelementptr inbounds float, float addrspace(13)* %arrayptr54177, i64 %iv9, !dbg !5471 store float %121, float addrspace(13)* %122, align 4, !dbg !5471, !tbaa !177, !alias.scope !117, !noalias !5228, !llvm.mem.parallel_loop_access !5469 %exitcond.not = icmp eq i64 %iv.next10, %116, !dbg !5473 br i1 %exitcond.not, label %L616.loopexit, label %L552, !dbg !5460, !llvm.loop !5470 L616.loopexit: ; preds = %L552 br label %L616 L616: ; preds = %L616.loopexit, %L503, %L387, %top call void @llvm.lifetime.end.p0i8(i64 noundef 56, i8* noundef nonnull %.sub_replacementA) #91 br label %invertL616, !dbg !5474 allocsForInversion: ; No predecessors! 
%"iv'ac" = alloca i64, align 8 %"iv1'ac" = alloca i64, align 8 %"iv3'ac" = alloca i64, align 8 %"iv5'ac" = alloca i64, align 8 %"iv7'ac" = alloca i64, align 8 %"iv9'ac" = alloca i64, align 8 inverttop: ; preds = %invertL6 fence syncscope("singlethread") seq_cst fence syncscope("singlethread") seq_cst ret void invertL6: ; preds = %invertL14 br label %inverttop invertL14: ; preds = %invertL24 br label %invertL6 invertL24: ; preds = %invertL37 br label %invertL14 invertL37: ; preds = %invertL40 br label %invertL24 invertL40: ; preds = %mergeinvertL40_L61, %incinvertL40 %123 = load i64, i64* %"iv'ac", align 8 %124 = icmp eq i64 %123, 0 %125 = xor i1 %124, true br i1 %124, label %invertL37, label %incinvertL40 incinvertL40: ; preds = %invertL40 %126 = load i64, i64* %"iv'ac", align 8 %127 = add nsw i64 %126, -1 store i64 %127, i64* %"iv'ac", align 8 br label %invertL40 invertL61: ; No predecessors! br label %mergeinvertL40_L61 mergeinvertL40_L61: ; preds = %invertL61 store i64 0, i64* %"iv'ac", align 8 br label %invertL40 invertL72: ; No predecessors! %128 = call i64 @llvm.smin.i64(i64 %unbox, i64 %4) #91, !dbg !5077 %_unwrap = trunc i64 %128 to i32 %_unwrap24 = add i32 %_unwrap, -1 invertL133.lr.ph: ; No predecessors! invertL133: ; No predecessors! invertL184: ; No predecessors! invertL187: ; No predecessors! invertL189: ; No predecessors! invertL301.preheader: ; No predecessors! invertL301: ; No predecessors! invertL349.loopexit: ; No predecessors! invertL349: ; No predecessors! invertL355.preheader: ; No predecessors! invertL355: ; No predecessors! invertL375.preheader: ; No predecessors! invertL375: ; No predecessors! invertL379: ; No predecessors! invertL382: ; No predecessors! invertL385.loopexit: ; No predecessors! invertL385: ; No predecessors! invertL387.loopexit: ; No predecessors! invertL387: ; No predecessors! invertL393: ; No predecessors! invertL461: ; No predecessors! invertL464: ; No predecessors! invertL500: ; No predecessors! 
invertL503:                                       ; No predecessors!

invertL552.lr.ph:                                 ; No predecessors!

invertL552:                                       ; No predecessors!

invertL616.loopexit:                              ; No predecessors!

invertL616:                                       ; preds = %L616
}

  %v.i_replacementA = phi i64 , !dbg !146
julia: /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:3791: bool GradientUtils::legalRecompute(const llvm::Value*, const ValueToValueMapTy&, llvm::IRBuilder<>*, bool, bool) const: Assertion `phi->getNumIncomingValues() != 0' failed.

[840325] signal (6.-6): Aborted
in expression starting at REPL[10]:1
unknown function (ip: 0x72699a25b32c)
gsignal at /usr/lib/libc.so.6 (unknown line)
abort at /usr/lib/libc.so.6 (unknown line)
unknown function (ip: 0x72699a1f23db)
__assert_fail at /usr/lib/libc.so.6 (unknown line)
legalRecompute at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:3791
lookupM at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:6535
unwrapM at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:1327
lookupM at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:6537
unwrapM at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:930
lookupM at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:6537
unwrapM at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:1066
lookupM at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:6537
unwrapM at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:1088
lookupM at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:6537
branchToCorrespondingTarget at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:7738
createInvertedTerminator at /workspace/srcdir/Enzyme/enzyme/Enzyme/EnzymeLogic.cpp:3611
CreatePrimalAndGradient at /workspace/srcdir/Enzyme/enzyme/Enzyme/EnzymeLogic.cpp:4382
recursivelyHandleSubfunction at /workspace/srcdir/Enzyme/enzyme/Enzyme/AdjointGenerator.h:5744
visitCallInst at /workspace/srcdir/Enzyme/enzyme/Enzyme/AdjointGenerator.h:6611
visit at
/opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/InstVisitor.h:111 [inlined] CreatePrimalAndGradient at /workspace/srcdir/Enzyme/enzyme/Enzyme/EnzymeLogic.cpp:4378 EnzymeCreatePrimalAndGradient at /workspace/srcdir/Enzyme/enzyme/Enzyme/CApi.cpp:615 EnzymeCreatePrimalAndGradient at /home/avikpal/.julia/packages/Enzyme/wOi4l/src/api.jl:154 unknown function (ip: 0x72696410805b) _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 enzyme! at /home/avikpal/.julia/packages/Enzyme/wOi4l/src/compiler.jl:3147 unknown function (ip: 0x726964103918) _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 #codegen#487 at /home/avikpal/.julia/packages/Enzyme/wOi4l/src/compiler.jl:5022 codegen at /home/avikpal/.julia/packages/Enzyme/wOi4l/src/compiler.jl:4444 [inlined] _thunk at /home/avikpal/.julia/packages/Enzyme/wOi4l/src/compiler.jl:5707 _thunk at /home/avikpal/.julia/packages/Enzyme/wOi4l/src/compiler.jl:5707 [inlined] cached_compilation at /home/avikpal/.julia/packages/Enzyme/wOi4l/src/compiler.jl:5741 [inlined] #532 at /home/avikpal/.julia/packages/Enzyme/wOi4l/src/compiler.jl:5807 #JuliaContext#149 at /home/avikpal/.julia/packages/GPUCompiler/kqxyC/src/driver.jl:52 unknown function (ip: 0x726964d58b36) _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 JuliaContext at /home/avikpal/.julia/packages/GPUCompiler/kqxyC/src/driver.jl:42 #s1946#531 at /home/avikpal/.julia/packages/Enzyme/wOi4l/src/compiler.jl:5759 [inlined] #s1946#531 at ./none:0 _jl_invoke at 
/cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 GeneratedFunctionStub at ./boot.jl:602 _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 jl_call_staged at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/method.c:540 ijl_code_for_staged at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/method.c:593 get_staged at ./compiler/utilities.jl:123 retrieve_code_info at ./compiler/utilities.jl:135 [inlined] InferenceState at ./compiler/inferencestate.jl:430 typeinf_edge at ./compiler/typeinfer.jl:920 abstract_call_method at ./compiler/abstractinterpretation.jl:629 abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:95 abstract_call_known at ./compiler/abstractinterpretation.jl:2087 abstract_call at ./compiler/abstractinterpretation.jl:2169 abstract_call at ./compiler/abstractinterpretation.jl:2162 abstract_call at ./compiler/abstractinterpretation.jl:2354 abstract_eval_call at ./compiler/abstractinterpretation.jl:2370 abstract_eval_statement_expr at ./compiler/abstractinterpretation.jl:2380 abstract_eval_statement at ./compiler/abstractinterpretation.jl:2624 abstract_eval_basic_statement at ./compiler/abstractinterpretation.jl:2889 typeinf_local at ./compiler/abstractinterpretation.jl:3098 typeinf_nocycle at ./compiler/abstractinterpretation.jl:3186 _typeinf at ./compiler/typeinfer.jl:247 typeinf at ./compiler/typeinfer.jl:216 typeinf_edge at ./compiler/typeinfer.jl:930 abstract_call_method at ./compiler/abstractinterpretation.jl:629 abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:95 abstract_call_known at ./compiler/abstractinterpretation.jl:2087 abstract_call at ./compiler/abstractinterpretation.jl:2169 abstract_apply at 
./compiler/abstractinterpretation.jl:1612 abstract_call_known at ./compiler/abstractinterpretation.jl:2004 abstract_call at ./compiler/abstractinterpretation.jl:2169 abstract_call at ./compiler/abstractinterpretation.jl:2162 abstract_call at ./compiler/abstractinterpretation.jl:2354 abstract_eval_call at ./compiler/abstractinterpretation.jl:2370 abstract_eval_statement_expr at ./compiler/abstractinterpretation.jl:2380 abstract_eval_statement at ./compiler/abstractinterpretation.jl:2624 abstract_eval_basic_statement at ./compiler/abstractinterpretation.jl:2913 typeinf_local at ./compiler/abstractinterpretation.jl:3098 typeinf_nocycle at ./compiler/abstractinterpretation.jl:3186 _typeinf at ./compiler/typeinfer.jl:247 typeinf at ./compiler/typeinfer.jl:216 typeinf_ext at ./compiler/typeinfer.jl:1051 typeinf_ext_toplevel at ./compiler/typeinfer.jl:1082 typeinf_ext_toplevel at ./compiler/typeinfer.jl:1078 jfptr_typeinf_ext_toplevel_35682.1 at /home/avikpal/.julia/juliaup/julia-1.10.3+0.x64.linux.gnu/lib/julia/sys.so (unknown line) _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 jl_apply at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined] jl_type_infer at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:394 jl_generate_fptr_impl at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/jitlayers.cpp:504 jl_compile_method_internal at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2481 [inlined] jl_compile_method_internal at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2368 _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2887 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 jl_apply at 
/cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined] do_call at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/interpreter.c:126 eval_value at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/interpreter.c:223 eval_stmt_value at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/interpreter.c:174 [inlined] eval_body at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/interpreter.c:617 jl_interpret_toplevel_thunk at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/interpreter.c:775 jl_toplevel_eval_flex at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/toplevel.c:934 jl_toplevel_eval_flex at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/toplevel.c:877 eval_body at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/interpreter.c:579 eval_body at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/interpreter.c:544 jl_interpret_toplevel_thunk at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/interpreter.c:775 jl_toplevel_eval_flex at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/toplevel.c:934 jl_toplevel_eval_flex at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/toplevel.c:877 jl_toplevel_eval_flex at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/toplevel.c:877 ijl_toplevel_eval_in at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/toplevel.c:985 eval at ./boot.jl:385 [inlined] eval_user_input at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:150 repl_backend_loop at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:246 #start_repl_backend#46 at 
/cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:231 start_repl_backend at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:228 _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 #run_repl#59 at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:389 run_repl at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:375 jfptr_run_repl_91734.1 at /home/avikpal/.julia/juliaup/julia-1.10.3+0.x64.linux.gnu/lib/julia/sys.so (unknown line) _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 #1013 at ./client.jl:432 jfptr_YY.1013_82700.1 at /home/avikpal/.julia/juliaup/julia-1.10.3+0.x64.linux.gnu/lib/julia/sys.so (unknown line) _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 jl_apply at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined] jl_f__call_latest at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/builtins.c:812 #invokelatest#2 at ./essentials.jl:892 [inlined] invokelatest at ./essentials.jl:889 [inlined] run_main_repl at ./client.jl:416 exec_options at ./client.jl:333 _start at ./client.jl:552 jfptr__start_82726.1 at /home/avikpal/.julia/juliaup/julia-1.10.3+0.x64.linux.gnu/lib/julia/sys.so (unknown line) _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at 
/cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 jl_apply at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined] true_main at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/jlapi.c:582 jl_repl_entrypoint at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/jlapi.c:731 main at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/cli/loader_exe.c:58 unknown function (ip: 0x72699a1f3ccf) __libc_start_main at /usr/lib/libc.so.6 (unknown line) unknown function (ip: 0x4010b8) Allocations: 60085108 (Pool: 60011990; Big: 73118); GC: 58 [1] + 840325 IOT instruction (core dumped) julia --threads=auto --project=. ```
wsmoses commented 6 months ago

@avik-pal are you able to reduce the Dense layer to more fundamental function calls and still trigger this?

avik-pal commented 6 months ago

using LuxLib, Enzyme

y = randn(Float32, 10, 10)
b = randn(Float32, 10)
act = gelu

function loss_function(act, y, b)
    return sum(LuxLib.__apply_bias_activation!!(act, y, b, Val(false)))
end

loss_function(act, y, b)

begin
    dy = Enzyme.make_zero(y)
    db = Enzyme.make_zero(b)

    Enzyme.autodiff(Enzyme.Reverse, loss_function, Active, Const(act),
        Duplicated(y, dy), Duplicated(b, db))
end

Now I get

ERROR: LLVM error: function failed verification (4)

without the Julia crash.

wsmoses commented 6 months ago

For better or worse, that's a different error. I'll try to repro the last one and see if it's a quick fix. For both errors, can you include the full logs?

avik-pal commented 6 months ago

using Enzyme, Polyester

y = randn(Float32, 10, 10)
b = randn(Float32, 10)
act = x -> max(x, 0)

function __apply_bias_activation!!(σ::F, x, bias::Union{Nothing, AbstractArray}) where {F}
    f_fused = σ ∘ +
    if maximum(length, (x, bias)) > 100_000
        bc = Broadcast.instantiate(Broadcast.broadcasted(f_fused, x, bias))
        @batch for I in eachindex(bc)
            @inbounds x[I] = bc[I]
        end
    else
        @. x = f_fused(x, bias)
    end
    return x
    # return LuxLib.__nonuniform_fast_broadcast!(σ ∘ +, x, bias)
end

function loss_function(act, y, b)
    return sum(__apply_bias_activation!!(act, y, b))
end

loss_function(act, y, b)

begin
    dy = Enzyme.make_zero(y)
    db = Enzyme.make_zero(b)

    Enzyme.autodiff(Enzyme.Reverse, loss_function, Active,
        Const(act), Duplicated(y, dy), Duplicated(b, db))
end

A minimal version without Lux deps
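
For anyone checking a fix, it may help to know what `dy` and `db` should contain once the `autodiff` call above succeeds. The sketch below is not from the original report: `relu`, `reference_loss`, `reference_gradient`, and `finite_difference` are hypothetical helper names, and it assumes the in-place update differentiates like its out-of-place equivalent `sum(act.(y .+ b))` with `act = x -> max(x, 0)`. It uses only Base Julia (no Enzyme or Polyester) and a small deterministic input whose entries stay away from the kink at zero.

```julia
# Hypothetical reference check; helper names are illustrative, not from the issue.

relu(x) = max(x, zero(x))

# Out-of-place equivalent of the reproducer's loss (assumption: same math as
# `@. x = f_fused(x, bias)` followed by `sum`), so the inputs are not mutated.
reference_loss(act, y, b) = sum(act.(y .+ b))

# Analytic gradient for act = relu: the derivative is 1 where y + b > 0, else 0.
function reference_gradient(y, b)
    mask = map(z -> z > 0 ? one(z) : zero(z), y .+ b)
    dy = mask
    db = vec(sum(mask; dims=2))  # b is broadcast across columns, so sum over them
    return dy, db
end

# Central finite differences, used only to sanity-check the analytic gradient.
function finite_difference(f, x; h=1e-6)
    g = similar(x)
    for i in eachindex(x)
        xp = copy(x); xp[i] += h
        xm = copy(x); xm[i] -= h
        g[i] = (f(xp) - f(xm)) / (2 * h)
    end
    return g
end

# Deterministic inputs; every entry of y .+ b is well away from zero.
y = [sin(i + 2 * j) for i in 1:4, j in 1:3]
b = [0.1, -0.2, 0.3, -0.4]

dy, db = reference_gradient(y, b)
fd_y = finite_difference(yy -> reference_loss(relu, yy, b), y)
fd_b = finite_difference(bb -> reference_loss(relu, y, bb), b)

println(isapprox(dy, fd_y; atol=1e-6), " ", isapprox(db, fd_b; atol=1e-6))
```

Once Enzyme runs on the reproducer, comparing its `dy` and `db` against `reference_gradient` on a copy of the original `y` would be one way to confirm the result, under the assumption stated above.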

Crash Log ``` Function Attrs: mustprogress willreturn define internal fastcc void @preprocess_julia___apply_bias_activation___2450({} addrspace(10)* noundef nonnull align 16 dereferenceable(40) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@float, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" "enzymejl_parmtype"="137517761287184" "enzymejl_parmtype_ref"="2" %0, {} addrspace(10)* noundef nonnull align 16 dereferenceable(40) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@float, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" "enzymejl_parmtype"="137517789463248" "enzymejl_parmtype_ref"="2" %1) unnamed_addr #74 !dbg !3125 { top: %2 = call noalias nonnull dereferenceable(88) dereferenceable_or_null(88) i8* @malloc(i64 88), !enzyme_fromstack !483 %3 = bitcast i8* %2 to { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }*, !enzyme_caststack !63 %.sub = bitcast { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { 
i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %3 to i8* %4 = call {}*** @julia.get_pgcstack() #78 %current_task1530 = getelementptr inbounds {}**, {}*** %4, i64 -14 %current_task1 = bitcast {}*** %current_task1530 to {}** %ptls_field531 = getelementptr inbounds {}**, {}*** %4, i64 2 %5 = bitcast {}*** %ptls_field531 to i64*** %ptls_load532533 = load i64**, i64*** %5, align 8, !tbaa !64 %6 = getelementptr inbounds i64*, i64** %ptls_load532533, i64 2 %safepoint = load i64*, i64** %6, align 8, !tbaa !68 fence syncscope("singlethread") seq_cst call void @julia.safepoint(i64* %safepoint) #78, !dbg !3126 fence syncscope("singlethread") seq_cst %7 = addrspacecast {} addrspace(10)* %0 to {} addrspace(11)*, !dbg !3127 %8 = addrspacecast {} addrspace(10)* %0 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)*, !dbg !3127 %arraylen_ptr = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %8, i64 0, i32 1, !dbg !3127 %arraylen = load i64, i64 addrspace(11)* %arraylen_ptr, align 8, !dbg !3127, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %9 = addrspacecast {} addrspace(10)* %1 to {} addrspace(11)*, !dbg !3140 %10 = addrspacecast {} addrspace(10)* %1 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)*, !dbg !3140 %arraylen_ptr2 = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %10, i64 0, i32 1, !dbg !3140 %arraylen3 = load i64, i64 addrspace(11)* %arraylen_ptr2, align 8, !dbg !3140, !tbaa !250, !range !253, !alias.scope !254, !noalias !255 %11 = call i64 @llvm.umax.i64(i64 %arraylen3, i64 %arraylen) #78, !dbg !3143 %12 = icmp ult i64 %11, 100001, !dbg !3146 br i1 %12, label %L811, label %L7, !dbg !3139 L7: ; preds = %top %13 = addrspacecast {} addrspace(10)* %0 to {} addrspace(10)* addrspace(11)*, !dbg !3148 %arraysize_ptr = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* 
addrspace(11)* %13, i64 3, !dbg !3148 %14 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr to i64 addrspace(11)*, !dbg !3148 %arraysize = load i64, i64 addrspace(11)* %14, align 8, !dbg !3148, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %arraysize_ptr4 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %13, i64 4, !dbg !3148 %15 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr4 to i64 addrspace(11)*, !dbg !3148 %arraysize5 = load i64, i64 addrspace(11)* %15, align 16, !dbg !3148, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %16 = icmp eq i64 %arraylen3, %arraysize, !dbg !3153 %17 = icmp eq i64 %arraysize, 1, !dbg !3155 %value_phi = or i1 %16, %17, !dbg !3155 br i1 %value_phi, label %L40, label %L28, !dbg !3156 L28: ; preds = %L7 %.not559 = icmp eq i64 %arraylen3, 1, !dbg !3155 br i1 %.not559, label %L40, label %L36, !dbg !3156 L36: ; preds = %L28 %18 = call noalias nonnull "enzyme_inactive" {} addrspace(10)* @ijl_box_int64(i64 signext %arraysize) #79, !dbg !3156 %19 = call noalias nonnull "enzyme_inactive" {} addrspace(10)* @ijl_box_int64(i64 signext %arraylen3) #79, !dbg !3156 %20 = call nonnull {} addrspace(10)* ({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)*, {} addrspace(10)*, {} addrspace(10)*, ...) 
@julia.call2({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32, {} addrspace(10)*)* noundef nonnull @ijl_invoke, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 137517617037136 to {}*) to {} addrspace(10)*), {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 137517584367872 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 137517712977280 to {}*) to {} addrspace(10)*), {} addrspace(10)* nonnull %18, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 137517712977248 to {}*) to {} addrspace(10)*), {} addrspace(10)* nonnull %19) #80, !dbg !3156 %box = call noalias nonnull dereferenceable(8) "enzyme_inactive" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task1, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 137517631542736 to {}*) to {} addrspace(10)*)) #81, !dbg !3156 %21 = bitcast {} addrspace(10)* %box to [1 x {} addrspace(10)*] addrspace(10)*, !dbg !3156 %22 = getelementptr [1 x {} addrspace(10)*], [1 x {} addrspace(10)*] addrspace(10)* %21, i64 0, i64 0, !dbg !3156 store {} addrspace(10)* %20, {} addrspace(10)* addrspace(10)* %22, align 8, !dbg !3156, !tbaa !188, !alias.scope !83, !noalias !3159 %23 = addrspacecast {} addrspace(10)* %box to {} addrspace(12)*, !dbg !3156 call void @ijl_throw({} addrspace(12)* %23) #82, !dbg !3156 unreachable, !dbg !3156 L40: ; preds = %L28, %L7 %.sroa.0449.0 = phi i64 [ %arraylen3, %L7 ], [ %arraysize, %L28 ] %24 = call i64 @julia_nthreads_2651() #78, !dbg !3162 %.not = icmp eq i64 %24, 1, !dbg !3164 br i1 %.not, label %L62, label %L221, !dbg !3165 L62: ; preds = %L40 %25 = icmp ne i64 %.sroa.0449.0, 0, !dbg !3166 %26 = icmp ne i64 %arraysize5, 0, !dbg !3166 %.demorgan = and i1 %26, %25, !dbg !3170 br i1 %.demorgan, label %guard_exit374, label %L1042, !dbg !3170 L86: ; preds = %guard_exit379, %guard_exit374 %iv17 = phi i64 [ %iv.next18, %guard_exit379 ], [ 0, %guard_exit374 ], !dbg !3171 %arraylen64 = phi i64 [ %arraylen3, 
%guard_exit374 ], [ %arraylen64.pre, %guard_exit379 ], !dbg !3171 %nodecayed.arrayptr = phi {} addrspace(10)* [ %237, %guard_exit374 ], [ %240, %guard_exit379 ], !dbg !3180 %arraysize54 = phi i64 [ %arraysize5, %guard_exit374 ], [ %arraysize54.pre, %guard_exit379 ], !dbg !3183 %arraysize62 = phi i64 [ %arraysize, %guard_exit374 ], [ %arraysize72, %guard_exit379 ], !dbg !3180 %value_phi40 = phi i64 [ 1, %guard_exit374 ], [ %value_phi77572, %guard_exit379 ] %value_phi41 = phi i64 [ 1, %guard_exit374 ], [ %value_phi78573, %guard_exit379 ] %iv.next18 = add nuw nsw i64 %iv17, 1, !dbg !3186 %27 = bitcast {} addrspace(10)* %nodecayed.arrayptr to i8 addrspace(13)* addrspace(10)*, !dbg !3186 %28 = addrspacecast i8 addrspace(13)* addrspace(10)* %27 to i8 addrspace(13)* addrspace(11)*, !dbg !3186 %29 = load i8 addrspace(13)*, i8 addrspace(13)* addrspace(11)* %28, align 8, !dbg !3186 %.not534 = icmp eq i64 %arraysize62, 1, !dbg !3186 %.not535 = icmp eq i64 %arraysize54, 1, !dbg !3188 %value_phi40.op = add i64 %value_phi40, -1, !dbg !3180 %30 = select i1 %.not534, i64 0, i64 %value_phi40.op, !dbg !3180 %value_phi41.op = add i64 %value_phi41, -1, !dbg !3180 %31 = select i1 %.not535, i64 0, i64 %value_phi41.op, !dbg !3180 %32 = mul i64 %31, %arraysize62, !dbg !3180 %33 = add i64 %32, %30, !dbg !3180 %34 = bitcast i8 addrspace(13)* %29 to float addrspace(13)*, !dbg !3180 %35 = getelementptr inbounds float, float addrspace(13)* %34, i64 %33, !dbg !3180 %arrayref = load float, float addrspace(13)* %35, align 4, !dbg !3180, !tbaa !494, !alias.scope !83, !noalias !86 %.not536 = icmp eq i64 %arraylen64, 1, !dbg !3190 %36 = select i1 %.not536, i64 0, i64 %value_phi40.op, !dbg !3192 %arrayptr69538 = load float addrspace(13)*, float addrspace(13)* addrspace(11)* %234, align 16, !dbg !3192, !tbaa !271, !alias.scope !3194, !noalias !255, !nonnull !63 %37 = getelementptr inbounds float, float addrspace(13)* %arrayptr69538, i64 %36, !dbg !3192 %arrayref70 = load float, float addrspace(13)* 
%37, align 4, !dbg !3192, !tbaa !494, !alias.scope !83, !noalias !86 %38 = fadd float %arrayref, %arrayref70, !dbg !3195 %39 = call fastcc float @julia_gelu_2643(float %38) #78, !dbg !3197 %arraysize72 = load i64, i64 addrspace(11)* %14, align 8, !dbg !3202, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %40 = mul i64 %arraysize72, %value_phi41.op, !dbg !3202 %41 = add i64 %40, %value_phi40.op, !dbg !3202 %arrayptr75 = load i8 addrspace(13)*, i8 addrspace(13)* addrspace(11)* %arrayptr_ptr.phi.trans.insert, align 16, !dbg !3202, !tbaa !68, !alias.scope !3204, !noalias !337, !nonnull !63 %42 = bitcast i8 addrspace(13)* %arrayptr75 to float addrspace(13)*, !dbg !3202 %43 = getelementptr inbounds float, float addrspace(13)* %42, i64 %41, !dbg !3202 store float %39, float addrspace(13)* %43, align 4, !dbg !3202, !tbaa !494, !alias.scope !83, !noalias !3159 %44 = add i64 %value_phi40, 1, !dbg !3205 %45 = icmp ugt i64 %value_phi40, 9223372036854775806, !dbg !3208 %46 = icmp sgt i64 %44, %.sroa.0449.0, !dbg !3208 %47 = or i1 %45, %46, !dbg !3211 %48 = icmp eq i64 %value_phi40, %.sroa.0449.0 %or.cond = or i1 %48, %47, !dbg !3211 br i1 %or.cond, label %L194, label %guard_exit379, !dbg !3211 L194: ; preds = %L86 %49 = add i64 %value_phi41, 1, !dbg !3212 %50 = icmp ult i64 %value_phi41, 9223372036854775807, !dbg !3215 %51 = icmp sle i64 %49, %arraysize5, !dbg !3215 %52 = and i1 %50, %51, !dbg !3219 %53 = icmp ne i64 %value_phi41, %arraysize5, !dbg !3218 %value_phi95 = and i1 %53, %52, !dbg !3218 br i1 %value_phi95, label %guard_exit379, label %L1042.loopexit1, !dbg !3170 L221: ; preds = %L40 %.not540 = icmp eq i64 %arraysize5, 0, !dbg !3220 br i1 %.not540, label %L1042, label %L227, !dbg !3222 L227: ; preds = %L221 %54 = call i64 @llvm.smin.i64(i64 %24, i64 %arraysize5) #78, !dbg !3224 %.not541 = icmp eq i64 %54, 0, !dbg !3226 br i1 %.not541, label %L691.lr.ph, label %L235, !dbg !3227 L235: ; preds = %L227 %55 = trunc i64 %54 to i32, !dbg !3228 %56 = add i32 %55, -1, 
!dbg !3228 %57 = call nonnull "enzyme_inactive" {}* @julia.pointer_from_objref({} addrspace(11)* noundef addrspacecast ({}* inttoptr (i64 137517510837008 to {}*) to {} addrspace(11)*)) #83, !dbg !3232 %58 = icmp sgt i32 %56, 0, !dbg !3234 br i1 %58, label %L245, label %L691.lr.ph, !dbg !3235 L245: ; preds = %L235 %p.i = bitcast {}* %57 to i64*, !dbg !3237 %v.i = atomicrmw xchg i64* %p.i, i64 0 acq_rel, align 8, !dbg !3237 %59 = call i64 @llvm.ctpop.i64(i64 %v.i) #78, !dbg !3240, !range !2055 %60 = trunc i64 %59 to i32, !dbg !3242 %61 = sub nsw i32 %56, %60, !dbg !3243 %62 = icmp slt i32 %61, 0, !dbg !3245 br i1 %62, label %L258, label %L293, !dbg !3248 L258: ; preds = %L245 %63 = call i64 @llvm.ctlz.i64(i64 %v.i, i1 noundef false) #78, !dbg !3249, !range !2055 %64 = trunc i64 %63 to i32, !dbg !3251 br label %L261, !dbg !3252 L261: ; preds = %L261, %L258 %iv = phi i64 [ %iv.next, %L261 ], [ 0, %L258 ] %value_phi239 = phi i32 [ %64, %L258 ], [ %65, %L261 ] %value_phi240 = phi i32 [ %61, %L258 ], [ %74, %L261 ] %value_phi241 = phi i64 [ %v.i, %L258 ], [ %70, %L261 ] %iv.next = add nuw nsw i64 %iv, 1, !dbg !3257 %65 = sub i32 %value_phi239, %value_phi240, !dbg !3257 %66 = sub i32 64, %65, !dbg !3259 %67 = zext i32 %66 to i64, !dbg !3261 %68 = icmp ugt i32 %66, 63, !dbg !3261 %notmask = shl nsw i64 -1, %67, !dbg !3259 %.op = xor i64 %notmask, -1, !dbg !3259 %69 = select i1 %68, i64 -1, i64 %.op, !dbg !3259 %70 = and i64 %69, %value_phi241, !dbg !3262 %71 = xor i64 %70, %value_phi241, !dbg !3264 %72 = call i64 @llvm.ctpop.i64(i64 %71) #78, !dbg !3265, !range !2055 %73 = trunc i64 %72 to i32, !dbg !3267 %74 = add i32 %value_phi240, %73, !dbg !3268 %.not558 = icmp eq i32 %74, 0, !dbg !3269 br i1 %.not558, label %L282, label %L261, !dbg !3270 L282: ; preds = %L261 %75 = xor i64 %70, -1, !dbg !3271 %76 = and i64 %v.i, %75, !dbg !3273 store atomic i64 %76, i64* %p.i release, align 16, !dbg !3274, !noalias !3275 br label %L293, !dbg !3276 L293: ; preds = %L282, %L245 
%value_phi155 = phi i32 [ %56, %L282 ], [ %60, %L245 ] %value_phi156 = phi i64 [ %70, %L282 ], [ %v.i, %L245 ] %77 = icmp sgt i32 %value_phi155, 0, !dbg !3279 br i1 %77, label %L361.lr.ph, label %L691.lr.ph, !dbg !3280 L361.lr.ph: ; preds = %L293 %78 = zext i32 %value_phi155 to i64, !dbg !3281 %79 = add nuw nsw i64 %78, 1, !dbg !3298 %80 = udiv i64 %arraysize5, %79, !dbg !3300 %81 = mul i64 %80, %79, !dbg !3301 %82 = sub i64 %arraysize5, %81, !dbg !3303 %83 = call nonnull {}* @julia.pointer_from_objref({} addrspace(11)* noundef %7) #83, !dbg !3304 %84 = bitcast {}* %83 to i8**, !dbg !3304 %arrayptr159 = load i8*, i8** %84, align 8, !dbg !3304, !tbaa !68, !alias.scope !336, !noalias !337, !nonnull !63 %85 = ptrtoint i8* %arrayptr159 to i64, !dbg !3304 %arraysize161 = load i64, i64 addrspace(11)* %14, align 8, !dbg !3312, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %arraysize163 = load i64, i64 addrspace(11)* %15, align 16, !dbg !3312, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %86 = call nonnull {}* @julia.pointer_from_objref({} addrspace(11)* noundef %9) #83, !dbg !3318 %87 = bitcast {}* %86 to i8**, !dbg !3318 %arrayptr179 = load i8*, i8** %87, align 8, !dbg !3318, !tbaa !271, !alias.scope !254, !noalias !255, !nonnull !63 %88 = ptrtoint i8* %arrayptr179 to i64, !dbg !3318 %arraylen181 = load i64, i64 addrspace(11)* %arraylen_ptr2, align 8, !dbg !3328, !tbaa !250, !range !253, !alias.scope !254, !noalias !255 %89 = insertvalue [2 x {} addrspace(10)*] zeroinitializer, {} addrspace(10)* %0, 0, !dbg !3334 %90 = insertvalue [2 x {} addrspace(10)*] %89, {} addrspace(10)* %1, 1, !dbg !3334 %newstruct187.sroa.0.0..sroa_idx = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %3, i64 0, i32 0, i64 0, i64 0, i64 0, !dbg !3335 store i64 %.sroa.0449.0, 
i64* %newstruct187.sroa.0.0..sroa_idx, align 16, !dbg !3335, !tbaa !682, !alias.scope !2185, !noalias !3336 %newstruct187.sroa.2.sroa.0.0.newstruct187.sroa.2.0..sroa_cast.sroa_idx = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %3, i64 0, i32 1, i32 0, !dbg !3335 store i64 %85, i64* %newstruct187.sroa.2.sroa.0.0.newstruct187.sroa.2.0..sroa_cast.sroa_idx, align 8, !dbg !3335, !tbaa !682, !alias.scope !2185, !noalias !3336 %newstruct187.sroa.2.sroa.2.0.newstruct187.sroa.2.0..sroa_cast.sroa_idx415 = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %3, i64 0, i32 1, i32 1, i64 0, !dbg !3335 store i64 %arraysize161, i64* %newstruct187.sroa.2.sroa.2.0.newstruct187.sroa.2.0..sroa_cast.sroa_idx415, align 16, !dbg !3335, !tbaa !682, !alias.scope !2185, !noalias !3336 %newstruct187.sroa.2.sroa.3.0.newstruct187.sroa.2.0..sroa_cast.sroa_idx416 = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %3, i64 0, i32 1, i32 1, i64 1, !dbg !3335 store i64 %arraysize163, i64* %newstruct187.sroa.2.sroa.3.0.newstruct187.sroa.2.0..sroa_cast.sroa_idx416, align 8, !dbg !3335, !tbaa !682, !alias.scope !2185, !noalias !3336 %newstruct187.sroa.3.sroa.0.sroa.0.0.newstruct187.sroa.3.sroa.0.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 
x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %3, i64 0, i32 2, i32 0, i32 0, i32 0, !dbg !3335 store i64 %85, i64* %newstruct187.sroa.3.sroa.0.sroa.0.0.newstruct187.sroa.3.sroa.0.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx, align 16, !dbg !3335, !tbaa !682, !alias.scope !2185, !noalias !3336 %newstruct187.sroa.3.sroa.0.sroa.2.0.newstruct187.sroa.3.sroa.0.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx411 = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %3, i64 0, i32 2, i32 0, i32 0, i32 1, i64 0, !dbg !3335 store i64 %arraysize161, i64* %newstruct187.sroa.3.sroa.0.sroa.2.0.newstruct187.sroa.3.sroa.0.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx411, align 8, !dbg !3335, !tbaa !682, !alias.scope !2185, !noalias !3336 %newstruct187.sroa.3.sroa.0.sroa.3.0.newstruct187.sroa.3.sroa.0.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx412 = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %3, i64 0, i32 2, i32 0, i32 0, i32 1, i64 1, !dbg !3335 store i64 %arraysize163, i64* %newstruct187.sroa.3.sroa.0.sroa.3.0.newstruct187.sroa.3.sroa.0.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx412, align 16, !dbg !3335, !tbaa !682, !alias.scope !2185, !noalias !3336 %newstruct187.sroa.3.sroa.2.0.newstruct187.sroa.3.0..sroa_cast.sroa_idx405 = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %3, i64 0, i32 2, i32 0, i32 1, i32 0, !dbg 
!3335 store i64 %88, i64* %newstruct187.sroa.3.sroa.2.0.newstruct187.sroa.3.0..sroa_cast.sroa_idx405, align 8, !dbg !3335, !tbaa !682, !alias.scope !2185, !noalias !3336 %newstruct187.sroa.3.sroa.3.0.newstruct187.sroa.3.0..sroa_cast.sroa_idx406 = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %3, i64 0, i32 2, i32 0, i32 1, i32 1, i64 0, !dbg !3335 store i64 %arraylen181, i64* %newstruct187.sroa.3.sroa.3.0.newstruct187.sroa.3.0..sroa_cast.sroa_idx406, align 16, !dbg !3335, !tbaa !682, !alias.scope !2185, !noalias !3336 %newstruct187.sroa.3.sroa.4.sroa.0.0.newstruct187.sroa.3.sroa.4.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %3, i64 0, i32 2, i32 1, i64 0, i64 0, !dbg !3335 store i64 %.sroa.0449.0, i64* %newstruct187.sroa.3.sroa.4.sroa.0.0.newstruct187.sroa.3.sroa.4.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx, align 8, !dbg !3335, !tbaa !682, !alias.scope !2185, !noalias !3336 %newstruct187.sroa.3.sroa.4.sroa.2.0.newstruct187.sroa.3.sroa.4.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx446 = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %3, i64 0, i32 2, i32 1, i64 1, i64 0, !dbg !3335 store i64 %arraysize5, i64* %newstruct187.sroa.3.sroa.4.sroa.2.0.newstruct187.sroa.3.sroa.4.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx446, align 16, !dbg !3335, !tbaa !682, !alias.scope !2185, !noalias !3336 
%91 = call token (...) @llvm.julia.gc_preserve_begin({} addrspace(10)* nonnull %0, [2 x {} addrspace(10)*] %90) #78, !dbg !3311 %92 = icmp sgt i64 %82, -1 br label %L361, !dbg !3337 L427.preheader: ; preds = %L415 %value_phi201587 = add i64 %105, 1, !dbg !3338 %.not549588 = icmp sgt i64 %value_phi201587, %arraysize5, !dbg !3339 br i1 %.not549588, label %L641.preheader, label %L430.lr.ph, !dbg !3340 L430.lr.ph: ; preds = %L427.preheader %93 = icmp eq i64 %.sroa.0449.0, 0 %.not551 = icmp eq i64 %arraysize161, 1 %.not552 = icmp eq i64 %arraysize163, 1 %.not553 = icmp eq i64 %arraylen181, 1 %94 = add nsw i64 %.sroa.0449.0, -1, !dbg !3340 %umin600 = call i64 @llvm.umin.i64(i64 %94, i64 noundef 9223372036854775806) #78, !dbg !3340 %95 = add nuw nsw i64 %umin600, 1 %96 = add i64 %80, %value_phi193592, !dbg !3341 %umin = call i1 @llvm.umin.i1(i1 %102, i1 %92), !dbg !3340 %97 = zext i1 %umin to i64, !dbg !3340 %98 = add i64 %96, %97, !dbg !3341 br label %L430, !dbg !3340 L361: ; preds = %L415, %L361.lr.ph %iv3 = phi i64 [ %iv.next4, %L415 ], [ 0, %L361.lr.ph ] %value_phi195594 = phi i64 [ %value_phi156, %L361.lr.ph ], [ %111, %L415 ] %value_phi193592 = phi i64 [ 0, %L361.lr.ph ], [ %105, %L415 ] %value_phi192591 = phi i32 [ 0, %L361.lr.ph ], [ %107, %L415 ] %iv.next4 = add nuw nsw i64 %iv3, 1, !dbg !3342 %99 = icmp ne i64 %value_phi195594, 0, !dbg !3342 call void @llvm.assume(i1 noundef %99) #78, !dbg !3345 %100 = call i64 @llvm.cttz.i64(i64 %value_phi195594, i1 noundef true) #78, !dbg !3346, !range !2055 %101 = trunc i64 %100 to i32, !dbg !3348 %102 = icmp ugt i64 %82, %iv3, !dbg !3349 %not.ifelse_cond196 = and i1 %92, %102, !dbg !3353 %103 = zext i1 %not.ifelse_cond196 to i64, !dbg !3353 %104 = add i64 %value_phi193592, %80, !dbg !3353 %105 = add i64 %104, %103, !dbg !3354 %106 = add nuw nsw i32 %101, 1, !dbg !3355 %107 = add i32 %106, %value_phi192591, !dbg !3357 %108 = zext i32 %106 to i64, !dbg !3359 %109 = lshr i64 %value_phi195594, %108, !dbg !3359 %110 = icmp eq i32 
%101, 63, !dbg !3359 %111 = select i1 %110, i64 0, i64 %109, !dbg !3359 %112 = load i64, i64* inttoptr (i64 137517345406912 to i64*), align 64, !dbg !3361, !tbaa !131, !alias.scope !83, !noalias !86 %113 = shl i32 %107, 9, !dbg !3367 %114 = zext i32 %113 to i64, !dbg !3368 %115 = inttoptr i64 %112 to i8*, !dbg !3372 %116 = getelementptr i8, i8* %115, i64 %114, !dbg !3372 %117 = getelementptr i8, i8* %116, i64 8, !dbg !3373 %coercion = bitcast i8* %117 to i64*, !dbg !3379 store i64 ptrtoint (void (i64)* @jlcapi_BatchClosure_2456 to i64), i64* %coercion, align 1, !dbg !3379, !tbaa !81, !alias.scope !83, !noalias !3159 %118 = getelementptr i8, i8* %116, i64 16, !dbg !3383 %119 = bitcast i8* %118 to { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }**, !dbg !3387 store { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %3, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }** %119, align 1, !dbg !3387, !tbaa !81, !alias.scope !83, !noalias !3159 %120 = getelementptr i8, i8* %116, i64 24, !dbg !3391 %coercion198 = bitcast i8* %120 to i64*, !dbg !3395 store i64 %value_phi193592, i64* %coercion198, align 1, !dbg !3395, !tbaa !81, !alias.scope !83, !noalias !3159 %121 = getelementptr i8, i8* %116, i64 32, !dbg !3399 %coercion199 = bitcast i8* %121 to i64*, !dbg !3403 store i64 %105, i64* %coercion199, align 1, !dbg !3403, !tbaa !81, !alias.scope !83, !noalias !3159 %p.i386 = bitcast i8* %116 to i32*, !dbg !3407 %v.i387 = atomicrmw xchg i32* %p.i386, i32 0 acq_rel, align 4, !dbg !3407 %.not548 = icmp eq i32 %v.i387, 1, !dbg !3410 br i1 %.not548, label %L412, label %L415, !dbg !3411 L412: ; preds = %L361 call fastcc void @julia_wake_thread__2634(i32 zeroext %107) #78, !dbg !3411 br label %L415, !dbg !3411 L415: ; preds = %L412, %L361 %122 = icmp eq i64 %iv.next4, %78, !dbg !3412 br i1 %122, label 
%L427.preheader, label %L361, !dbg !3337 L641.preheader.loopexit: ; preds = %L622 br label %L641.preheader, !dbg !3414 L641.preheader: ; preds = %L641.preheader.loopexit, %L427.preheader %123 = icmp eq i64 %value_phi156, 0, !dbg !3414 br i1 %123, label %L678, label %L646.preheader, !dbg !3416 L646.preheader: ; preds = %L641.preheader br label %L646, !dbg !3417 L430: ; preds = %L622, %L430.lr.ph %iv5 = phi i64 [ %iv.next6, %L622 ], [ 0, %L430.lr.ph ] %124 = add i64 %98, %iv5, !dbg !3341 %iv.next6 = add nuw nsw i64 %iv5, 1, !dbg !3341 %125 = add i64 %value_phi201587, %iv5, !dbg !3341 br i1 %93, label %L622, label %L442.preheader, !dbg !3341 L442.preheader: ; preds = %L430 %126 = select i1 %.not552, i64 0, i64 %124 %127 = mul i64 %126, %arraysize161 %128 = mul i64 %124, %arraysize161 br label %L442, !dbg !3253 L442: ; preds = %L442, %L442.preheader %iv7 = phi i64 [ %iv.next8, %L442 ], [ 0, %L442.preheader ] %iv.next8 = add nuw nsw i64 %iv7, 1, !dbg !3420 %129 = select i1 %.not551, i64 1, i64 %iv.next8, !dbg !3420 %130 = add i64 %129, %127, !dbg !3429 %131 = shl i64 %130, 2, !dbg !3437 %132 = add i64 %131, -4, !dbg !3437 %133 = getelementptr i8, i8* %arrayptr159, i64 %132, !dbg !3440 %coercion216 = bitcast i8* %133 to float*, !dbg !3441 %pointerref = load float, float* %coercion216, align 1, !dbg !3441, !tbaa !81, !alias.scope !83, !noalias !86 %value_phi206.op = shl i64 %iv.next8, 2, !dbg !3445 %value_phi206.op.op = add i64 %value_phi206.op, -4, !dbg !3445 %134 = select i1 %.not553, i64 0, i64 %value_phi206.op.op, !dbg !3445 %135 = getelementptr i8, i8* %arrayptr179, i64 %134, !dbg !3452 %coercion219 = bitcast i8* %135 to float*, !dbg !3453 %pointerref220 = load float, float* %coercion219, align 1, !dbg !3453, !tbaa !81, !alias.scope !83, !noalias !86 %136 = fadd float %pointerref, %pointerref220, !dbg !3457 call void @llvm.lifetime.end.p0i8(i64 noundef 88, i8* noundef nonnull %.sub) #78 %137 = call fastcc float @julia_gelu_2643(float %136) #78, !dbg !3459 %138 = add 
i64 %iv.next8, %128, !dbg !3464 %139 = shl i64 %138, 2, !dbg !3472 %140 = add i64 %139, -4, !dbg !3472 %141 = getelementptr i8, i8* %arrayptr159, i64 %140, !dbg !3475 %coercion222 = bitcast i8* %141 to float*, !dbg !3476 store float %137, float* %coercion222, align 1, !dbg !3476, !tbaa !81, !alias.scope !83, !noalias !3159 %142 = add nuw nsw i64 %iv.next8, 1, !dbg !3480 %exitcond601.not = icmp eq i64 %iv.next8, %95, !dbg !3483 br i1 %exitcond601.not, label %L622.loopexit, label %L442, !dbg !3253 L622.loopexit: ; preds = %L442 br label %L622, !dbg !3338 L622: ; preds = %L622.loopexit, %L430 %value_phi201 = add i64 %125, 1, !dbg !3338 %exitcond602 = icmp eq i64 %125, %arraysize5, !dbg !3339 br i1 %exitcond602, label %L641.preheader.loopexit, label %L430, !dbg !3340 L646: ; preds = %L646.preheader, %L676 %iv9 = phi i64 [ 0, %L646.preheader ], [ %iv.next10, %L676 ] %value_phi236586 = phi i64 [ %147, %L676 ], [ %value_phi156, %L646.preheader ] %value_phi235585 = phi i32 [ %149, %L676 ], [ 0, %L646.preheader ] %iv.next10 = add nuw nsw i64 %iv9, 1, !dbg !3484 %143 = call i64 @llvm.cttz.i64(i64 %value_phi236586, i1 noundef true) #78, !dbg !3484, !range !2055 %144 = trunc i64 %143 to i32, !dbg !3486 %145 = add nuw nsw i32 %144, 1, !dbg !3487 %146 = zext i32 %145 to i64, !dbg !3489 %147 = lshr i64 %value_phi236586, %146, !dbg !3489 %148 = icmp eq i32 %144, 63, !dbg !3489 %149 = add i32 %145, %value_phi235585, !dbg !3491 %150 = load i64, i64* inttoptr (i64 137517345406912 to i64*), align 64, !dbg !3493, !tbaa !131, !alias.scope !83, !noalias !86 %151 = shl i32 %149, 9, !dbg !3496 %152 = zext i32 %151 to i64, !dbg !3497 %153 = inttoptr i64 %150 to i8*, !dbg !3501 %154 = getelementptr i8, i8* %153, i64 %152, !dbg !3501 %p.i388 = bitcast i8* %154 to i32*, !dbg !3502 %v.i389582 = load atomic i32, i32* %p.i388 acquire, align 16, !dbg !3502 %.not556583 = icmp eq i32 %v.i389582, 0, !dbg !3504 br i1 %.not556583, label %L666.preheader, label %L676, !dbg !3417 L666.preheader: ; preds = 
%L646 br label %L666, !dbg !3505 L666: ; preds = %L666.preheader, %L673 %iv11 = phi i64 [ 0, %L666.preheader ], [ %iv.next12, %L673 ] %155 = trunc i64 %iv11 to i32 %iv.next12 = add nuw nsw i64 %iv11, 1 call void @llvm.lifetime.end.p0i8(i64 noundef 88, i8* noundef nonnull %.sub) #78 call void asm sideeffect "pause", "~{memory}"() #84, !dbg !3506 %156 = add i32 %155, 1, !dbg !3508 %157 = icmp ult i32 %156, 65537, !dbg !3509 br i1 %157, label %L673, label %L670, !dbg !3505 L670: ; preds = %L666 %158 = call fastcc i8 @julia_checktask_2476(i32 zeroext %149) #78, !dbg !3511 %159 = and i8 %158, 1, !dbg !3511 %.not557 = icmp eq i8 %159, 0, !dbg !3511 br i1 %.not557, label %L673, label %L676.loopexit, !dbg !3511 L673: ; preds = %L670, %L666 %v.i389 = load atomic i32, i32* %p.i388 acquire, align 16, !dbg !3502 %.not556 = icmp eq i32 %v.i389, 0, !dbg !3504 br i1 %.not556, label %L666, label %L676.loopexit, !dbg !3417 L676.loopexit: ; preds = %L670, %L673 br label %L676, !dbg !3414 L676: ; preds = %L676.loopexit, %L646 %160 = icmp eq i64 %147, 0, !dbg !3414 %161 = select i1 %148, i1 true, i1 %160, !dbg !3414 br i1 %161, label %L678.loopexit, label %L646, !dbg !3416 L678.loopexit: ; preds = %L676 br label %L678, !dbg !3512 L678: ; preds = %L678.loopexit, %L641.preheader %v.i391 = atomicrmw or i64* %p.i, i64 %value_phi156 acq_rel, align 8, !dbg !3512 br label %L1042, !dbg !3515 L691.lr.ph: ; preds = %L293, %L235, %L227 %162 = icmp eq i64 %.sroa.0449.0, 0 %arrayptr_ptr129.phi.trans.insert = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %8, i64 0, i32 0 %163 = addrspacecast {} addrspace(10)* %1 to float addrspace(13)* addrspace(11)* %164 = add nsw i64 %.sroa.0449.0, -1, !dbg !3516 %umin597 = call i64 @llvm.umin.i64(i64 %164, i64 noundef 9223372036854775806) #78, !dbg !3516 %165 = call i64 @llvm.smax.i64(i64 %arraysize5, i64 noundef 1) #78, !dbg !3516 %166 = add nuw nsw i64 %umin597, 1 br label %L691, !dbg 
!3516 L691: ; preds = %L804, %L691.lr.ph %iv13 = phi i64 [ %iv.next14, %L804 ], [ 0, %L691.lr.ph ] %iv.next14 = add nuw nsw i64 %iv13, 1, !dbg !3517 br i1 %162, label %L804, label %L691.L703_crit_edge, !dbg !3517 L691.L703_crit_edge: ; preds = %L691 %arraysize117.pre = load i64, i64 addrspace(11)* %14, align 8, !dbg !3518, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %arrayptr130.pre = load i8 addrspace(13)*, i8 addrspace(13)* addrspace(11)* %arrayptr_ptr129.phi.trans.insert, align 16, !dbg !3527, !tbaa !68, !alias.scope !3204, !noalias !337 %value_phi106.op = add nsw i64 %iv.next14, -1 %167 = bitcast {} addrspace(10)* %0 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)*, !dbg !3517 %168 = bitcast { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)* %167 to i8 addrspace(13)* addrspace(10)*, !dbg !3517 %169 = bitcast i8 addrspace(13)* addrspace(10)* %168 to {} addrspace(10)*, !dbg !3517 br label %L703, !dbg !3517 L703: ; preds = %L703, %L691.L703_crit_edge %iv15 = phi i64 [ %iv.next16, %L703 ], [ 0, %L691.L703_crit_edge ], !dbg !3527 %nodecayed.arrayptr130 = phi {} addrspace(10)* [ %169, %L691.L703_crit_edge ], [ %190, %L703 ], !dbg !3527 %arraysize127 = phi i64 [ %arraysize117.pre, %L691.L703_crit_edge ], [ %arraysize141, %L703 ], !dbg !3527 %iv.next16 = add nuw nsw i64 %iv15, 1, !dbg !3518 %170 = bitcast {} addrspace(10)* %nodecayed.arrayptr130 to i8 addrspace(13)* addrspace(10)*, !dbg !3518 %171 = addrspacecast i8 addrspace(13)* addrspace(10)* %170 to i8 addrspace(13)* addrspace(11)*, !dbg !3518 %172 = load i8 addrspace(13)*, i8 addrspace(13)* addrspace(11)* %171, align 8, !dbg !3518 %arraysize119 = load i64, i64 addrspace(11)* %15, align 16, !dbg !3518, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %.not543 = icmp eq i64 %arraysize127, 1, !dbg !3529 %.not544 = icmp eq i64 %arraysize119, 1, !dbg !3531 %value_phi111.op = add nsw i64 %iv.next16, -1, !dbg !3527 %173 = select i1 %.not543, i64 0, i64 %value_phi111.op, !dbg !3527 
%174 = select i1 %.not544, i64 0, i64 %value_phi106.op, !dbg !3527 %175 = mul i64 %174, %arraysize127, !dbg !3527 %176 = add i64 %175, %173, !dbg !3527 %177 = bitcast i8 addrspace(13)* %172 to float addrspace(13)*, !dbg !3527 %178 = getelementptr inbounds float, float addrspace(13)* %177, i64 %176, !dbg !3527 %arrayref131 = load float, float addrspace(13)* %178, align 4, !dbg !3527, !tbaa !494, !alias.scope !83, !noalias !86 %arraylen133 = load i64, i64 addrspace(11)* %arraylen_ptr2, align 8, !dbg !3533, !tbaa !250, !range !253, !alias.scope !254, !noalias !255 %.not545 = icmp eq i64 %arraylen133, 1, !dbg !3538 %179 = select i1 %.not545, i64 0, i64 %value_phi111.op, !dbg !3540 %arrayptr138547 = load float addrspace(13)*, float addrspace(13)* addrspace(11)* %163, align 16, !dbg !3540, !tbaa !271, !alias.scope !3194, !noalias !255, !nonnull !63 %180 = getelementptr inbounds float, float addrspace(13)* %arrayptr138547, i64 %179, !dbg !3540 %arrayref139 = load float, float addrspace(13)* %180, align 4, !dbg !3540, !tbaa !494, !alias.scope !83, !noalias !86 %181 = fadd float %arrayref131, %arrayref139, !dbg !3542 %182 = call fastcc float @julia_gelu_2643(float %181) #78, !dbg !3544 %arraysize141 = load i64, i64 addrspace(11)* %14, align 8, !dbg !3549, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %183 = mul i64 %arraysize141, %value_phi106.op, !dbg !3549 %184 = add i64 %183, %value_phi111.op, !dbg !3549 %arrayptr144 = load i8 addrspace(13)*, i8 addrspace(13)* addrspace(11)* %arrayptr_ptr129.phi.trans.insert, align 16, !dbg !3549, !tbaa !68, !alias.scope !3204, !noalias !337, !nonnull !63 %185 = bitcast i8 addrspace(13)* %arrayptr144 to float addrspace(13)*, !dbg !3549 %186 = getelementptr inbounds float, float addrspace(13)* %185, i64 %184, !dbg !3549 store float %182, float addrspace(13)* %186, align 4, !dbg !3549, !tbaa !494, !alias.scope !83, !noalias !3159 %187 = add nuw nsw i64 %iv.next16, 1, !dbg !3551 %exitcond598.not = icmp eq i64 %iv.next16, %166, 
!dbg !3554 %188 = bitcast {} addrspace(10)* %0 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)*, !dbg !3277 %189 = bitcast { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)* %188 to i8 addrspace(13)* addrspace(10)*, !dbg !3277 %190 = bitcast i8 addrspace(13)* addrspace(10)* %189 to {} addrspace(10)*, !dbg !3277 br i1 %exitcond598.not, label %L804.loopexit, label %L703, !dbg !3277 L804.loopexit: ; preds = %L703 br label %L804, !dbg !3555 L804: ; preds = %L804.loopexit, %L691 %191 = add nuw nsw i64 %iv.next14, 1, !dbg !3555 %exitcond599 = icmp eq i64 %iv.next14, %165, !dbg !3558 br i1 %exitcond599, label %L1042.loopexit2, label %L691, !dbg !3516 L811: ; preds = %top %192 = addrspacecast {} addrspace(10)* %0 to {} addrspace(10)* addrspace(11)*, !dbg !3559 %arraysize_ptr246 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %192, i64 3, !dbg !3559 %193 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr246 to i64 addrspace(11)*, !dbg !3559 %arraysize247 = load i64, i64 addrspace(11)* %193, align 8, !dbg !3559, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %arraysize_ptr248 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %192, i64 4, !dbg !3559 %194 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr248 to i64 addrspace(11)*, !dbg !3559 %arraysize249 = load i64, i64 addrspace(11)* %194, align 16, !dbg !3559, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %195 = icmp eq i64 %arraysize247, 1, !dbg !3564 %196 = icmp eq i64 %arraysize249, 1, !dbg !3569 %197 = icmp ne i64 %arraysize247, %arraylen3, !dbg !3572 %198 = icmp ne i64 %arraylen3, 1, !dbg !3574 %199 = and i1 %198, %197, !dbg !3575 br i1 %199, label %L860, label %L902, !dbg !3575 L860: ; preds = %L811 call fastcc void @julia_DimensionMismatch_2470() #78, !dbg !3575 %box348 = call noalias nonnull dereferenceable(8) "enzyme_inactive" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task1, i64 noundef 8, 
{} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 137517631542736 to {}*) to {} addrspace(10)*)) #81, !dbg !3575 %200 = bitcast {} addrspace(10)* %box348 to [1 x {} addrspace(10)*] addrspace(10)*, !dbg !3575 %201 = getelementptr [1 x {} addrspace(10)*], [1 x {} addrspace(10)*] addrspace(10)* %200, i64 0, i64 0, !dbg !3575 store {} addrspace(10)* addrspacecast ({}* inttoptr (i64 137517709292320 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspace(10)* %201, align 8, !dbg !3575, !tbaa !188, !alias.scope !83, !noalias !3159 %202 = addrspacecast {} addrspace(10)* %box348 to {} addrspace(12)*, !dbg !3575 call void @ijl_throw({} addrspace(12)* %202) #82, !dbg !3575 unreachable, !dbg !3575 L902: ; preds = %L811 %203 = call nonnull {}* @julia.pointer_from_objref({} addrspace(11)* noundef %7) #83, !dbg !3578 %204 = bitcast {}* %203 to i8**, !dbg !3578 %arrayptr286 = load i8*, i8** %204, align 8, !dbg !3578, !tbaa !68, !alias.scope !336, !noalias !337, !nonnull !63 %205 = call nonnull {}* @julia.pointer_from_objref({} addrspace(11)* noundef %9) #83, !dbg !3578 %206 = bitcast {}* %205 to i8**, !dbg !3578 %arrayptr288 = load i8*, i8** %206, align 8, !dbg !3578, !tbaa !271, !alias.scope !254, !noalias !255, !nonnull !63 %.not560 = icmp eq i8* %arrayptr286, %arrayptr288, !dbg !3590 %207 = bitcast {} addrspace(10)* %1 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)*, !dbg !3582 br i1 %.not560, label %L924, label %L929, !dbg !3582 L924: ; preds = %L902 %208 = call noalias nonnull {} addrspace(10)* @ijl_array_copy({} addrspace(10)* noundef nonnull %1) #78, !dbg !3593 %.phi.trans.insert517 = addrspacecast {} addrspace(10)* %208 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %arraylen_ptr290.phi.trans.insert = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %.phi.trans.insert517, i64 0, i32 1 %arraylen291.pre = load i64, i64 addrspace(11)* 
%arraylen_ptr290.phi.trans.insert, align 8, !dbg !3595, !tbaa !250, !range !253, !alias.scope !254, !noalias !255 %209 = bitcast {} addrspace(10)* %208 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)*, !dbg !3252 br label %L929, !dbg !3252 L929: ; preds = %L924, %L902 %nodecayed..pre-phi529 = phi { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)* [ %209, %L924 ], [ %207, %L902 ], !dbg !3595 %arraylen291 = phi i64 [ %arraylen291.pre, %L924 ], [ %arraylen3, %L902 ], !dbg !3595 %210 = addrspacecast { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)* %nodecayed..pre-phi529 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)*, !dbg !3599 %211 = icmp eq i64 %arraylen291, 1, !dbg !3599 %.not561 = icmp eq i64 %arraysize249, 0, !dbg !3603 br i1 %.not561, label %L1042, label %L954.preheader, !dbg !3607 L954.preheader: ; preds = %L929 %.not562 = icmp eq i64 %arraysize247, 0 %212 = addrspacecast {} addrspace(10)* %0 to float addrspace(13)* addrspace(11)* %213 = bitcast { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %210 to float addrspace(13)* addrspace(11)* br label %L954, !dbg !3609 L954: ; preds = %L1005, %L954.preheader %iv19 = phi i64 [ %iv.next20, %L1005 ], [ 0, %L954.preheader ] %iv.next20 = add nuw nsw i64 %iv19, 1, !dbg !3609 br i1 %.not562, label %L1005, label %L963.lr.ph, !dbg !3609 L963.lr.ph: ; preds = %L954 %value_phi300.op = add nsw i64 %iv.next20, -1 %214 = select i1 %196, i64 0, i64 %value_phi300.op %arraysize309.pre = load i64, i64 addrspace(11)* %193, align 8, !dbg !3610, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %arrayptr312564.pre = load float addrspace(13)*, float addrspace(13)* addrspace(11)* %212, align 16, !dbg !3610, !tbaa !68, !alias.scope !3204, !noalias !337 %215 = bitcast {} addrspace(10)* %0 to float addrspace(13)* addrspace(10)*, !dbg !3618 %216 = bitcast float addrspace(13)* addrspace(10)* %215 to {} addrspace(10)*, !dbg !3618 br label %L963, !dbg !3618 L963: ; preds = %L963, %L963.lr.ph 
%iv21 = phi i64 [ %iv.next22, %L963 ], [ 0, %L963.lr.ph ], !dbg !3610 %nodecayed.arrayptr312564 = phi {} addrspace(10)* [ %216, %L963.lr.ph ], [ %232, %L963 ], !dbg !3610 %arraysize309 = phi i64 [ %arraysize309.pre, %L963.lr.ph ], [ %arraysize319, %L963 ], !dbg !3610 %iv.next22 = add nuw nsw i64 %iv21, 1, !dbg !3619 %217 = bitcast {} addrspace(10)* %nodecayed.arrayptr312564 to float addrspace(13)* addrspace(10)*, !dbg !3619 %218 = addrspacecast float addrspace(13)* addrspace(10)* %217 to float addrspace(13)* addrspace(11)*, !dbg !3619 %219 = load float addrspace(13)*, float addrspace(13)* addrspace(11)* %218, align 8, !dbg !3619 %220 = select i1 %195, i64 0, i64 %iv21, !dbg !3610 %221 = mul i64 %arraysize309, %214, !dbg !3610 %222 = add i64 %220, %221, !dbg !3610 %223 = getelementptr inbounds float, float addrspace(13)* %219, i64 %222, !dbg !3610 %arrayref313 = load float, float addrspace(13)* %223, align 4, !dbg !3610, !tbaa !494, !alias.scope !83, !noalias !86 %224 = select i1 %211, i64 0, i64 %iv21, !dbg !3622 %arrayptr316565 = load float addrspace(13)*, float addrspace(13)* addrspace(11)* %213, align 8, !dbg !3622, !tbaa !271, !alias.scope !3194, !noalias !255, !nonnull !63 %225 = getelementptr inbounds float, float addrspace(13)* %arrayptr316565, i64 %224, !dbg !3622 %arrayref317 = load float, float addrspace(13)* %225, align 4, !dbg !3622, !tbaa !494, !alias.scope !83, !noalias !86 %226 = fadd float %arrayref313, %arrayref317, !dbg !3626 %227 = call fastcc float @julia_gelu_2643(float %226) #78, !dbg !3628 %arraysize319 = load i64, i64 addrspace(11)* %193, align 8, !dbg !3633, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %228 = mul i64 %arraysize319, %value_phi300.op, !dbg !3633 %229 = add i64 %228, %iv21, !dbg !3633 %arrayptr322566 = load float addrspace(13)*, float addrspace(13)* addrspace(11)* %212, align 16, !dbg !3633, !tbaa !68, !alias.scope !3204, !noalias !337, !nonnull !63 %230 = getelementptr inbounds float, float addrspace(13)* 
%arrayptr322566, i64 %229, !dbg !3633 store float %227, float addrspace(13)* %230, align 4, !dbg !3633, !tbaa !494, !alias.scope !83, !noalias !3159 %exitcond.not = icmp eq i64 %iv.next22, %arraysize247, !dbg !3635 %231 = bitcast {} addrspace(10)* %0 to float addrspace(13)* addrspace(10)*, !dbg !3618 %232 = bitcast float addrspace(13)* addrspace(10)* %231 to {} addrspace(10)*, !dbg !3618 br i1 %exitcond.not, label %L1005.loopexit, label %L963, !dbg !3618, !llvm.loop !3636 L1005.loopexit: ; preds = %L963 br label %L1005, !dbg !3637 L1005: ; preds = %L1005.loopexit, %L954 %233 = add nuw nsw i64 %iv.next20, 1, !dbg !3637 %exitcond596.not = icmp eq i64 %iv.next20, %arraysize249, !dbg !3641 br i1 %exitcond596.not, label %L1042.loopexit, label %L954, !dbg !3640 L1042.loopexit: ; preds = %L1005 br label %L1042 L1042.loopexit1: ; preds = %L194 br label %L1042 L1042.loopexit2: ; preds = %L804 br label %L1042 L1042: ; preds = %L1042.loopexit2, %L1042.loopexit1, %L1042.loopexit, %L929, %L678, %L221, %L62 call void @llvm.lifetime.end.p0i8(i64 noundef 88, i8* noundef nonnull %.sub) #78 ret void, !dbg !3642 guard_exit374: ; preds = %L62 %arrayptr_ptr.phi.trans.insert = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %8, i64 0, i32 0 %arrayptr.pre = load i8 addrspace(13)*, i8 addrspace(13)* addrspace(11)* %arrayptr_ptr.phi.trans.insert, align 16, !dbg !3180, !tbaa !68, !alias.scope !3204, !noalias !337 %234 = addrspacecast {} addrspace(10)* %1 to float addrspace(13)* addrspace(11)* %235 = bitcast {} addrspace(10)* %0 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)*, !dbg !3643 %236 = bitcast { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)* %235 to i8 addrspace(13)* addrspace(10)*, !dbg !3643 %237 = bitcast i8 addrspace(13)* addrspace(10)* %236 to {} addrspace(10)*, !dbg !3643 br label %L86, !dbg !3643 guard_exit379: ; preds = %L194, %L86 %value_phi78573 = phi i64 [ %49, %L194 ], [ 
%value_phi41, %L86 ] %value_phi77572 = phi i64 [ 1, %L194 ], [ %44, %L86 ] %arraysize54.pre = load i64, i64 addrspace(11)* %15, align 16, !dbg !3183, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %arraylen64.pre = load i64, i64 addrspace(11)* %arraylen_ptr2, align 8, !dbg !3171, !tbaa !250, !range !253, !alias.scope !254, !noalias !255 %238 = bitcast {} addrspace(10)* %0 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)*, !dbg !3643 %239 = bitcast { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)* %238 to i8 addrspace(13)* addrspace(10)*, !dbg !3643 %240 = bitcast i8 addrspace(13)* addrspace(10)* %239 to {} addrspace(10)*, !dbg !3643 br label %L86, !dbg !3643 } ; Function Attrs: mustprogress willreturn define internal fastcc void @diffejulia___apply_bias_activation___2450({} addrspace(10)* align 16 dereferenceable(40) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@float, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" "enzymejl_parmtype"="137517761287184" "enzymejl_parmtype_ref"="2" %0, {} addrspace(10)* align 16 "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@float, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, 
[-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" "enzymejl_parmtype"="137517761287184" "enzymejl_parmtype_ref"="2" %"'", {} addrspace(10)* align 16 dereferenceable(40) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@float, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" "enzymejl_parmtype"="137517789463248" "enzymejl_parmtype_ref"="2" %1, {} addrspace(10)* align 16 "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@float, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}" "enzymejl_parmtype"="137517789463248" "enzymejl_parmtype_ref"="2" %"'1", { i8*, i8*, {} addrspace(10)*, {} addrspace(10)*, i64, i1, i64, i64, i64*, i64*, float*, i64*, i64, i32*, i64, i64, i1, i1, i64*, i1*, float*, i64*, i1*, i1**, i1**, i64*, i1*, float*, i64*, i64, i64, i1, i1, i64*, float*, i64*, i64*, i64* } %tapeArg) unnamed_addr #74 !dbg !4393 { top: 
%_replacementA18 = phi i8* %_replacementA17 = phi { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %.sub_replacementA = phi i8* %2 = call {}*** @julia.get_pgcstack() #78 %current_task1530_replacementA = phi {}*** %current_task1_replacementA = phi {}** %ptls_field531_replacementA = phi {}*** %_replacementA16 = phi i64*** %ptls_load532533_replacementA = phi i64** %_replacementA15 = phi i64** %safepoint_replacementA = phi i64* %_replacementA14 = phi {} addrspace(11)* , !dbg !4394 %"'ipc29" = addrspacecast {} addrspace(10)* %"'" to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)*, !dbg !4394 %_replacementA13 = phi { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* , !dbg !4394 %arraylen_ptr_replacementA = phi i64 addrspace(11)* , !dbg !4394 %arraylen_replacementA = phi i64 , !dbg !4394 %_replacementA12 = phi {} addrspace(11)* , !dbg !4407 %_replacementA11 = phi { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* , !dbg !4407 %arraylen_ptr2_replacementA = phi i64 addrspace(11)* , !dbg !4407 %arraylen3 = load i64, i64 addrspace(11)* %arraylen_ptr2_replacementA, align 8, !dbg !4407, !tbaa !250, !range !253, !alias.scope !4410, !noalias !4413 %_replacementA = phi i64 , !dbg !4415 %3 = icmp ult i64 %_replacementA, 100001, !dbg !4418 br i1 %3, label %L811, label %L7, !dbg !4406 L7: ; preds = %top %_replacementA21 = phi {} addrspace(10)* addrspace(11)* , !dbg !4420 %arraysize_ptr_replacementA = phi {} addrspace(10)* addrspace(11)* , !dbg !4420 %_replacementA20 = phi i64 addrspace(11)* , !dbg !4420 %arraysize = load i64, i64 addrspace(11)* %_replacementA20, align 8, !dbg !4420, !tbaa !68, !range !253, !alias.scope !4425, !noalias !4428 store i64 %arraysize, i64* %arraysize_cache, align 8, !dbg !4420, !tbaa !68, !invariant.group !4430 %arraysize_ptr4_replacementA = phi {} addrspace(10)* addrspace(11)* , !dbg !4420 %_replacementA19 = phi i64 addrspace(11)* , !dbg !4420 %arraysize5 = load i64, i64 
addrspace(11)* %_replacementA19, align 16, !dbg !4420, !tbaa !68, !range !253, !alias.scope !4425, !noalias !4428 store i64 %arraysize5, i64* %arraysize5_cache, align 8, !dbg !4431, !tbaa !68, !invariant.group !4437 %4 = icmp eq i64 %arraylen3, %arraysize, !dbg !4431 %5 = icmp eq i64 %arraysize, 1, !dbg !4433 %value_phi = or i1 %4, %5, !dbg !4433 br i1 %value_phi, label %L40, label %L28, !dbg !4434 L28: ; preds = %L7 %.not559_replacementA = phi i1 , !dbg !4433 br i1 %.not559_replacementA, label %L40, label %L36, !dbg !4434 L36: ; preds = %L28 %_replacementA27 = phi {} addrspace(10)* , !dbg !4434 %_replacementA26 = phi {} addrspace(10)* , !dbg !4434 %_replacementA25 = phi {} addrspace(10)* , !dbg !4434 %box_replacementA = phi {} addrspace(10)* , !dbg !4434 %_replacementA24 = phi [1 x {} addrspace(10)*] addrspace(10)* , !dbg !4434 %_replacementA23 = phi {} addrspace(10)* addrspace(10)* , !dbg !4434 %_replacementA22 = phi {} addrspace(12)* , !dbg !4434 unreachable L40: ; preds = %L28, %L7 %.sroa.0449.0 = phi i64 [ %arraylen3, %L7 ], [ %arraysize, %L28 ] %6 = call i64 @julia_nthreads_2651() #78, !dbg !4438 %.not = icmp eq i64 %6, 1, !dbg !4440 br i1 %.not, label %L62, label %L221, !dbg !4441 L62: ; preds = %L40 %7 = icmp ne i64 %.sroa.0449.0, 0, !dbg !4442 %8 = icmp ne i64 %arraysize5, 0, !dbg !4442 %.demorgan = and i1 %8, %7, !dbg !4446 br i1 %.demorgan, label %guard_exit374, label %L1042, !dbg !4446 L86: ; preds = %guard_exit379, %guard_exit374 %iv17 = phi i64 [ %iv.next18, %guard_exit379 ], [ 0, %guard_exit374 ], !dbg !4447 %arraylen64 = phi i64 [ %arraylen3, %guard_exit374 ], [ %arraylen64.pre, %guard_exit379 ], !dbg !4447 %9 = phi {} addrspace(10)* [ %"'ipc49", %guard_exit374 ], [ %"'ipc52", %guard_exit379 ], !dbg !4456 %nodecayed.arrayptr_replacementA = phi {} addrspace(10)* , !dbg !4456 %arraysize54 = phi i64 [ %arraysize5, %guard_exit374 ], [ %arraysize54.pre, %guard_exit379 ], !dbg !4459 %arraysize62 = phi i64 [ %arraysize, %guard_exit374 ], [ %arraysize72, 
%guard_exit379 ], !dbg !4456 %value_phi40 = phi i64 [ 1, %guard_exit374 ], [ %value_phi77572, %guard_exit379 ] %value_phi41 = phi i64 [ 1, %guard_exit374 ], [ %value_phi78573, %guard_exit379 ] %iv.next18 = add nuw nsw i64 %iv17, 1, !dbg !4462 %10 = load i64*, i64** %arraysize54.pre_cache, align 8, !dbg !4462 %11 = bitcast i64* %10 to i8*, !dbg !4462 %arraysize54.pre_realloccache = call i8* @__enzyme_exponentialallocationzero(i8* %11, i64 %iv.next18, i64 8), !dbg !4462 %12 = bitcast i8* %arraysize54.pre_realloccache to i64*, !dbg !4462 store i64* %12, i64** %arraysize54.pre_cache, align 8, !dbg !4462 %13 = load i64*, i64** %arraylen64.pre_cache, align 8, !dbg !4462 %14 = bitcast i64* %13 to i8*, !dbg !4462 %arraylen64.pre_realloccache = call i8* @__enzyme_exponentialallocationzero(i8* %14, i64 %iv.next18, i64 8), !dbg !4462 %15 = bitcast i8* %arraylen64.pre_realloccache to i64*, !dbg !4462 store i64* %15, i64** %arraylen64.pre_cache, align 8, !dbg !4462 %16 = load float*, float** %_cache, align 8, !dbg !4462 %17 = bitcast float* %16 to i8*, !dbg !4462 %_realloccache = call i8* @__enzyme_exponentialallocationzero(i8* %17, i64 %iv.next18, i64 4), !dbg !4462 %18 = bitcast i8* %_realloccache to float*, !dbg !4462 store float* %18, float** %_cache, align 4, !dbg !4462 %19 = load i64*, i64** %value_phi40_cache, align 8, !dbg !4462 %20 = bitcast i64* %19 to i8*, !dbg !4462 %value_phi40_realloccache = call i8* @__enzyme_exponentialallocationzero(i8* %20, i64 %iv.next18, i64 8), !dbg !4462 %21 = bitcast i8* %value_phi40_realloccache to i64*, !dbg !4462 store i64* %21, i64** %value_phi40_cache, align 8, !dbg !4462 %22 = load i64*, i64** %value_phi40_cache, align 8, !dbg !4462, !dereferenceable !880, !invariant.group !4464 %23 = getelementptr inbounds i64, i64* %22, i64 %iv17, !dbg !4462 store i64 %value_phi40, i64* %23, align 8, !dbg !4462, !invariant.group !4465 %24 = load i64*, i64** %value_phi41_cache, align 8, !dbg !4462 %25 = bitcast i64* %24 to i8*, !dbg !4462 
%value_phi41_realloccache = call i8* @__enzyme_exponentialallocationzero(i8* %25, i64 %iv.next18, i64 8), !dbg !4462 %26 = bitcast i8* %value_phi41_realloccache to i64*, !dbg !4462 store i64* %26, i64** %value_phi41_cache, align 8, !dbg !4462 %27 = load i64*, i64** %value_phi41_cache, align 8, !dbg !4462, !dereferenceable !880, !invariant.group !4466 %28 = getelementptr inbounds i64, i64* %27, i64 %iv17, !dbg !4462 store i64 %value_phi41, i64* %28, align 8, !dbg !4462, !invariant.group !4467 %29 = load i64*, i64** %arraysize72_cache, align 8, !dbg !4462 %30 = bitcast i64* %29 to i8*, !dbg !4462 %arraysize72_realloccache = call i8* @__enzyme_exponentialallocationzero(i8* %30, i64 %iv.next18, i64 8), !dbg !4462 %31 = bitcast i8* %arraysize72_realloccache to i64*, !dbg !4462 store i64* %31, i64** %arraysize72_cache, align 8, !dbg !4462 %"'ipc53" = bitcast {} addrspace(10)* %9 to i8 addrspace(13)* addrspace(10)*, !dbg !4462 %_replacementA67 = phi i8 addrspace(13)* addrspace(10)* , !dbg !4462 %"'ipc54" = addrspacecast i8 addrspace(13)* addrspace(10)* %"'ipc53" to i8 addrspace(13)* addrspace(11)*, !dbg !4462 %_replacementA66 = phi i8 addrspace(13)* addrspace(11)* , !dbg !4462 %"'ipl" = load i8 addrspace(13)*, i8 addrspace(13)* addrspace(11)* %"'ipc54", align 8, !dbg !4462, !alias.scope !4468, !noalias !4471 %_replacementA65 = phi i8 addrspace(13)* , !dbg !4462 %.not534 = icmp eq i64 %arraysize62, 1, !dbg !4462 %.not535 = icmp eq i64 %arraysize54, 1, !dbg !4473 %value_phi40.op = add i64 %value_phi40, -1, !dbg !4456 %32 = select i1 %.not534, i64 0, i64 %value_phi40.op, !dbg !4456 %value_phi41.op = add i64 %value_phi41, -1, !dbg !4456 %33 = select i1 %.not535, i64 0, i64 %value_phi41.op, !dbg !4456 %34 = mul i64 %33, %arraysize62, !dbg !4456 %35 = add i64 %34, %32, !dbg !4456 %"'ipc45" = bitcast i8 addrspace(13)* %"'ipl" to float addrspace(13)*, !dbg !4456 %_replacementA64 = phi float addrspace(13)* , !dbg !4456 %"'ipg46" = getelementptr inbounds float, float addrspace(13)* 
%"'ipc45", i64 %35, !dbg !4456 %_replacementA63 = phi float addrspace(13)* , !dbg !4456 %arrayref_replacementA = phi float , !dbg !4456 %.not536 = icmp eq i64 %arraylen64, 1, !dbg !4475 %36 = select i1 %.not536, i64 0, i64 %value_phi40.op, !dbg !4477 %"arrayptr69538'ipl" = load float addrspace(13)*, float addrspace(13)* addrspace(11)* %"'ipc40", align 16, !dbg !4477, !tbaa !271, !alias.scope !4479, !noalias !4480, !nonnull !63 %arrayptr69538_replacementA = phi float addrspace(13)* , !dbg !4477 %"'ipg39" = getelementptr inbounds float, float addrspace(13)* %"arrayptr69538'ipl", i64 %36, !dbg !4477 %_replacementA44 = phi float addrspace(13)* , !dbg !4477 %arrayref70_replacementA = phi float , !dbg !4477 %37 = fadd float %arrayref_replacementA, %arrayref70_replacementA, !dbg !4481 %_replacementA37 = phi float , !dbg !4483 %arraysize72 = load i64, i64 addrspace(11)* %_replacementA20, align 8, !dbg !4488, !tbaa !68, !range !253, !alias.scope !4425, !noalias !4428 %38 = mul i64 %arraysize72, %value_phi41.op, !dbg !4488 %39 = add i64 %38, %value_phi40.op, !dbg !4488 %"arrayptr75'ipl" = load i8 addrspace(13)*, i8 addrspace(13)* addrspace(11)* %"arrayptr_ptr.phi.trans.insert'ipg", align 16, !dbg !4488, !tbaa !68, !alias.scope !4490, !noalias !4491, !nonnull !63 %arrayptr75_replacementA = phi i8 addrspace(13)* , !dbg !4488 %"'ipc" = bitcast i8 addrspace(13)* %"arrayptr75'ipl" to float addrspace(13)*, !dbg !4488 %_replacementA35 = phi float addrspace(13)* , !dbg !4488 %"'ipg" = getelementptr inbounds float, float addrspace(13)* %"'ipc", i64 %39, !dbg !4488 %_replacementA34 = phi float addrspace(13)* , !dbg !4488 %40 = load i64*, i64** %arraysize72_cache, align 8, !dbg !4492, !dereferenceable !880, !invariant.group !4495 %41 = getelementptr inbounds i64, i64* %40, i64 %iv17, !dbg !4492 store i64 %arraysize72, i64* %41, align 8, !dbg !4492, !tbaa !68, !invariant.group !4496 %42 = load float*, float** %_cache, align 8, !dbg !4492, !dereferenceable !880, !invariant.group !4497 
%43 = getelementptr inbounds float, float* %42, i64 %iv17, !dbg !4492 store float %37, float* %43, align 4, !dbg !4492, !invariant.group !4498 %44 = add i64 %value_phi40, 1, !dbg !4492 %45 = icmp ugt i64 %value_phi40, 9223372036854775806, !dbg !4499 %46 = icmp sgt i64 %44, %.sroa.0449.0, !dbg !4499 %47 = or i1 %45, %46, !dbg !4502 %48 = icmp eq i64 %value_phi40, %.sroa.0449.0 %or.cond = or i1 %48, %47, !dbg !4502 br i1 %or.cond, label %L194, label %guard_exit379, !dbg !4502 L194: ; preds = %L86 %49 = add i64 %value_phi41, 1, !dbg !4503 %50 = icmp ult i64 %value_phi41, 9223372036854775807, !dbg !4506 %51 = icmp sle i64 %49, %arraysize5, !dbg !4506 %52 = and i1 %50, %51, !dbg !4510 %53 = icmp ne i64 %value_phi41, %arraysize5, !dbg !4509 %value_phi95 = and i1 %53, %52, !dbg !4509 br i1 %value_phi95, label %guard_exit379, label %L1042.loopexit1, !dbg !4446 L221: ; preds = %L40 %.not540 = icmp eq i64 %arraysize5, 0, !dbg !4511 br i1 %.not540, label %L1042, label %L227, !dbg !4513 L227: ; preds = %L221 %54 = call i64 @llvm.smin.i64(i64 %6, i64 %arraysize5) #78, !dbg !4515 %.not541 = icmp eq i64 %54, 0, !dbg !4517 br i1 %.not541, label %L691.lr.ph, label %L235, !dbg !4518 L235: ; preds = %L227 %55 = trunc i64 %54 to i32, !dbg !4519 %56 = add i32 %55, -1, !dbg !4519 %_replacementA68 = phi {}* , !dbg !4523 %57 = icmp sgt i32 %56, 0, !dbg !4525 br i1 %57, label %L245, label %L691.lr.ph, !dbg !4526 L245: ; preds = %L235 %p.i_replacementA = phi i64* , !dbg !4528 %v.i_replacementA = phi i64 , !dbg !4528 %58 = call i64 @llvm.ctpop.i64(i64 %v.i_replacementA) #78, !dbg !4531, !range !2055 %59 = trunc i64 %58 to i32, !dbg !4533 %60 = sub nsw i32 %56, %59, !dbg !4534 %61 = icmp slt i32 %60, 0, !dbg !4536 br i1 %61, label %L258, label %L293, !dbg !4539 L258: ; preds = %L245 %_replacementA70 = phi i64 , !dbg !4540 %_replacementA69 = phi i32 , !dbg !4542 br label %L261, !dbg !4543 L261: ; preds = %L261, %L258 %iv = phi i64 [ %iv.next, %L261 ], [ 0, %L258 ] %value_phi239_replacementA = 
phi i32 %value_phi240_replacementA = phi i32 %value_phi241_replacementA = phi i64 %iv.next = add nuw nsw i64 %iv, 1, !dbg !4548 %_replacementA79 = phi i32 , !dbg !4548 %_replacementA78 = phi i32 , !dbg !4550 %_replacementA77 = phi i64 , !dbg !4552 %_replacementA76 = phi i1 , !dbg !4552 %notmask_replacementA = phi i64 , !dbg !4550 %.op_replacementA = phi i64 , !dbg !4550 %_replacementA75 = phi i64 , !dbg !4550 %_replacementA74 = phi i64 , !dbg !4553 %_replacementA73 = phi i64 , !dbg !4555 %_replacementA72 = phi i64 , !dbg !4556 %_replacementA71 = phi i32 , !dbg !4558 %62 = add i32 %value_phi240_replacementA, %_replacementA71, !dbg !4559 %.not558 = icmp eq i32 %62, 0, !dbg !4560 br i1 %.not558, label %L282, label %L261, !dbg !4561 L282: ; preds = %L261 %_replacementA81 = phi i64 , !dbg !4562 %_replacementA80 = phi i64 , !dbg !4564 br label %L293, !dbg !4565 L293: ; preds = %L282, %L245 %value_phi155 = phi i32 [ %56, %L282 ], [ %59, %L245 ] %value_phi156 = phi i64 [ %_replacementA74, %L282 ], [ %v.i_replacementA, %L245 ] %63 = icmp sgt i32 %value_phi155, 0, !dbg !4568 br i1 %63, label %L361.lr.ph, label %L691.lr.ph, !dbg !4569 L361.lr.ph: ; preds = %L293 %64 = zext i32 %value_phi155 to i64, !dbg !4570 %65 = add nuw nsw i64 %64, 1, !dbg !4587 %66 = udiv i64 %arraysize5, %65, !dbg !4589 %67 = mul i64 %66, %65, !dbg !4590 %68 = sub i64 %arraysize5, %67, !dbg !4592 %69 = call nonnull {}* @julia.pointer_from_objref({} addrspace(11)* noundef %_replacementA14) #79, !dbg !4593 %"'ip_phi" = phi {}* , !dbg !4593 %70 = bitcast {}* %69 to i8**, !dbg !4593 %arrayptr159 = load i8*, i8** %70, align 8, !dbg !4593, !tbaa !68, !alias.scope !336, !noalias !337, !nonnull !63 %"arrayptr159'il_phi" = phi i8* , !dbg !4593 %71 = ptrtoint i8* %arrayptr159 to i64, !dbg !4593 %arraysize161 = load i64, i64 addrspace(11)* %_replacementA20, align 8, !dbg !4601, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %arraysize163 = load i64, i64 addrspace(11)* %_replacementA19, align 16, !dbg 
!4601, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %72 = call nonnull {}* @julia.pointer_from_objref({} addrspace(11)* noundef %_replacementA12) #79, !dbg !4607 %"'ip_phi2" = phi {}* , !dbg !4607 %73 = bitcast {}* %72 to i8**, !dbg !4607 %arrayptr179 = load i8*, i8** %73, align 8, !dbg !4607, !tbaa !271, !alias.scope !254, !noalias !255, !nonnull !63 %"arrayptr179'il_phi" = phi i8* , !dbg !4607 %74 = ptrtoint i8* %arrayptr179 to i64, !dbg !4607 %arraylen181 = load i64, i64 addrspace(11)* %arraylen_ptr2_replacementA, align 8, !dbg !4617, !tbaa !250, !range !253, !alias.scope !254, !noalias !255 %75 = insertvalue [2 x {} addrspace(10)*] zeroinitializer, {} addrspace(10)* %0, 0, !dbg !4623 %76 = insertvalue [2 x {} addrspace(10)*] %75, {} addrspace(10)* %1, 1, !dbg !4623 %newstruct187.sroa.0.0..sroa_idx = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %_replacementA17, i64 0, i32 0, i64 0, i64 0, i64 0, !dbg !4624 store i64 %.sroa.0449.0, i64* %newstruct187.sroa.0.0..sroa_idx, align 16, !dbg !4624, !tbaa !682, !alias.scope !2185, !noalias !4625 %newstruct187.sroa.2.sroa.0.0.newstruct187.sroa.2.0..sroa_cast.sroa_idx = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %_replacementA17, i64 0, i32 1, i32 0, !dbg !4624 store i64 %71, i64* %newstruct187.sroa.2.sroa.0.0.newstruct187.sroa.2.0..sroa_cast.sroa_idx, align 8, !dbg !4624, !tbaa !682, !alias.scope !2185, !noalias !4625 %newstruct187.sroa.2.sroa.2.0.newstruct187.sroa.2.0..sroa_cast.sroa_idx415 = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x 
[1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %_replacementA17, i64 0, i32 1, i32 1, i64 0, !dbg !4624 store i64 %arraysize161, i64* %newstruct187.sroa.2.sroa.2.0.newstruct187.sroa.2.0..sroa_cast.sroa_idx415, align 16, !dbg !4624, !tbaa !682, !alias.scope !2185, !noalias !4625 %newstruct187.sroa.2.sroa.3.0.newstruct187.sroa.2.0..sroa_cast.sroa_idx416 = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %_replacementA17, i64 0, i32 1, i32 1, i64 1, !dbg !4624 store i64 %arraysize163, i64* %newstruct187.sroa.2.sroa.3.0.newstruct187.sroa.2.0..sroa_cast.sroa_idx416, align 8, !dbg !4624, !tbaa !682, !alias.scope !2185, !noalias !4625 %newstruct187.sroa.3.sroa.0.sroa.0.0.newstruct187.sroa.3.sroa.0.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %_replacementA17, i64 0, i32 2, i32 0, i32 0, i32 0, !dbg !4624 store i64 %71, i64* %newstruct187.sroa.3.sroa.0.sroa.0.0.newstruct187.sroa.3.sroa.0.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx, align 16, !dbg !4624, !tbaa !682, !alias.scope !2185, !noalias !4625 %newstruct187.sroa.3.sroa.0.sroa.2.0.newstruct187.sroa.3.sroa.0.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx411 = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %_replacementA17, i64 0, i32 2, i32 0, i32 0, i32 1, i64 0, !dbg !4624 
store i64 %arraysize161, i64* %newstruct187.sroa.3.sroa.0.sroa.2.0.newstruct187.sroa.3.sroa.0.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx411, align 8, !dbg !4624, !tbaa !682, !alias.scope !2185, !noalias !4625 %newstruct187.sroa.3.sroa.0.sroa.3.0.newstruct187.sroa.3.sroa.0.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx412 = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %_replacementA17, i64 0, i32 2, i32 0, i32 0, i32 1, i64 1, !dbg !4624 store i64 %arraysize163, i64* %newstruct187.sroa.3.sroa.0.sroa.3.0.newstruct187.sroa.3.sroa.0.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx412, align 16, !dbg !4624, !tbaa !682, !alias.scope !2185, !noalias !4625 %newstruct187.sroa.3.sroa.2.0.newstruct187.sroa.3.0..sroa_cast.sroa_idx405 = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %_replacementA17, i64 0, i32 2, i32 0, i32 1, i32 0, !dbg !4624 store i64 %74, i64* %newstruct187.sroa.3.sroa.2.0.newstruct187.sroa.3.0..sroa_cast.sroa_idx405, align 8, !dbg !4624, !tbaa !682, !alias.scope !2185, !noalias !4625 %newstruct187.sroa.3.sroa.3.0.newstruct187.sroa.3.0..sroa_cast.sroa_idx406 = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %_replacementA17, i64 0, i32 2, i32 0, i32 1, i32 1, i64 0, !dbg !4624 store i64 %arraylen181, i64* %newstruct187.sroa.3.sroa.3.0.newstruct187.sroa.3.0..sroa_cast.sroa_idx406, align 16, !dbg !4624, !tbaa !682, !alias.scope !2185, !noalias 
!4625 %newstruct187.sroa.3.sroa.4.sroa.0.0.newstruct187.sroa.3.sroa.4.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %_replacementA17, i64 0, i32 2, i32 1, i64 0, i64 0, !dbg !4624 store i64 %.sroa.0449.0, i64* %newstruct187.sroa.3.sroa.4.sroa.0.0.newstruct187.sroa.3.sroa.4.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx, align 8, !dbg !4624, !tbaa !682, !alias.scope !2185, !noalias !4625 %newstruct187.sroa.3.sroa.4.sroa.2.0.newstruct187.sroa.3.sroa.4.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx446 = getelementptr inbounds { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %_replacementA17, i64 0, i32 2, i32 1, i64 1, i64 0, !dbg !4624 store i64 %arraysize5, i64* %newstruct187.sroa.3.sroa.4.sroa.2.0.newstruct187.sroa.3.sroa.4.0.newstruct187.sroa.3.0..sroa_cast.sroa_cast.sroa_idx446, align 16, !dbg !4624, !tbaa !682, !alias.scope !2185, !noalias !4625 %77 = call token (...) @llvm.julia.gc_preserve_begin({} addrspace(10)* nonnull %0, [2 x {} addrspace(10)*] %76) #78, !dbg !4600 %"'ip" = call token (...) 
@llvm.julia.gc_preserve_begin(), !dbg !4600 %78 = icmp sgt i64 %68, -1 %79 = add nsw i64 %64, -1, !dbg !4628 br label %L361, !dbg !4628 L427.preheader: ; preds = %L415 %.lcssa = phi i1 [ %94, %L415 ], !dbg !4629 %value_phi193592.lcssa = phi i64 [ %value_phi193592, %L415 ] %value_phi201587 = add i64 %97, 1, !dbg !4633 %.not549588 = icmp sgt i64 %value_phi201587, %arraysize5, !dbg !4634 br i1 %.not549588, label %L641.preheader, label %L430.lr.ph, !dbg !4635 L430.lr.ph: ; preds = %L427.preheader %80 = icmp eq i64 %.sroa.0449.0, 0 %.not551 = icmp eq i64 %arraysize161, 1 %.not552 = icmp eq i64 %arraysize163, 1 %.not553 = icmp eq i64 %arraylen181, 1 %81 = add nsw i64 %.sroa.0449.0, -1, !dbg !4635 %umin600 = call i64 @llvm.umin.i64(i64 %81, i64 noundef 9223372036854775806) #78, !dbg !4635 %82 = add nuw nsw i64 %umin600, 1 %83 = add i64 %66, %value_phi193592.lcssa, !dbg !4636 %umin = call i1 @llvm.umin.i1(i1 %.lcssa, i1 %78), !dbg !4635 %84 = zext i1 %umin to i64, !dbg !4635 %85 = add i64 %83, %84, !dbg !4636 %86 = add i64 %arraysize5, -1, !dbg !4635 %87 = sub i64 %86, %66, !dbg !4635 %88 = sub i64 %87, %value_phi193592.lcssa, !dbg !4635 %umin4 = call i1 @llvm.umin.i1(i1 %.lcssa, i1 %78), !dbg !4635 %89 = zext i1 %umin4 to i64, !dbg !4635 %90 = sub i64 %88, %89, !dbg !4635 br label %L430, !dbg !4635 L361: ; preds = %L415, %L361.lr.ph %iv3 = phi i64 [ %iv.next4, %L415 ], [ 0, %L361.lr.ph ] %value_phi195594 = phi i64 [ %value_phi156, %L361.lr.ph ], [ %103, %L415 ] %value_phi193592 = phi i64 [ 0, %L361.lr.ph ], [ %97, %L415 ] %value_phi192591 = phi i32 [ 0, %L361.lr.ph ], [ %99, %L415 ] %iv.next4 = add nuw nsw i64 %iv3, 1, !dbg !4637 %91 = icmp ne i64 %value_phi195594, 0, !dbg !4637 call void @llvm.assume(i1 noundef %91) #78, !dbg !4640 %92 = call i64 @llvm.cttz.i64(i64 %value_phi195594, i1 noundef true) #78, !dbg !4641, !range !2055 %93 = trunc i64 %92 to i32, !dbg !4643 %94 = icmp ugt i64 %68, %iv3, !dbg !4629 %not.ifelse_cond196 = and i1 %78, %94, !dbg !4644 %95 = zext i1 
%not.ifelse_cond196 to i64, !dbg !4644 %96 = add i64 %value_phi193592, %66, !dbg !4644 %97 = add i64 %96, %95, !dbg !4645 %98 = add nuw nsw i32 %93, 1, !dbg !4646 %99 = add i32 %98, %value_phi192591, !dbg !4648 %100 = zext i32 %98 to i64, !dbg !4650 %101 = lshr i64 %value_phi195594, %100, !dbg !4650 %102 = icmp eq i32 %93, 63, !dbg !4650 %103 = select i1 %102, i64 0, i64 %101, !dbg !4650 %104 = load i64, i64* inttoptr (i64 137517345406912 to i64*), align 64, !dbg !4652, !tbaa !131, !alias.scope !83, !noalias !86 %"'il_phi3" = phi i64 , !dbg !4658 %105 = shl i32 %99, 9, !dbg !4658 %106 = zext i32 %105 to i64, !dbg !4659 %107 = inttoptr i64 %104 to i8*, !dbg !4663 %108 = getelementptr i8, i8* %107, i64 %106, !dbg !4663 %109 = getelementptr i8, i8* %108, i64 8, !dbg !4664 %coercion = bitcast i8* %109 to i64*, !dbg !4670 store i64 ptrtoint (void (i64)* @jlcapi_BatchClosure_2456 to i64), i64* %coercion, align 1, !dbg !4670, !tbaa !81, !alias.scope !83, !noalias !4674 %110 = getelementptr i8, i8* %108, i64 16, !dbg !4675 %111 = bitcast i8* %110 to { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }**, !dbg !4679 store { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }* %_replacementA17, { [1 x [1 x [1 x i64]]], { i64, [2 x i64] }, { { { i64, [2 x i64] }, { i64, [1 x i64] } }, [2 x [1 x i64]] } }** %111, align 1, !dbg !4679, !tbaa !81, !alias.scope !83, !noalias !4674 %112 = getelementptr i8, i8* %108, i64 24, !dbg !4683 %coercion198 = bitcast i8* %112 to i64*, !dbg !4687 store i64 %value_phi193592, i64* %coercion198, align 1, !dbg !4687, !tbaa !81, !alias.scope !83, !noalias !4674 %113 = getelementptr i8, i8* %108, i64 32, !dbg !4691 %coercion199 = bitcast i8* %113 to i64*, !dbg !4695 store i64 %97, i64* %coercion199, align 1, !dbg !4695, !tbaa !81, !alias.scope !83, !noalias !4674 %p.i386 = bitcast i8* %108 to i32*, !dbg !4699 %v.i387 = atomicrmw xchg 
i32* %p.i386, i32 0 acq_rel, align 4, !dbg !4699 %.not548 = icmp eq i32 %v.i387, 1, !dbg !4702 br i1 %.not548, label %L412, label %L415, !dbg !4703 L412: ; preds = %L361 call fastcc void @julia_wake_thread__2634(i32 zeroext %99) #78, !dbg !4703 br label %L415, !dbg !4703 L415: ; preds = %L412, %L361 %114 = icmp eq i64 %iv.next4, %64, !dbg !4704 br i1 %114, label %L427.preheader, label %L361, !dbg !4628 L641.preheader.loopexit: ; preds = %L622 br label %L641.preheader, !dbg !4706 L641.preheader: ; preds = %L641.preheader.loopexit, %L427.preheader %115 = icmp eq i64 %value_phi156, 0, !dbg !4706 br i1 %115, label %L678, label %L646.preheader, !dbg !4708 L646.preheader: ; preds = %L641.preheader br label %L646, !dbg !4709 L430: ; preds = %L622, %L430.lr.ph %iv5 = phi i64 [ %iv.next6, %L622 ], [ 0, %L430.lr.ph ] %iv.next6 = add nuw nsw i64 %iv5, 1, !dbg !4636 %116 = add i64 %85, %iv5, !dbg !4636 %117 = add i64 %value_phi201587, %iv5, !dbg !4636 br i1 %80, label %L622, label %L442.preheader, !dbg !4636 L442.preheader: ; preds = %L430 %118 = select i1 %.not552, i64 0, i64 %116 %119 = mul i64 %118, %arraysize161 %120 = mul i64 %116, %arraysize161 br label %L442, !dbg !4544 L442: ; preds = %L442, %L442.preheader %iv7 = phi i64 [ %iv.next8, %L442 ], [ 0, %L442.preheader ] %iv.next8 = add nuw nsw i64 %iv7, 1, !dbg !4712 %121 = select i1 %.not551, i64 1, i64 %iv.next8, !dbg !4712 %122 = add i64 %121, %119, !dbg !4721 %123 = shl i64 %122, 2, !dbg !4729 %124 = add i64 %123, -4, !dbg !4729 %125 = getelementptr i8, i8* %arrayptr159, i64 %124, !dbg !4732 %coercion216 = bitcast i8* %125 to float*, !dbg !4733 %pointerref = load float, float* %coercion216, align 1, !dbg !4733, !tbaa !81, !alias.scope !83, !noalias !86 %value_phi206.op = shl i64 %iv.next8, 2, !dbg !4737 %value_phi206.op.op = add i64 %value_phi206.op, -4, !dbg !4737 %126 = select i1 %.not553, i64 0, i64 %value_phi206.op.op, !dbg !4737 %127 = getelementptr i8, i8* %arrayptr179, i64 %126, !dbg !4744 %coercion219 = bitcast 
i8* %127 to float*, !dbg !4745 %pointerref220 = load float, float* %coercion219, align 1, !dbg !4745, !tbaa !81, !alias.scope !83, !noalias !86 %128 = fadd float %pointerref, %pointerref220, !dbg !4749 call void @llvm.lifetime.end.p0i8(i64 noundef 88, i8* noundef nonnull %.sub_replacementA) #78 %129 = call fastcc float @julia_gelu_2643(float %128) #78, !dbg !4751 %130 = add i64 %iv.next8, %120, !dbg !4756 %131 = shl i64 %130, 2, !dbg !4764 %132 = add i64 %131, -4, !dbg !4764 %133 = getelementptr i8, i8* %arrayptr159, i64 %132, !dbg !4767 %coercion222 = bitcast i8* %133 to float*, !dbg !4768 store float %129, float* %coercion222, align 1, !dbg !4768, !tbaa !81, !alias.scope !83, !noalias !4674 %134 = add nuw nsw i64 %iv.next8, 1, !dbg !4772 %exitcond601.not = icmp eq i64 %iv.next8, %82, !dbg !4775 br i1 %exitcond601.not, label %L622.loopexit, label %L442, !dbg !4544 L622.loopexit: ; preds = %L442 br label %L622, !dbg !4633 L622: ; preds = %L622.loopexit, %L430 %value_phi201 = add i64 %117, 1, !dbg !4633 %exitcond602 = icmp eq i64 %117, %arraysize5, !dbg !4634 br i1 %exitcond602, label %L641.preheader.loopexit, label %L430, !dbg !4635 L646: ; preds = %L676, %L646.preheader %iv9 = phi i64 [ 0, %L646.preheader ], [ %iv.next10, %L676 ] %value_phi236586 = phi i64 [ %139, %L676 ], [ %value_phi156, %L646.preheader ] %value_phi235585 = phi i32 [ %141, %L676 ], [ 0, %L646.preheader ] %iv.next10 = add nuw nsw i64 %iv9, 1, !dbg !4776 %135 = call i64 @llvm.cttz.i64(i64 %value_phi236586, i1 noundef true) #78, !dbg !4776, !range !2055 %136 = trunc i64 %135 to i32, !dbg !4778 %137 = add nuw nsw i32 %136, 1, !dbg !4779 %138 = zext i32 %137 to i64, !dbg !4781 %139 = lshr i64 %value_phi236586, %138, !dbg !4781 %140 = icmp eq i32 %136, 63, !dbg !4781 %141 = add i32 %137, %value_phi235585, !dbg !4783 %142 = load i64, i64* inttoptr (i64 137517345406912 to i64*), align 64, !dbg !4785, !tbaa !131, !alias.scope !83, !noalias !86 %"'il_phi5" = phi i64 , !dbg !4788 %143 = shl i32 %141, 9, 
!dbg !4788 %144 = zext i32 %143 to i64, !dbg !4789 %145 = inttoptr i64 %142 to i8*, !dbg !4793 %146 = getelementptr i8, i8* %145, i64 %144, !dbg !4793 %p.i388 = bitcast i8* %146 to i32*, !dbg !4794 %v.i389582 = load atomic i32, i32* %p.i388 acquire, align 16, !dbg !4794 %"v.i389582'il_phi" = phi i32 , !dbg !4796 %.not556583 = icmp eq i32 %v.i389582, 0, !dbg !4796 br i1 %.not556583, label %L666.preheader, label %L676, !dbg !4709 L666.preheader: ; preds = %L646 br label %L666, !dbg !4797 L666: ; preds = %L673, %L666.preheader %iv11 = phi i64 [ 0, %L666.preheader ], [ %iv.next12, %L673 ] %iv.next12 = add nuw nsw i64 %iv11, 1 %147 = trunc i64 %iv11 to i32 call void @llvm.lifetime.end.p0i8(i64 noundef 88, i8* noundef nonnull %.sub_replacementA) #78 call void asm sideeffect "pause", "~{memory}"() #80, !dbg !4798 %148 = add i32 %147, 1, !dbg !4800 %149 = icmp ult i32 %148, 65537, !dbg !4801 br i1 %149, label %L673, label %L670, !dbg !4797 L670: ; preds = %L666 %150 = call fastcc i8 @julia_checktask_2476(i32 zeroext %141) #78, !dbg !4803 %151 = and i8 %150, 1, !dbg !4803 %.not557 = icmp eq i8 %151, 0, !dbg !4803 br i1 %.not557, label %L673, label %L676.loopexit, !dbg !4803 L673: ; preds = %L670, %L666 %v.i389 = load atomic i32, i32* %p.i388 acquire, align 16, !dbg !4794 %"v.i389'il_phi" = phi i32 , !dbg !4796 %.not556 = icmp eq i32 %v.i389, 0, !dbg !4796 br i1 %.not556, label %L666, label %L676.loopexit, !dbg !4709 L676.loopexit: ; preds = %L673, %L670 br label %L676, !dbg !4706 L676: ; preds = %L676.loopexit, %L646 %152 = icmp eq i64 %139, 0, !dbg !4706 %153 = select i1 %140, i1 true, i1 %152, !dbg !4706 br i1 %153, label %L678.loopexit, label %L646, !dbg !4708 L678.loopexit: ; preds = %L676 br label %L678, !dbg !4804 L678: ; preds = %L678.loopexit, %L641.preheader %v.i391 = atomicrmw or i64* %p.i_replacementA, i64 %value_phi156 acq_rel, align 8, !dbg !4804 br label %L1042, !dbg !4807 L691.lr.ph: ; preds = %L293, %L235, %L227 %154 = icmp eq i64 %.sroa.0449.0, 0 
%arrayptr_ptr129.phi.trans.insert = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %_replacementA13, i64 0, i32 0 %155 = addrspacecast {} addrspace(10)* %1 to float addrspace(13)* addrspace(11)* %156 = add nsw i64 %.sroa.0449.0, -1, !dbg !4808 %umin597 = call i64 @llvm.umin.i64(i64 %156, i64 noundef 9223372036854775806) #78, !dbg !4808 %157 = call i64 @llvm.smax.i64(i64 %arraysize5, i64 noundef 1) #78, !dbg !4808 %158 = add nuw nsw i64 %umin597, 1 %159 = add nsw i64 %157, -1, !dbg !4808 br label %L691, !dbg !4808 L691: ; preds = %L804, %L691.lr.ph %iv13 = phi i64 [ %iv.next14, %L804 ], [ 0, %L691.lr.ph ] %iv.next14 = add nuw nsw i64 %iv13, 1, !dbg !4809 br i1 %154, label %L804, label %L691.L703_crit_edge, !dbg !4809 L691.L703_crit_edge: ; preds = %L691 %arraysize117.pre = load i64, i64 addrspace(11)* %_replacementA20, align 8, !dbg !4810, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %arrayptr130.pre = load i8 addrspace(13)*, i8 addrspace(13)* addrspace(11)* %arrayptr_ptr129.phi.trans.insert, align 16, !dbg !4819, !tbaa !68, !alias.scope !4821, !noalias !337 %"arrayptr130.pre'il_phi" = phi i8 addrspace(13)* %value_phi106.op = add nsw i64 %iv.next14, -1 %160 = bitcast {} addrspace(10)* %0 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)*, !dbg !4809 %161 = bitcast { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)* %160 to i8 addrspace(13)* addrspace(10)*, !dbg !4809 %162 = bitcast i8 addrspace(13)* addrspace(10)* %161 to {} addrspace(10)*, !dbg !4809 br label %L703, !dbg !4809 L703: ; preds = %L703, %L691.L703_crit_edge %iv15 = phi i64 [ %iv.next16, %L703 ], [ 0, %L691.L703_crit_edge ], !dbg !4819 %nodecayed.arrayptr130 = phi {} addrspace(10)* [ %162, %L691.L703_crit_edge ], [ %183, %L703 ], !dbg !4819 %arraysize127 = phi i64 [ %arraysize117.pre, %L691.L703_crit_edge ], [ %arraysize141, %L703 ], !dbg !4819 %iv.next16 = add nuw nsw i64 %iv15, 1, !dbg !4810 %163 = 
bitcast {} addrspace(10)* %nodecayed.arrayptr130 to i8 addrspace(13)* addrspace(10)*, !dbg !4810 %164 = addrspacecast i8 addrspace(13)* addrspace(10)* %163 to i8 addrspace(13)* addrspace(11)*, !dbg !4810 %165 = load i8 addrspace(13)*, i8 addrspace(13)* addrspace(11)* %164, align 8, !dbg !4810 %"'il_phi6" = phi i8 addrspace(13)* , !dbg !4810 %arraysize119 = load i64, i64 addrspace(11)* %_replacementA19, align 16, !dbg !4810, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %.not543 = icmp eq i64 %arraysize127, 1, !dbg !4822 %.not544 = icmp eq i64 %arraysize119, 1, !dbg !4824 %value_phi111.op = add nsw i64 %iv.next16, -1, !dbg !4819 %166 = select i1 %.not543, i64 0, i64 %value_phi111.op, !dbg !4819 %167 = select i1 %.not544, i64 0, i64 %value_phi106.op, !dbg !4819 %168 = mul i64 %167, %arraysize127, !dbg !4819 %169 = add i64 %168, %166, !dbg !4819 %170 = bitcast i8 addrspace(13)* %165 to float addrspace(13)*, !dbg !4819 %171 = getelementptr inbounds float, float addrspace(13)* %170, i64 %169, !dbg !4819 %arrayref131 = load float, float addrspace(13)* %171, align 4, !dbg !4819, !tbaa !494, !alias.scope !83, !noalias !86 %arraylen133 = load i64, i64 addrspace(11)* %arraylen_ptr2_replacementA, align 8, !dbg !4826, !tbaa !250, !range !253, !alias.scope !254, !noalias !255 %.not545 = icmp eq i64 %arraylen133, 1, !dbg !4831 %172 = select i1 %.not545, i64 0, i64 %value_phi111.op, !dbg !4833 %arrayptr138547 = load float addrspace(13)*, float addrspace(13)* addrspace(11)* %155, align 16, !dbg !4833, !tbaa !271, !alias.scope !4835, !noalias !255, !nonnull !63 %"arrayptr138547'il_phi" = phi float addrspace(13)* , !dbg !4833 %173 = getelementptr inbounds float, float addrspace(13)* %arrayptr138547, i64 %172, !dbg !4833 %arrayref139 = load float, float addrspace(13)* %173, align 4, !dbg !4833, !tbaa !494, !alias.scope !83, !noalias !86 %174 = fadd float %arrayref131, %arrayref139, !dbg !4836 %175 = call fastcc float @julia_gelu_2643(float %174) #78, !dbg !4838 
%arraysize141 = load i64, i64 addrspace(11)* %_replacementA20, align 8, !dbg !4843, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %176 = mul i64 %arraysize141, %value_phi106.op, !dbg !4843 %177 = add i64 %176, %value_phi111.op, !dbg !4843 %arrayptr144 = load i8 addrspace(13)*, i8 addrspace(13)* addrspace(11)* %arrayptr_ptr129.phi.trans.insert, align 16, !dbg !4843, !tbaa !68, !alias.scope !4821, !noalias !337, !nonnull !63 %"arrayptr144'il_phi" = phi i8 addrspace(13)* , !dbg !4843 %178 = bitcast i8 addrspace(13)* %arrayptr144 to float addrspace(13)*, !dbg !4843 %179 = getelementptr inbounds float, float addrspace(13)* %178, i64 %177, !dbg !4843 store float %175, float addrspace(13)* %179, align 4, !dbg !4843, !tbaa !494, !alias.scope !83, !noalias !4674 %180 = add nuw nsw i64 %iv.next16, 1, !dbg !4845 %exitcond598.not = icmp eq i64 %iv.next16, %158, !dbg !4848 %181 = bitcast {} addrspace(10)* %0 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)*, !dbg !4566 %182 = bitcast { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)* %181 to i8 addrspace(13)* addrspace(10)*, !dbg !4566 %183 = bitcast i8 addrspace(13)* addrspace(10)* %182 to {} addrspace(10)*, !dbg !4566 br i1 %exitcond598.not, label %L804.loopexit, label %L703, !dbg !4566 L804.loopexit: ; preds = %L703 br label %L804, !dbg !4849 L804: ; preds = %L804.loopexit, %L691 %184 = add nuw nsw i64 %iv.next14, 1, !dbg !4849 %exitcond599 = icmp eq i64 %iv.next14, %157, !dbg !4852 br i1 %exitcond599, label %L1042.loopexit2, label %L691, !dbg !4808 L811: ; preds = %top %185 = addrspacecast {} addrspace(10)* %0 to {} addrspace(10)* addrspace(11)*, !dbg !4853 %arraysize_ptr246 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %185, i64 3, !dbg !4853 %186 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr246 to i64 addrspace(11)*, !dbg !4853 %arraysize247 = load i64, i64 addrspace(11)* %186, align 8, !dbg !4853, !tbaa !68, !range !253, !alias.scope !336, !noalias 
!337 %arraysize_ptr248 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %185, i64 4, !dbg !4853 %187 = bitcast {} addrspace(10)* addrspace(11)* %arraysize_ptr248 to i64 addrspace(11)*, !dbg !4853 %arraysize249 = load i64, i64 addrspace(11)* %187, align 16, !dbg !4853, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %188 = icmp eq i64 %arraysize247, 1, !dbg !4858 %189 = icmp eq i64 %arraysize249, 1, !dbg !4863 %190 = icmp ne i64 %arraysize247, %arraylen3, !dbg !4866 %191 = icmp ne i64 %arraylen3, 1, !dbg !4868 %192 = and i1 %191, %190, !dbg !4869 br i1 %192, label %L860, label %L902, !dbg !4869 L860: ; preds = %L811 call fastcc void @julia_DimensionMismatch_2470() #78, !dbg !4869 %box348 = call noalias nonnull dereferenceable(8) "enzyme_inactive" {} addrspace(10)* @julia.gc_alloc_obj({}** nonnull %current_task1_replacementA, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 137517631542736 to {}*) to {} addrspace(10)*)) #81, !dbg !4869 %193 = bitcast {} addrspace(10)* %box348 to [1 x {} addrspace(10)*] addrspace(10)*, !dbg !4869 %194 = getelementptr [1 x {} addrspace(10)*], [1 x {} addrspace(10)*] addrspace(10)* %193, i64 0, i64 0, !dbg !4869 store {} addrspace(10)* addrspacecast ({}* inttoptr (i64 137517709292320 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspace(10)* %194, align 8, !dbg !4869, !tbaa !188, !alias.scope !83, !noalias !4674 %195 = addrspacecast {} addrspace(10)* %box348 to {} addrspace(12)*, !dbg !4869 call void @ijl_throw({} addrspace(12)* %195) #82, !dbg !4869 unreachable, !dbg !4869 L902: ; preds = %L811 %196 = call nonnull {}* @julia.pointer_from_objref({} addrspace(11)* noundef %_replacementA14) #79, !dbg !4872 %"'ip_phi7" = phi {}* , !dbg !4872 %197 = bitcast {}* %196 to i8**, !dbg !4872 %arrayptr286 = load i8*, i8** %197, align 8, !dbg !4872, !tbaa !68, !alias.scope !336, !noalias !337, !nonnull !63 %"arrayptr286'il_phi" = phi i8* , !dbg !4872 %198 = call nonnull {}* 
@julia.pointer_from_objref({} addrspace(11)* noundef %_replacementA12) #79, !dbg !4872 %"'ip_phi8" = phi {}* , !dbg !4872 %199 = bitcast {}* %198 to i8**, !dbg !4872 %arrayptr288 = load i8*, i8** %199, align 8, !dbg !4872, !tbaa !271, !alias.scope !254, !noalias !255, !nonnull !63 %"arrayptr288'il_phi" = phi i8* , !dbg !4884 %.not560 = icmp eq i8* %arrayptr286, %arrayptr288, !dbg !4884 %200 = bitcast {} addrspace(10)* %1 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)*, !dbg !4876 br i1 %.not560, label %L924, label %L929, !dbg !4876 L924: ; preds = %L902 %201 = call noalias nonnull {} addrspace(10)* @ijl_array_copy({} addrspace(10)* noundef nonnull %1) #78, !dbg !4887 %"'ip_phi9" = phi {} addrspace(10)* , !dbg !4887 %.phi.trans.insert517 = addrspacecast {} addrspace(10)* %201 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %arraylen_ptr290.phi.trans.insert = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %.phi.trans.insert517, i64 0, i32 1 %arraylen291.pre = load i64, i64 addrspace(11)* %arraylen_ptr290.phi.trans.insert, align 8, !dbg !4889, !tbaa !250, !range !253, !alias.scope !254, !noalias !255 %202 = bitcast {} addrspace(10)* %201 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)*, !dbg !4543 br label %L929, !dbg !4543 L929: ; preds = %L924, %L902 %nodecayed..pre-phi529 = phi { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)* [ %202, %L924 ], [ %200, %L902 ], !dbg !4889 %arraylen291 = phi i64 [ %arraylen291.pre, %L924 ], [ %arraylen3, %L902 ], !dbg !4889 %203 = addrspacecast { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)* %nodecayed..pre-phi529 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)*, !dbg !4893 %204 = icmp eq i64 %arraylen291, 1, !dbg !4893 %.not561 = icmp eq i64 %arraysize249, 0, !dbg !4897 br i1 %.not561, label %L1042, label %L954.preheader, !dbg !4901 L954.preheader: ; preds = %L929 %.not562 = icmp eq i64 
%arraysize247, 0 %205 = addrspacecast {} addrspace(10)* %0 to float addrspace(13)* addrspace(11)* %206 = bitcast { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %203 to float addrspace(13)* addrspace(11)* %207 = add nsw i64 %arraysize249, -1, !dbg !4903 %208 = add nsw i64 %arraysize247, -1, !dbg !4903 br label %L954, !dbg !4903 L954: ; preds = %L1005, %L954.preheader %iv19 = phi i64 [ %iv.next20, %L1005 ], [ 0, %L954.preheader ] %iv.next20 = add nuw nsw i64 %iv19, 1, !dbg !4903 br i1 %.not562, label %L1005, label %L963.lr.ph, !dbg !4903 L963.lr.ph: ; preds = %L954 %value_phi300.op = add nsw i64 %iv.next20, -1 %209 = select i1 %189, i64 0, i64 %value_phi300.op %arraysize309.pre = load i64, i64 addrspace(11)* %186, align 8, !dbg !4904, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %arrayptr312564.pre = load float addrspace(13)*, float addrspace(13)* addrspace(11)* %205, align 16, !dbg !4904, !tbaa !68, !alias.scope !4821, !noalias !337 %"arrayptr312564.pre'il_phi" = phi float addrspace(13)* , !dbg !4912 %210 = bitcast {} addrspace(10)* %0 to float addrspace(13)* addrspace(10)*, !dbg !4912 %211 = bitcast float addrspace(13)* addrspace(10)* %210 to {} addrspace(10)*, !dbg !4912 br label %L963, !dbg !4912 L963: ; preds = %L963, %L963.lr.ph %iv21 = phi i64 [ %iv.next22, %L963 ], [ 0, %L963.lr.ph ], !dbg !4904 %nodecayed.arrayptr312564 = phi {} addrspace(10)* [ %211, %L963.lr.ph ], [ %227, %L963 ], !dbg !4904 %arraysize309 = phi i64 [ %arraysize309.pre, %L963.lr.ph ], [ %arraysize319, %L963 ], !dbg !4904 %iv.next22 = add nuw nsw i64 %iv21, 1, !dbg !4913 %212 = bitcast {} addrspace(10)* %nodecayed.arrayptr312564 to float addrspace(13)* addrspace(10)*, !dbg !4913 %213 = addrspacecast float addrspace(13)* addrspace(10)* %212 to float addrspace(13)* addrspace(11)*, !dbg !4913 %214 = load float addrspace(13)*, float addrspace(13)* addrspace(11)* %213, align 8, !dbg !4913 %"'il_phi10" = phi float addrspace(13)* , !dbg !4904 %215 = select i1 %188, i64 0, i64 
%iv21, !dbg !4904 %216 = mul i64 %arraysize309, %209, !dbg !4904 %217 = add i64 %215, %216, !dbg !4904 %218 = getelementptr inbounds float, float addrspace(13)* %214, i64 %217, !dbg !4904 %arrayref313 = load float, float addrspace(13)* %218, align 4, !dbg !4904, !tbaa !494, !alias.scope !83, !noalias !86 %219 = select i1 %204, i64 0, i64 %iv21, !dbg !4916 %arrayptr316565 = load float addrspace(13)*, float addrspace(13)* addrspace(11)* %206, align 8, !dbg !4916, !tbaa !271, !alias.scope !4835, !noalias !255, !nonnull !63 %"arrayptr316565'il_phi" = phi float addrspace(13)* , !dbg !4916 %220 = getelementptr inbounds float, float addrspace(13)* %arrayptr316565, i64 %219, !dbg !4916 %arrayref317 = load float, float addrspace(13)* %220, align 4, !dbg !4916, !tbaa !494, !alias.scope !83, !noalias !86 %221 = fadd float %arrayref313, %arrayref317, !dbg !4920 %222 = call fastcc float @julia_gelu_2643(float %221) #78, !dbg !4922 %arraysize319 = load i64, i64 addrspace(11)* %186, align 8, !dbg !4927, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %223 = mul i64 %arraysize319, %value_phi300.op, !dbg !4927 %224 = add i64 %223, %iv21, !dbg !4927 %arrayptr322566 = load float addrspace(13)*, float addrspace(13)* addrspace(11)* %205, align 16, !dbg !4927, !tbaa !68, !alias.scope !4821, !noalias !337, !nonnull !63 %"arrayptr322566'il_phi" = phi float addrspace(13)* , !dbg !4927 %225 = getelementptr inbounds float, float addrspace(13)* %arrayptr322566, i64 %224, !dbg !4927 store float %222, float addrspace(13)* %225, align 4, !dbg !4927, !tbaa !494, !alias.scope !83, !noalias !4674 %exitcond.not = icmp eq i64 %iv.next22, %arraysize247, !dbg !4929 %226 = bitcast {} addrspace(10)* %0 to float addrspace(13)* addrspace(10)*, !dbg !4912 %227 = bitcast float addrspace(13)* addrspace(10)* %226 to {} addrspace(10)*, !dbg !4912 br i1 %exitcond.not, label %L1005.loopexit, label %L963, !dbg !4912, !llvm.loop !4930 L1005.loopexit: ; preds = %L963 br label %L1005, !dbg !4931 L1005: ; 
preds = %L1005.loopexit, %L954 %228 = add nuw nsw i64 %iv.next20, 1, !dbg !4931 %exitcond596.not = icmp eq i64 %iv.next20, %arraysize249, !dbg !4935 br i1 %exitcond596.not, label %L1042.loopexit, label %L954, !dbg !4934 L1042.loopexit: ; preds = %L1005 br label %L1042 L1042.loopexit1: ; preds = %L194 %229 = phi i64 [ %iv17, %L194 ] store i64 %229, i64* %loopLimit_cache, align 8, !invariant.group !4936 br label %L1042 L1042.loopexit2: ; preds = %L804 br label %L1042 L1042: ; preds = %L1042.loopexit2, %L1042.loopexit1, %L1042.loopexit, %L929, %L678, %L221, %L62 call void @llvm.lifetime.end.p0i8(i64 noundef 88, i8* noundef nonnull %.sub_replacementA) #78 br label %invertL1042, !dbg !4937 guard_exit374: ; preds = %L62 %"arrayptr_ptr.phi.trans.insert'ipg" = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %"'ipc29", i64 0, i32 0 %arrayptr_ptr.phi.trans.insert = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %_replacementA13, i64 0, i32 0 %arrayptr.pre = load i8 addrspace(13)*, i8 addrspace(13)* addrspace(11)* %arrayptr_ptr.phi.trans.insert, align 16, !dbg !4456, !tbaa !68, !alias.scope !4821, !noalias !337 %"arrayptr.pre'il_phi" = phi i8 addrspace(13)* %"'ipc40" = addrspacecast {} addrspace(10)* %"'1" to float addrspace(13)* addrspace(11)* %230 = addrspacecast {} addrspace(10)* %1 to float addrspace(13)* addrspace(11)* %"'ipc47" = bitcast {} addrspace(10)* %"'" to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)*, !dbg !4938 %231 = bitcast {} addrspace(10)* %0 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)*, !dbg !4938 %"'ipc48" = bitcast { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)* %"'ipc47" to i8 addrspace(13)* addrspace(10)*, !dbg !4938 %232 = bitcast { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)* %231 to i8 addrspace(13)* addrspace(10)*, !dbg !4938 %"'ipc49" = bitcast i8 
addrspace(13)* addrspace(10)* %"'ipc48" to {} addrspace(10)*, !dbg !4938 %233 = bitcast i8 addrspace(13)* addrspace(10)* %232 to {} addrspace(10)*, !dbg !4938 store i64* null, i64** %arraysize72_cache, align 8, !dbg !4938 store i64* null, i64** %value_phi41_cache, align 8, !dbg !4938 store i64* null, i64** %value_phi40_cache, align 8, !dbg !4938 store float* null, float** %_cache, align 8, !dbg !4938 store i64* null, i64** %arraylen64.pre_cache, align 8, !dbg !4938 store i64* null, i64** %arraysize54.pre_cache, align 8, !dbg !4938 br label %L86, !dbg !4938 guard_exit379: ; preds = %L194, %L86 %value_phi78573 = phi i64 [ %49, %L194 ], [ %value_phi41, %L86 ] %value_phi77572 = phi i64 [ 1, %L194 ], [ %44, %L86 ] %arraysize54.pre = load i64, i64 addrspace(11)* %_replacementA19, align 16, !dbg !4459, !tbaa !68, !range !253, !alias.scope !336, !noalias !337 %arraylen64.pre = load i64, i64 addrspace(11)* %arraylen_ptr2_replacementA, align 8, !dbg !4447, !tbaa !250, !range !253, !alias.scope !254, !noalias !255 %234 = load i64*, i64** %arraylen64.pre_cache, align 8, !dbg !4938, !dereferenceable !880, !invariant.group !4939 %235 = getelementptr inbounds i64, i64* %234, i64 %iv17, !dbg !4938 store i64 %arraylen64.pre, i64* %235, align 8, !dbg !4938, !tbaa !250, !invariant.group !4940 %236 = load i64*, i64** %arraysize54.pre_cache, align 8, !dbg !4938, !dereferenceable !880, !invariant.group !4941 %237 = getelementptr inbounds i64, i64* %236, i64 %iv17, !dbg !4938 store i64 %arraysize54.pre, i64* %237, align 8, !dbg !4938, !tbaa !68, !invariant.group !4942 %"'ipc50" = bitcast {} addrspace(10)* %"'" to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)*, !dbg !4938 %238 = bitcast {} addrspace(10)* %0 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)*, !dbg !4938 %"'ipc51" = bitcast { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)* %"'ipc50" to i8 addrspace(13)* addrspace(10)*, !dbg !4938 %239 = bitcast { i8 addrspace(13)*, i64, i16, i16, i32 } 
addrspace(10)* %238 to i8 addrspace(13)* addrspace(10)*, !dbg !4938 %"'ipc52" = bitcast i8 addrspace(13)* addrspace(10)* %"'ipc51" to {} addrspace(10)*, !dbg !4938 %240 = bitcast i8 addrspace(13)* addrspace(10)* %239 to {} addrspace(10)*, !dbg !4938 br label %L86, !dbg !4938 allocsForInversion: ; No predecessors! %"iv17'ac" = alloca i64, align 8 %loopLimit_cache = alloca i64, align 8 %"iv'ac" = alloca i64, align 8 %"iv3'ac" = alloca i64, align 8 %"iv5'ac" = alloca i64, align 8 %"iv7'ac" = alloca i64, align 8 %"iv9'ac" = alloca i64, align 8 %"iv11'ac" = alloca i64, align 8 %"iv13'ac" = alloca i64, align 8 %"iv15'ac" = alloca i64, align 8 %"iv19'ac" = alloca i64, align 8 %"iv21'ac" = alloca i64, align 8 %arraysize_cache = alloca i64, align 8 %arraysize72_cache = alloca i64*, align 8 %value_phi41_cache = alloca i64*, align 8 %value_phi40_cache = alloca i64*, align 8 %"'de" = alloca float, align 4 %241 = getelementptr float, float* %"'de", i64 0 store float 0.000000e+00, float* %241, align 4 %_cache = alloca float*, align 8 %"'de38" = alloca float, align 4 %242 = getelementptr float, float* %"'de38", i64 0 store float 0.000000e+00, float* %242, align 4 %"arrayref'de" = alloca float, align 4 %243 = getelementptr float, float* %"arrayref'de", i64 0 store float 0.000000e+00, float* %243, align 4 %"arrayref70'de" = alloca float, align 4 %244 = getelementptr float, float* %"arrayref70'de", i64 0 store float 0.000000e+00, float* %244, align 4 %arraylen64.pre_cache = alloca i64*, align 8 %arraysize54.pre_cache = alloca i64*, align 8 %arraysize5_cache = alloca i64, align 8 inverttop: ; preds = %invertL7 fence syncscope("singlethread") seq_cst fence syncscope("singlethread") seq_cst ret void invertL7: ; preds = %invertL40, %invertL28 br label %inverttop invertL28: ; preds = %invertL40 br label %invertL7 invertL36: ; No predecessors! 
invertL40: ; preds = %invertL221, %invertL62 %245 = load i64, i64* %arraysize_cache, align 8, !dbg !4420, !tbaa !68, !range !253, !alias.scope !4425, !noalias !4428, !invariant.group !4430 %_unwrap = icmp eq i64 %arraylen3, %245 %_unwrap28 = icmp eq i64 %245, 1 %value_phi_unwrap = or i1 %_unwrap, %_unwrap28 br i1 %value_phi_unwrap, label %invertL7, label %invertL28 invertL62: ; No predecessors! br label %invertL40 invertL86: ; preds = %invertL194 %246 = load i64, i64* %"iv17'ac", align 8, !dbg !4488 %"arrayptr_ptr.phi.trans.insert'ipg_unwrap" = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %"'ipc29", i64 0, i32 0, !dbg !4488 %"arrayptr75'il_phi_unwrap" = load i8 addrspace(13)*, i8 addrspace(13)* addrspace(11)* %"arrayptr_ptr.phi.trans.insert'ipg_unwrap", align 16, !dbg !4488, !tbaa !68, !alias.scope !4490, !noalias !4491, !nonnull !63 %"'ipc_unwrap" = bitcast i8 addrspace(13)* %"arrayptr75'il_phi_unwrap" to float addrspace(13)*, !dbg !4488 %247 = load i64*, i64** %arraysize72_cache, align 8, !dbg !4488, !dereferenceable !880, !invariant.group !4495 %248 = getelementptr inbounds i64, i64* %247, i64 %246, !dbg !4488 %249 = load i64, i64* %248, align 8, !dbg !4488, !tbaa !68, !range !253, !alias.scope !336, !noalias !337, !invariant.group !4496 %250 = load i64*, i64** %value_phi41_cache, align 8, !dbg !4488, !dereferenceable !880, !invariant.group !4466 %251 = getelementptr inbounds i64, i64* %250, i64 %246, !dbg !4488 %252 = load i64, i64* %251, align 8, !dbg !4488, !invariant.group !4467 %value_phi41.op_unwrap = add i64 %252, -1, !dbg !4488 %_unwrap31 = mul i64 %249, %value_phi41.op_unwrap, !dbg !4488 %253 = load i64*, i64** %value_phi40_cache, align 8, !dbg !4488, !dereferenceable !880, !invariant.group !4464 %254 = getelementptr inbounds i64, i64* %253, i64 %246, !dbg !4488 %255 = load i64, i64* %254, align 8, !dbg !4488, !invariant.group !4465 %value_phi40.op_unwrap = add i64 %255, -1, 
!dbg !4488 %_unwrap33 = add i64 %_unwrap31, %value_phi40.op_unwrap, !dbg !4488 %"'ipg_unwrap" = getelementptr inbounds float, float addrspace(13)* %"'ipc_unwrap", i64 %_unwrap33, !dbg !4488 %256 = load float, float addrspace(13)* %"'ipg_unwrap", align 4, !dbg !4488, !tbaa !494, !alias.scope !4943, !noalias !4946 store float 0.000000e+00, float addrspace(13)* %"'ipg_unwrap", align 4, !dbg !4488, !tbaa !494, !alias.scope !4943, !noalias !4946 %257 = load float, float* %"'de", align 4, !dbg !4488 %258 = fadd fast float %257, %256, !dbg !4488 store float %258, float* %"'de", align 4, !dbg !4488 %259 = load i64, i64* %"iv17'ac", align 8, !dbg !4483 %260 = load float*, float** %_cache, align 8, !dbg !4483, !dereferenceable !880, !invariant.group !4497 %261 = getelementptr inbounds float, float* %260, i64 %259, !dbg !4483 %262 = load float, float* %261, align 4, !dbg !4483, !invariant.group !4498 %263 = load float, float* %"'de", align 4, !dbg !4483 %264 = call fastcc { float } @diffejulia_gelu_2643(float %262, float %263), !dbg !4483 %265 = extractvalue { float } %264, 0, !dbg !4483 %266 = load float, float* %"'de38", align 4, !dbg !4483 %267 = fadd fast float %266, %265, !dbg !4483 store float %267, float* %"'de38", align 4, !dbg !4483 store float 0.000000e+00, float* %"'de", align 4, !dbg !4483 %268 = load float, float* %"'de38", align 4, !dbg !4481 store float 0.000000e+00, float* %"'de38", align 4, !dbg !4481 %269 = load float, float* %"arrayref'de", align 4, !dbg !4481 %270 = fadd fast float %269, %268, !dbg !4481 store float %270, float* %"arrayref'de", align 4, !dbg !4481 %271 = load float, float* %"arrayref70'de", align 4, !dbg !4481 %272 = fadd fast float %271, %268, !dbg !4481 store float %272, float* %"arrayref70'de", align 4, !dbg !4481 %273 = load float, float* %"arrayref70'de", align 4, !dbg !4477 store float 0.000000e+00, float* %"arrayref70'de", align 4, !dbg !4477 %274 = load i64, i64* %"iv17'ac", align 8, !dbg !4477 %"'ipc40_unwrap" = addrspacecast {} 
addrspace(10)* %"'1" to float addrspace(13)* addrspace(11)*, !dbg !4477 %"arrayptr69538'il_phi_unwrap" = load float addrspace(13)*, float addrspace(13)* addrspace(11)* %"'ipc40_unwrap", align 16, !dbg !4477, !tbaa !271, !alias.scope !4479, !noalias !4480, !nonnull !63 %275 = icmp ne i64 %274, 0, !dbg !4477 br i1 %275, label %invertL86_phirc, label %invertL86_phirc42, !dbg !4477 invertL86_phirc: ; preds = %invertL86 %276 = sub nuw i64 %274, 1 %277 = load i64*, i64** %arraylen64.pre_cache, align 8, !dereferenceable !880, !invariant.group !4939 %278 = getelementptr inbounds i64, i64* %277, i64 %276 %279 = load i64, i64* %278, align 8, !dbg !4447, !tbaa !250, !range !253, !alias.scope !254, !noalias !255, !invariant.group !4940 br label %invertL86_phimerge invertL86_phirc42: ; preds = %invertL86 br label %invertL86_phimerge invertL86_phimerge: ; preds = %invertL86_phirc42, %invertL86_phirc %280 = phi i64 [ %279, %invertL86_phirc ], [ %arraylen3, %invertL86_phirc42 ], !dbg !4477 %.not536_unwrap = icmp eq i64 %280, 1, !dbg !4477 %_unwrap43 = select i1 %.not536_unwrap, i64 0, i64 %value_phi40.op_unwrap, !dbg !4477 %"'ipg39_unwrap" = getelementptr inbounds float, float addrspace(13)* %"arrayptr69538'il_phi_unwrap", i64 %_unwrap43, !dbg !4477 %281 = atomicrmw fadd float addrspace(13)* %"'ipg39_unwrap", float %273 monotonic, align 4, !dbg !4477 %282 = load float, float* %"arrayref'de", align 4, !dbg !4456 store float 0.000000e+00, float* %"arrayref'de", align 4, !dbg !4456 %283 = load i64, i64* %"iv17'ac", align 8, !dbg !4456 %"'ipc47_unwrap" = bitcast {} addrspace(10)* %"'" to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)*, !dbg !4456 %"'ipc48_unwrap" = bitcast { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)* %"'ipc47_unwrap" to i8 addrspace(13)* addrspace(10)*, !dbg !4456 %"'ipc49_unwrap" = bitcast i8 addrspace(13)* addrspace(10)* %"'ipc48_unwrap" to {} addrspace(10)*, !dbg !4456 %284 = icmp ne i64 %283, 0, !dbg !4456 br i1 %284, label 
%invertL86_phimerge_phirc, label %invertL86_phimerge_phirc55, !dbg !4456 invertL86_phimerge_phirc: ; preds = %invertL86_phimerge %285 = sub nuw i64 %283, 1 %"'ipc50_unwrap" = bitcast {} addrspace(10)* %"'" to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)* %"'ipc51_unwrap" = bitcast { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)* %"'ipc50_unwrap" to i8 addrspace(13)* addrspace(10)* %"'ipc52_unwrap" = bitcast i8 addrspace(13)* addrspace(10)* %"'ipc51_unwrap" to {} addrspace(10)* br label %invertL86_phimerge_phimerge invertL86_phimerge_phirc55: ; preds = %invertL86_phimerge br label %invertL86_phimerge_phimerge invertL86_phimerge_phimerge: ; preds = %invertL86_phimerge_phirc55, %invertL86_phimerge_phirc %286 = phi {} addrspace(10)* [ %"'ipc52_unwrap", %invertL86_phimerge_phirc ], [ %"'ipc49_unwrap", %invertL86_phimerge_phirc55 ], !dbg !4456 %"'ipc53_unwrap" = bitcast {} addrspace(10)* %286 to i8 addrspace(13)* addrspace(10)*, !dbg !4456 %"'ipc54_unwrap" = addrspacecast i8 addrspace(13)* addrspace(10)* %"'ipc53_unwrap" to i8 addrspace(13)* addrspace(11)*, !dbg !4456 %"'il_phi_unwrap" = load i8 addrspace(13)*, i8 addrspace(13)* addrspace(11)* %"'ipc54_unwrap", align 8, !dbg !4462, !alias.scope !4468, !noalias !4471 %"'ipc45_unwrap" = bitcast i8 addrspace(13)* %"'il_phi_unwrap" to float addrspace(13)*, !dbg !4456 %287 = icmp ne i64 %283, 0, !dbg !4456 br i1 %287, label %invertL86_phimerge_phimerge_phirc, label %invertL86_phimerge_phimerge_phirc57, !dbg !4456 invertL86_phimerge_phimerge_phirc: ; preds = %invertL86_phimerge_phimerge %288 = sub nuw i64 %283, 1 %289 = load i64*, i64** %arraysize54.pre_cache, align 8, !dereferenceable !880, !invariant.group !4941 %290 = getelementptr inbounds i64, i64* %289, i64 %288 %291 = load i64, i64* %290, align 8, !dbg !4459, !tbaa !68, !range !253, !alias.scope !336, !noalias !337, !invariant.group !4942 br label %invertL86_phimerge_phimerge_phimerge invertL86_phimerge_phimerge_phirc57: ; preds = 
%invertL86_phimerge_phimerge %292 = load i64, i64* %arraysize5_cache, align 8, !dbg !4420, !tbaa !68, !range !253, !alias.scope !4425, !noalias !4428, !invariant.group !4437 br label %invertL86_phimerge_phimerge_phimerge invertL86_phimerge_phimerge_phimerge: ; preds = %invertL86_phimerge_phimerge_phirc57, %invertL86_phimerge_phimerge_phirc %293 = phi i64 [ %291, %invertL86_phimerge_phimerge_phirc ], [ %292, %invertL86_phimerge_phimerge_phirc57 ], !dbg !4456 %.not535_unwrap = icmp eq i64 %293, 1, !dbg !4456 %_unwrap58 = select i1 %.not535_unwrap, i64 0, i64 %value_phi41.op_unwrap, !dbg !4456 %294 = icmp ne i64 %283, 0, !dbg !4456 br i1 %294, label %invertL86_phimerge_phimerge_phimerge_phirc, label %invertL86_phimerge_phimerge_phimerge_phirc59, !dbg !4456 invertL86_phimerge_phimerge_phimerge_phirc: ; preds = %invertL86_phimerge_phimerge_phimerge %295 = sub nuw i64 %283, 1 %296 = load i64*, i64** %arraysize72_cache, align 8, !dereferenceable !880, !invariant.group !4495 %297 = getelementptr inbounds i64, i64* %296, i64 %295 %298 = load i64, i64* %297, align 8, !dbg !4488, !tbaa !68, !range !253, !alias.scope !4425, !noalias !4428, !invariant.group !4496 br label %invertL86_phimerge_phimerge_phimerge_phimerge invertL86_phimerge_phimerge_phimerge_phirc59: ; preds = %invertL86_phimerge_phimerge_phimerge %299 = load i64, i64* %arraysize_cache, align 8, !dbg !4420, !tbaa !68, !range !253, !alias.scope !4425, !noalias !4428, !invariant.group !4430 br label %invertL86_phimerge_phimerge_phimerge_phimerge invertL86_phimerge_phimerge_phimerge_phimerge: ; preds = %invertL86_phimerge_phimerge_phimerge_phirc59, %invertL86_phimerge_phimerge_phimerge_phirc %300 = phi i64 [ %298, %invertL86_phimerge_phimerge_phimerge_phirc ], [ %299, %invertL86_phimerge_phimerge_phimerge_phirc59 ], !dbg !4456 %_unwrap60 = mul i64 %_unwrap58, %300, !dbg !4456 %.not534_unwrap = icmp eq i64 %300, 1, !dbg !4456 %_unwrap61 = select i1 %.not534_unwrap, i64 0, i64 %value_phi40.op_unwrap, !dbg !4456 
%_unwrap62 = add i64 %_unwrap60, %_unwrap61, !dbg !4456 %"'ipg46_unwrap" = getelementptr inbounds float, float addrspace(13)* %"'ipc45_unwrap", i64 %_unwrap62, !dbg !4456 %301 = atomicrmw fadd float addrspace(13)* %"'ipg46_unwrap", float %282 monotonic, align 4, !dbg !4456 %302 = load i64, i64* %"iv17'ac", align 8 %303 = icmp eq i64 %302, 0 %304 = xor i1 %303, true br i1 %303, label %invertguard_exit374, label %incinvertL86 incinvertL86: ; preds = %invertL86_phimerge_phimerge_phimerge_phimerge %305 = load i64, i64* %"iv17'ac", align 8 %306 = add nsw i64 %305, -1 store i64 %306, i64* %"iv17'ac", align 8 br label %invertguard_exit379 invertL194: ; No predecessors! br label %invertL86 invertL221: ; preds = %invertL227 br label %invertL40 invertL227: ; preds = %invertL235 br label %invertL221 invertL235: ; preds = %invertL245 br label %invertL227 invertL245: ; preds = %invertL258 br label %invertL235 invertL258: ; preds = %invertL261 br label %invertL245 invertL261: ; preds = %mergeinvertL261_L282, %incinvertL261 %307 = load i64, i64* %"iv'ac", align 8 %308 = icmp eq i64 %307, 0 %309 = xor i1 %308, true br i1 %308, label %invertL258, label %incinvertL261 incinvertL261: ; preds = %invertL261 %310 = load i64, i64* %"iv'ac", align 8 %311 = add nsw i64 %310, -1 store i64 %311, i64* %"iv'ac", align 8 br label %invertL261 invertL282: ; No predecessors! br label %mergeinvertL261_L282 mergeinvertL261_L282: ; preds = %invertL282 store i64 0, i64* %"iv'ac", align 8 br label %invertL261 invertL293: ; No predecessors! %312 = call i64 @julia_nthreads_2651() #78, !dbg !4438 %313 = load i64, i64* %arraysize5_cache, align 8, !dbg !4420, !tbaa !68, !range !253, !alias.scope !4425, !noalias !4428, !invariant.group !4437 %314 = call i64 @llvm.smin.i64(i64 %312, i64 %313) #78, !dbg !4515 %_unwrap82 = trunc i64 %314 to i32 %_unwrap83 = add i32 %_unwrap82, -1 invertL361.lr.ph: ; No predecessors! invertL427.preheader: ; No predecessors! invertL430.lr.ph: ; No predecessors! 
invertL361: ; No predecessors! invertL412: ; No predecessors! invertL415: ; No predecessors! invertL641.preheader.loopexit: ; No predecessors! invertL641.preheader: ; No predecessors! invertL646.preheader: ; No predecessors! invertL430: ; No predecessors! invertL442.preheader: ; No predecessors! invertL442: ; No predecessors! invertL622.loopexit: ; No predecessors! invertL622: ; No predecessors! invertL646: ; No predecessors! invertL666.preheader: ; No predecessors! invertL666: ; No predecessors! invertL670: ; No predecessors! invertL673: ; No predecessors! invertL676.loopexit: ; No predecessors! invertL676: ; No predecessors! invertL678.loopexit: ; No predecessors! invertL678: ; No predecessors! invertL691.lr.ph: ; No predecessors! invertL691: ; No predecessors! invertL691.L703_crit_edge: ; No predecessors! invertL703: ; No predecessors! invertL804.loopexit: ; No predecessors! invertL804: ; No predecessors! invertL811: ; No predecessors! invertL860: ; No predecessors! invertL902: ; No predecessors! invertL924: ; No predecessors! invertL929: ; No predecessors! invertL954.preheader: ; No predecessors! invertL954: ; No predecessors! invertL963.lr.ph: ; No predecessors! invertL963: ; No predecessors! invertL1005.loopexit: ; No predecessors! invertL1005: ; No predecessors! invertL1042.loopexit: ; No predecessors! invertL1042.loopexit1: ; No predecessors! invertL1042.loopexit2: ; No predecessors! 
invertL1042: ; preds = %L1042 invertguard_exit374: ; preds = %invertL86_phimerge_phimerge_phimerge_phimerge %315 = load i64, i64* %"iv17'ac", align 8 %forfree = load i64*, i64** %arraysize72_cache, align 8, !dereferenceable !880, !invariant.group !4495 %316 = bitcast i64* %forfree to i8* call void @free(i8* nonnull %316), !dbg !4948 %317 = load i64, i64* %"iv17'ac", align 8 %forfree30 = load i64*, i64** %value_phi41_cache, align 8, !dereferenceable !880, !invariant.group !4466 %318 = bitcast i64* %forfree30 to i8* call void @free(i8* nonnull %318), !dbg !4948 %319 = load i64, i64* %"iv17'ac", align 8 %forfree32 = load i64*, i64** %value_phi40_cache, align 8, !dereferenceable !880, !invariant.group !4464 %320 = bitcast i64* %forfree32 to i8* call void @free(i8* nonnull %320), !dbg !4948 %321 = load i64, i64* %"iv17'ac", align 8 %forfree36 = load float*, float** %_cache, align 8, !dereferenceable !4949, !invariant.group !4497 %322 = bitcast float* %forfree36 to i8* call void @free(i8* nonnull %322), !dbg !4948 %323 = load i64, i64* %"iv17'ac", align 8 %forfree41 = load i64*, i64** %arraylen64.pre_cache, align 8, !dereferenceable !880, !invariant.group !4939 %324 = bitcast i64* %forfree41 to i8* call void @free(i8* nonnull %324), !dbg !4948 %325 = load i64, i64* %"iv17'ac", align 8 %forfree56 = load i64*, i64** %arraysize54.pre_cache, align 8, !dereferenceable !880, !invariant.group !4941 %326 = bitcast i64* %forfree56 to i8* call void @free(i8* nonnull %326), !dbg !4948 invertguard_exit379: ; preds = %incinvertL86 } %v.i_replacementA = phi i64 , !dbg !293 julia: /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:3791: bool GradientUtils::legalRecompute(const llvm::Value*, const ValueToValueMapTy&, llvm::IRBuilder<>*, bool, bool) const: Assertion `phi->getNumIncomingValues() != 0' failed. 
[836700] signal (6.-6): Aborted in expression starting at REPL[9]:1 unknown function (ip: 0x7d126491d32c) gsignal at /usr/lib/libc.so.6 (unknown line) abort at /usr/lib/libc.so.6 (unknown line) unknown function (ip: 0x7d12648b43db) __assert_fail at /usr/lib/libc.so.6 (unknown line) legalRecompute at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:3791 lookupM at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:6535 unwrapM at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:1327 lookupM at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:6537 unwrapM at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:930 lookupM at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:6537 unwrapM at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:1066 lookupM at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:6537 unwrapM at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:1088 lookupM at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:6537 branchToCorrespondingTarget at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:7738 createInvertedTerminator at /workspace/srcdir/Enzyme/enzyme/Enzyme/EnzymeLogic.cpp:3611 CreatePrimalAndGradient at /workspace/srcdir/Enzyme/enzyme/Enzyme/EnzymeLogic.cpp:4382 recursivelyHandleSubfunction at /workspace/srcdir/Enzyme/enzyme/Enzyme/AdjointGenerator.h:5744 visitCallInst at /workspace/srcdir/Enzyme/enzyme/Enzyme/AdjointGenerator.h:6611 visit at /opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/InstVisitor.h:111 [inlined] CreatePrimalAndGradient at /workspace/srcdir/Enzyme/enzyme/Enzyme/EnzymeLogic.cpp:4378 EnzymeCreatePrimalAndGradient at /workspace/srcdir/Enzyme/enzyme/Enzyme/CApi.cpp:615 EnzymeCreatePrimalAndGradient at /home/avikpal/.julia/packages/Enzyme/wOi4l/src/api.jl:154 unknown function (ip: 0x7d12341e7f6b) _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at 
/cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 enzyme! at /home/avikpal/.julia/packages/Enzyme/wOi4l/src/compiler.jl:3147 unknown function (ip: 0x7d12341e3828) _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 #codegen#487 at /home/avikpal/.julia/packages/Enzyme/wOi4l/src/compiler.jl:5022 codegen at /home/avikpal/.julia/packages/Enzyme/wOi4l/src/compiler.jl:4444 [inlined] _thunk at /home/avikpal/.julia/packages/Enzyme/wOi4l/src/compiler.jl:5707 _thunk at /home/avikpal/.julia/packages/Enzyme/wOi4l/src/compiler.jl:5707 [inlined] cached_compilation at /home/avikpal/.julia/packages/Enzyme/wOi4l/src/compiler.jl:5741 [inlined] #532 at /home/avikpal/.julia/packages/Enzyme/wOi4l/src/compiler.jl:5807 #JuliaContext#149 at /home/avikpal/.julia/packages/GPUCompiler/kqxyC/src/driver.jl:52 unknown function (ip: 0x7d123414fe06) _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 JuliaContext at /home/avikpal/.julia/packages/GPUCompiler/kqxyC/src/driver.jl:42 #s1946#531 at /home/avikpal/.julia/packages/Enzyme/wOi4l/src/compiler.jl:5759 [inlined] #s1946#531 at ./none:0 _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 GeneratedFunctionStub at ./boot.jl:602 _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 jl_call_staged at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/method.c:540 ijl_code_for_staged at 
/cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/method.c:593 get_staged at ./compiler/utilities.jl:123 retrieve_code_info at ./compiler/utilities.jl:135 [inlined] InferenceState at ./compiler/inferencestate.jl:430 typeinf_edge at ./compiler/typeinfer.jl:920 abstract_call_method at ./compiler/abstractinterpretation.jl:629 abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:95 abstract_call_known at ./compiler/abstractinterpretation.jl:2087 abstract_call at ./compiler/abstractinterpretation.jl:2169 abstract_call at ./compiler/abstractinterpretation.jl:2162 abstract_call at ./compiler/abstractinterpretation.jl:2354 abstract_eval_call at ./compiler/abstractinterpretation.jl:2370 abstract_eval_statement_expr at ./compiler/abstractinterpretation.jl:2380 abstract_eval_statement at ./compiler/abstractinterpretation.jl:2624 abstract_eval_basic_statement at ./compiler/abstractinterpretation.jl:2889 typeinf_local at ./compiler/abstractinterpretation.jl:3098 typeinf_nocycle at ./compiler/abstractinterpretation.jl:3186 _typeinf at ./compiler/typeinfer.jl:247 typeinf at ./compiler/typeinfer.jl:216 typeinf_edge at ./compiler/typeinfer.jl:930 abstract_call_method at ./compiler/abstractinterpretation.jl:629 abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:95 abstract_call_known at ./compiler/abstractinterpretation.jl:2087 abstract_call at ./compiler/abstractinterpretation.jl:2169 abstract_apply at ./compiler/abstractinterpretation.jl:1612 abstract_call_known at ./compiler/abstractinterpretation.jl:2004 abstract_call at ./compiler/abstractinterpretation.jl:2169 abstract_call at ./compiler/abstractinterpretation.jl:2162 abstract_call at ./compiler/abstractinterpretation.jl:2354 abstract_eval_call at ./compiler/abstractinterpretation.jl:2370 abstract_eval_statement_expr at ./compiler/abstractinterpretation.jl:2380 abstract_eval_statement at ./compiler/abstractinterpretation.jl:2624 abstract_eval_basic_statement at 
./compiler/abstractinterpretation.jl:2913 typeinf_local at ./compiler/abstractinterpretation.jl:3098 typeinf_nocycle at ./compiler/abstractinterpretation.jl:3186 _typeinf at ./compiler/typeinfer.jl:247 typeinf at ./compiler/typeinfer.jl:216 typeinf_ext at ./compiler/typeinfer.jl:1051 typeinf_ext_toplevel at ./compiler/typeinfer.jl:1082 typeinf_ext_toplevel at ./compiler/typeinfer.jl:1078 jfptr_typeinf_ext_toplevel_35682.1 at /home/avikpal/.julia/juliaup/julia-1.10.3+0.x64.linux.gnu/lib/julia/sys.so (unknown line) _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 jl_apply at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined] jl_type_infer at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:394 jl_generate_fptr_impl at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/jitlayers.cpp:504 jl_compile_method_internal at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2481 [inlined] jl_compile_method_internal at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2368 _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2887 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 jl_apply at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined] do_call at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/interpreter.c:126 eval_value at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/interpreter.c:223 eval_stmt_value at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/interpreter.c:174 [inlined] eval_body at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/interpreter.c:617 jl_interpret_toplevel_thunk at 
/cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/interpreter.c:775 jl_toplevel_eval_flex at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/toplevel.c:934 jl_toplevel_eval_flex at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/toplevel.c:877 eval_body at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/interpreter.c:579 eval_body at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/interpreter.c:544 jl_interpret_toplevel_thunk at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/interpreter.c:775 jl_toplevel_eval_flex at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/toplevel.c:934 jl_toplevel_eval_flex at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/toplevel.c:877 jl_toplevel_eval_flex at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/toplevel.c:877 ijl_toplevel_eval_in at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/toplevel.c:985 eval at ./boot.jl:385 [inlined] eval_user_input at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:150 repl_backend_loop at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:246 #start_repl_backend#46 at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:231 start_repl_backend at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:228 _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 #run_repl#59 at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:389 run_repl at 
/cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:375 jfptr_run_repl_91734.1 at /home/avikpal/.julia/juliaup/julia-1.10.3+0.x64.linux.gnu/lib/julia/sys.so (unknown line) _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 #1013 at ./client.jl:432 jfptr_YY.1013_82700.1 at /home/avikpal/.julia/juliaup/julia-1.10.3+0.x64.linux.gnu/lib/julia/sys.so (unknown line) _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 jl_apply at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined] jl_f__call_latest at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/builtins.c:812 #invokelatest#2 at ./essentials.jl:892 [inlined] invokelatest at ./essentials.jl:889 [inlined] run_main_repl at ./client.jl:416 exec_options at ./client.jl:333 _start at ./client.jl:552 jfptr__start_82726.1 at /home/avikpal/.julia/juliaup/julia-1.10.3+0.x64.linux.gnu/lib/julia/sys.so (unknown line) _jl_invoke at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/gf.c:3077 jl_apply at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined] true_main at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/jlapi.c:582 jl_repl_entrypoint at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/src/jlapi.c:731 main at /cache/build/builder-amdci4-2/julialang/julia-release-1-dot-10/cli/loader_exe.c:58 unknown function (ip: 0x7d12648b5ccf) __libc_start_main at /usr/lib/libc.so.6 (unknown line) unknown function (ip: 0x4010b8) Allocations: 
41003533 (Pool: 40954080; Big: 49453); GC: 48 ```
avik-pal commented 6 months ago

@wsmoses I added the crash logs. The one with the LLVM error seems to be too long, and I can't figure out how to redirect the logs when Julia crashes.
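One possible way to capture the output of a crashing session is to run the reproducer in a child process and send both of its output streams to a file. This is a minimal sketch, not something taken from the thread: the helper name `run_and_capture` and the file names `repro.jl` and `crash.log` are hypothetical.

```julia
# Hypothetical helper: run `cmd` in a child process and write its stdout and
# stderr to the file at `logpath`, so the text survives even if the child aborts.
function run_and_capture(cmd::Cmd, logpath::AbstractString)
    open(logpath, "w") do io
        # ignorestatus keeps the parent from throwing when the child exits
        # with a nonzero status or is killed by a signal.
        run(pipeline(ignorestatus(cmd); stdout=io, stderr=io))
    end
    return logpath
end

# Example (assumes the reproducer was saved as repro.jl, a hypothetical name):
# run_and_capture(`$(Base.julia_cmd()) repro.jl`, "crash.log")
```

The assertion message and the signal backtrace are printed by the child process, so they should end up in the log file along with any dumped IR.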

wsmoses commented 6 months ago

@avik-pal I need to double check, but I think the latter one is actually a bug in Polyester. It emits a gc preserve begin without a gc preserve end.

I've also been warned that Polyester messes with LLVM in ways that generate invalid code (which is indeed the case here).

cc @vchuravy

wsmoses commented 6 months ago

also @avik-pal

julia> act = gelu
ERROR: UndefVarError: `gelu` not defined
Stacktrace:
 [1] top-level scope
   @ REPL[4]:1
wsmoses commented 6 months ago

Okay, I have confirmed that the latter is unquestionably a bug in Polyester, not Enzyme.

polyester.ll.txt

Specifically, I ran:

julia> @code_llvm optimize=false raw=true loss_function(act, y, b)

You will see that there is a gc preserve begin

        %380 = call token (...) @llvm.julia.gc_preserve_begin({} addrspace(10)* %379, [2 x {} addrspace(10)*] %357), !dbg !473

which is never used and has no matching gc_preserve_end.
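A rough way to look for this pattern in other functions is to dump the unoptimized IR as text and count the two intrinsics. The sketch below is only a heuristic under stated assumptions: `gc_preserve_counts` and `first_elem` are hypothetical names, the toy function is not the code from this issue, and a plain text count cannot prove that each begin is actually paired with an end on every path.

```julia
using InteractiveUtils  # provides code_llvm

# Hypothetical helper: count textual occurrences of the GC-preserve intrinsics
# in the unoptimized LLVM IR of `f` for the given argument types.
function gc_preserve_counts(f, types)
    ir = sprint(io -> code_llvm(io, f, types; optimize=false, raw=true))
    begins = count("gc_preserve_begin", ir)
    ends = count("gc_preserve_end", ir)
    return (begins=begins, ends=ends)
end

# Toy function (not from the issue) that uses GC.@preserve.
function first_elem(v::Vector{Float32})
    GC.@preserve v begin
        unsafe_load(pointer(v))
    end
end

counts = gc_preserve_counts(first_elem, Tuple{Vector{Float32}})
println(counts)
```

If the begin count is nonzero and the end count is zero, the IR is worth inspecting by hand, as was done above with `@code_llvm optimize=false raw=true`.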

wsmoses commented 6 months ago

Posted here https://github.com/JuliaSIMD/Polyester.jl/issues/145

wsmoses commented 6 months ago

@avik-pal given that this issue is premised on invalid LLVM IR to begin with, I'm going to close it.

Please reopen if I'm mistaken and it can be reproduced without the invalid LLVM.