Closed aterenin closed 6 months ago
Some more info. `2*pi` plays no role: the same problem occurs if it is replaced with `1.0f0`. Calling `(x -> mod.(x, 1.0f0)|>sum)(CuArrays.randn(10,10))` directly works without issue, so autodiff does play a role here. Adjoint code output below.
julia> Zygote.@code_adjoint (x -> mod.(x, 1.0f0)|>sum)(CuArrays.randn(10,10))
Zygote.Adjoint(1: (%3, %4 :: Zygote.Context, %1, %2)
%5 = Zygote._pullback(%4, Base.broadcasted, Main.mod, %2, 1.0f0)
%6 = Base.getindex(%5, 1)
%7 = Base.getindex(%5, 2)
%8 = Zygote._pullback(%4, Base.materialize, %6)
%9 = Base.getindex(%8, 1)
%10 = Base.getindex(%8, 2)
%11 = Zygote._pullback(%4, Main.:|>, %9, Main.sum)
%12 = Base.getindex(%11, 1)
%13 = Base.getindex(%11, 2)
return %12, 1: (%1)
%2 = (@13)(%1)
%3 = Zygote.gradindex(%2, 2)
%4 = (@10)(%3)
%5 = Zygote.gradindex(%4, 2)
%6 = (@7)(%5)
%7 = Zygote.gradindex(%6, 3)
%8 = Zygote.tuple(nothing, %7)
return %8)
Here's a much smaller MWE based on the above adjoint code.
v2 = CuArrays.randn(10,10)
v4 = Zygote.Context()
v5 = Zygote._pullback(v4, Base.broadcasted, Main.mod, v2, 1.0f0)
Here's the code Zygote generates. It's not very easy for me to see what is actually going on here.
julia> @code_typed Zygote._pullback(v4, Base.broadcasted, Main.mod, v2, 1.0f0)
CodeInfo(
1 ─ %1 = Base.getfield(args, 2)::CuArray{Float32,2,CuArray{Float32,1,Nothing}}
│ %2 = Base.getfield(args, 3)::Float32
└── goto #3
2 ─ $(Expr(:meta, :inline))
3 ┄ goto #5
4 ─ $(Expr(:meta, :inline))
5 ┄ %7 = Core.tuple(%1, %2)::Tuple{CuArray{Float32,2,CuArray{Float32,1,Nothing}},Float32}
│ %8 = Core.tuple(%1, %2)::Tuple{CuArray{Float32,2,CuArray{Float32,1,Nothing}},Float32}
│ %9 = Base.getfield(%1, :dims)::Tuple{Int64,Int64}
│ %10 = Base.getfield(%9, 1, true)::Int64
│ %11 = Base.slt_int(%10, 0)::Bool
│ %12 = Base.ifelse(%11, 0, %10)::Int64
│ %13 = %new(Base.OneTo{Int64}, %12)::Base.OneTo{Int64}
│ %14 = Base.getfield(%9, 2, true)::Int64
│ %15 = Base.slt_int(%14, 0)::Bool
│ %16 = Base.ifelse(%15, 0, %14)::Int64
│ %17 = %new(Base.OneTo{Int64}, %16)::Base.OneTo{Int64}
│ %18 = Core.tuple(%13, %17)::Tuple{Base.OneTo{Int64},Base.OneTo{Int64}}
│ %19 = %new(Base.Broadcast.Broadcasted{CuArrays.CuArrayStyle{2},Tuple{Base.OneTo{Int64},Base.OneTo{Int64}},Zygote.var"#1692#1695"{typeof(mod)},Tuple{CuArray{Float32,2,CuArray{Float32,1,
Nothing}},Float32}}, Zygote.var"#1692#1695"{typeof(mod)}(mod), %8, %18)::Base.Broadcast.Broadcasted{CuArrays.CuArrayStyle{2},Tuple{Base.OneTo{Int64},Base.OneTo{Int64}},Zygote.var"#1692#169
5"{typeof(mod)},Tuple{CuArray{Float32,2,CuArray{Float32,1,Nothing}},Float32}}
│ %20 = invoke Base.Broadcast.copy(%19::Base.Broadcast.Broadcasted{CuArrays.CuArrayStyle{2},Tuple{Base.OneTo{Int64},Base.OneTo{Int64}},Zygote.var"#1692#1695"{typeof(mod)},Tuple{CuArray{F
loat32,2,CuArray{Float32,1,Nothing}},Float32}})::CuArray{ForwardDiff.Dual{Nothing,Float64,2},2,Nothing}
│ %21 = Core.tuple(%20)::Tuple{CuArray{ForwardDiff.Dual{Nothing,Float64,2},2,Nothing}}
│ %22 = Base.getfield(%20, :dims)::Tuple{Int64,Int64}
│ %23 = Base.getfield(%22, 1, true)::Int64
│ %24 = Base.slt_int(%23, 0)::Bool
│ %25 = Base.ifelse(%24, 0, %23)::Int64
│ %26 = %new(Base.OneTo{Int64}, %25)::Base.OneTo{Int64}
│ %27 = Base.getfield(%22, 2, true)::Int64
│ %28 = Base.slt_int(%27, 0)::Bool
│ %29 = Base.ifelse(%28, 0, %27)::Int64
│ %30 = %new(Base.OneTo{Int64}, %29)::Base.OneTo{Int64}
│ %31 = Core.tuple(%26, %30)::Tuple{Base.OneTo{Int64},Base.OneTo{Int64}}
│ %32 = %new(Base.Broadcast.Broadcasted{CuArrays.CuArrayStyle{2},Tuple{Base.OneTo{Int64},Base.OneTo{Int64}},Zygote.var"#1699#1703",Tuple{CuArray{ForwardDiff.Dual{Nothing,Float64,2},2,Not
hing}}}, Zygote.var"#1699#1703"(), %21, %31)::Base.Broadcast.Broadcasted{CuArrays.CuArrayStyle{2},Tuple{Base.OneTo{Int64},Base.OneTo{Int64}},Zygote.var"#1699#1703",Tuple{CuArray{ForwardDif
f.Dual{Nothing,Float64,2},2,Nothing}}}
│ %33 = invoke Base.Broadcast.copy(%32::Base.Broadcast.Broadcasted{CuArrays.CuArrayStyle{2},Tuple{Base.OneTo{Int64},Base.OneTo{Int64}},Zygote.var"#1699#1703",Tuple{CuArray{ForwardDiff.Du
al{Nothing,Float64,2},2,Nothing}}})::CuArray{Float64,2,Nothing}
│ %34 = %new(Zygote.var"#_back#1704"{Tuple{CuArray{Float32,2,CuArray{Float32,1,Nothing}},Float32},CuArray{ForwardDiff.Dual{Nothing,Float64,2},2,Nothing}}, %7, %20)::Zygote.var"#_back#170
4"{Tuple{CuArray{Float32,2,CuArray{Float32,1,Nothing}},Float32},CuArray{ForwardDiff.Dual{Nothing,Float64,2},2,Nothing}}
│ %35 = %new(Zygote.var"#back#1706"{2,Zygote.var"#_back#1704"{Tuple{CuArray{Float32,2,CuArray{Float32,1,Nothing}},Float32},CuArray{ForwardDiff.Dual{Nothing,Float64,2},2,Nothing}}}, %34):
:Zygote.var"#back#1706"{2,Zygote.var"#_back#1704"{Tuple{CuArray{Float32,2,CuArray{Float32,1,Nothing}},Float32},CuArray{ForwardDiff.Dual{Nothing,Float64,2},2,Nothing}}}
│ %36 = %new(Zygote.var"#1868#1873"{Zygote.var"#back#1706"{2,Zygote.var"#_back#1704"{Tuple{CuArray{Float32,2,CuArray{Float32,1,Nothing}},Float32},CuArray{ForwardDiff.Dual{Nothing,Float64
,2},2,Nothing}}}}, %35)::Zygote.var"#1868#1873"{Zygote.var"#back#1706"{2,Zygote.var"#_back#1704"{Tuple{CuArray{Float32,2,CuArray{Float32,1,Nothing}},Float32},CuArray{ForwardDiff.Dual{Nothi
ng,Float64,2},2,Nothing}}}}
│ %37 = %new(Zygote.var"#72#back#1874"{Zygote.var"#1868#1873"{Zygote.var"#back#1706"{2,Zygote.var"#_back#1704"{Tuple{CuArray{Float32,2,CuArray{Float32,1,Nothing}},Float32},CuArray{Forwar
dDiff.Dual{Nothing,Float64,2},2,Nothing}}}}}, %36)::Zygote.var"#72#back#1874"{Zygote.var"#1868#1873"{Zygote.var"#back#1706"{2,Zygote.var"#_back#1704"{Tuple{CuArray{Float32,2,CuArray{Float3
2,1,Nothing}},Float32},CuArray{ForwardDiff.Dual{Nothing,Float64,2},2,Nothing}}}}}
│ %38 = %new(Zygote.var"#173#174"{Zygote.var"#72#back#1874"{Zygote.var"#1868#1873"{Zygote.var"#back#1706"{2,Zygote.var"#_back#1704"{Tuple{CuArray{Float32,2,CuArray{Float32,1,Nothing}},Fl
oat32},CuArray{ForwardDiff.Dual{Nothing,Float64,2},2,Nothing}}}}},Tuple{NTuple{4,Nothing},Tuple{}}}, %37, ((nothing, nothing, nothing, nothing), ()))::Zygote.var"#173#174"{Zygote.var"#72#b
ack#1874"{Zygote.var"#1868#1873"{Zygote.var"#back#1706"{2,Zygote.var"#_back#1704"{Tuple{CuArray{Float32,2,CuArray{Float32,1,Nothing}},Float32},CuArray{ForwardDiff.Dual{Nothing,Float64,2},2
,Nothing}}}}},Tuple{NTuple{4,Nothing},Tuple{}}}
│ %39 = %new(Zygote.var"#334#back#175"{Zygote.var"#173#174"{Zygote.var"#72#back#1874"{Zygote.var"#1868#1873"{Zygote.var"#back#1706"{2,Zygote.var"#_back#1704"{Tuple{CuArray{Float32,2,CuAr
ray{Float32,1,Nothing}},Float32},CuArray{ForwardDiff.Dual{Nothing,Float64,2},2,Nothing}}}}},Tuple{NTuple{4,Nothing},Tuple{}}}}, %38)::Zygote.var"#334#back#175"{Zygote.var"#173#174"{Zygote.
var"#72#back#1874"{Zygote.var"#1868#1873"{Zygote.var"#back#1706"{2,Zygote.var"#_back#1704"{Tuple{CuArray{Float32,2,CuArray{Float32,1,Nothing}},Float32},CuArray{ForwardDiff.Dual{Nothing,Flo
at64,2},2,Nothing}}}}},Tuple{NTuple{4,Nothing},Tuple{}}}}
│ %40 = Base.tuple($(QuoteNode(∂(broadcastable))), Zygote.var"#334#back#175"{Zygote.var"#173#174"{Zygote.var"#1611#1613",Tuple{Tuple{Nothing,Nothing},Tuple{}}}}(Zygote.var"#173#174"{Zygo
te.var"#1611#1613",Tuple{Tuple{Nothing,Nothing},Tuple{}}}(Zygote.var"#1611#1613"(), ((nothing, nothing), ()))), Zygote.var"#237#back#127"{typeof(identity)}(identity), %39, Zygote.var"#237#
back#127"{typeof(identity)}(identity), $(QuoteNode(∂(broadcastable))), Zygote.var"#3039#back#1181"{Zygote.var"#1174#1178"}(Zygote.var"#1174#1178"()))::Core.Compiler.PartialStruct(Tuple{typ
eof(∂(broadcastable)),Zygote.var"#334#back#175"{Zygote.var"#173#174"{Zygote.var"#1611#1613",Tuple{Tuple{Nothing,Nothing},Tuple{}}}},Zygote.var"#237#back#127"{typeof(identity)},Zygote.var"#
334#back#175"{Zygote.var"#173#174"{Zygote.var"#72#back#1874"{Zygote.var"#1868#1873"{Zygote.var"#back#1706"{2,Zygote.var"#_back#1704"{Tuple{CuArray{Float32,2,CuArray{Float32,1,Nothing}},Flo
at32},CuArray{ForwardDiff.Dual{Nothing,Float64,2},2,Nothing}}}}},Tuple{NTuple{4,Nothing},Tuple{}}}},Zygote.var"#237#back#127"{typeof(identity)},typeof(∂(broadcastable)),Zygote.var"#3039#ba
ck#1181"{Zygote.var"#1174#1178"}}, Any[Core.Compiler.Const(∂(broadcastable), false), Core.Compiler.Const(Zygote.var"#334#back#175"{Zygote.var"#173#174"{Zygote.var"#1611#1613",Tuple{Tuple{N
othing,Nothing},Tuple{}}}}(Zygote.var"#173#174"{Zygote.var"#1611#1613",Tuple{Tuple{Nothing,Nothing},Tuple{}}}(Zygote.var"#1611#1613"(), ((nothing, nothing), ()))), false), Core.Compiler.Co
nst(Zygote.var"#237#back#127"{typeof(identity)}(identity), false), Zygote.var"#334#back#175"{Zygote.var"#173#174"{Zygote.var"#72#back#1874"{Zygote.var"#1868#1873"{Zygote.var"#back#1706"{2,
Zygote.var"#_back#1704"{Tuple{CuArray{Float32,2,CuArray{Float32,1,Nothing}},Float32},CuArray{ForwardDiff.Dual{Nothing,Float64,2},2,Nothing}}}}},Tuple{NTuple{4,Nothing},Tuple{}}}}, Core.Com
piler.Const(Zygote.var"#237#back#127"{typeof(identity)}(identity), false), Core.Compiler.Const(∂(broadcastable), false), Core.Compiler.Const(Zygote.var"#3039#back#1181"{Zygote.var"#1174#11
78"}(Zygote.var"#1174#1178"()), false)])
│ %41 = %new(typeof(∂(broadcasted)), %40)::typeof(∂(broadcasted))
│ %42 = Base.tuple(%33, %41)::Core.Compiler.PartialStruct(Tuple{CuArray{Float64,2,Nothing},typeof(∂(broadcasted))}, Any[CuArray{Float64,2,Nothing}, Core.Compiler.PartialStruct(typeof(∂(b
roadcasted)), Any[Core.Compiler.PartialStruct(Tuple{typeof(∂(broadcastable)),Zygote.var"#334#back#175"{Zygote.var"#173#174"{Zygote.var"#1611#1613",Tuple{Tuple{Nothing,Nothing},Tuple{}}}},Z
ygote.var"#237#back#127"{typeof(identity)},Zygote.var"#334#back#175"{Zygote.var"#173#174"{Zygote.var"#72#back#1874"{Zygote.var"#1868#1873"{Zygote.var"#back#1706"{2,Zygote.var"#_back#1704"{
Tuple{CuArray{Float32,2,CuArray{Float32,1,Nothing}},Float32},CuArray{ForwardDiff.Dual{Nothing,Float64,2},2,Nothing}}}}},Tuple{NTuple{4,Nothing},Tuple{}}}},Zygote.var"#237#back#127"{typeof(
identity)},typeof(∂(broadcastable)),Zygote.var"#3039#back#1181"{Zygote.var"#1174#1178"}}, Any[Core.Compiler.Const(∂(broadcastable), false), Core.Compiler.Const(Zygote.var"#334#back#175"{Zy
gote.var"#173#174"{Zygote.var"#1611#1613",Tuple{Tuple{Nothing,Nothing},Tuple{}}}}(Zygote.var"#173#174"{Zygote.var"#1611#1613",Tuple{Tuple{Nothing,Nothing},Tuple{}}}(Zygote.var"#1611#1613"(
), ((nothing, nothing), ()))), false), Core.Compiler.Const(Zygote.var"#237#back#127"{typeof(identity)}(identity), false), Zygote.var"#334#back#175"{Zygote.var"#173#174"{Zygote.var"#72#back
#1874"{Zygote.var"#1868#1873"{Zygote.var"#back#1706"{2,Zygote.var"#_back#1704"{Tuple{CuArray{Float32,2,CuArray{Float32,1,Nothing}},Float32},CuArray{ForwardDiff.Dual{Nothing,Float64,2},2,No
thing}}}}},Tuple{NTuple{4,Nothing},Tuple{}}}}, Core.Compiler.Const(Zygote.var"#237#back#127"{typeof(identity)}(identity), false), Core.Compiler.Const(∂(broadcastable), false), Core.Compile
r.Const(Zygote.var"#3039#back#1181"{Zygote.var"#1174#1178"}(Zygote.var"#1174#1178"()), false)])])])
└── return %42
6 ─ $(Expr(:meta, :inline))
) => Tuple{CuArray{Float64,2,Nothing},typeof(∂(broadcasted))}
Does this only happen with CuArrays 2, or does it also happen with CuArrays 1.7?
I'm not sure. But defining the following adjoint works around the crash.
@adjoint broadcasted(::typeof(mod), x::Numeric, y::Numeric) = mod.(x,y), Δ -> (nothing, Δ, .-floor.(x./y).*Δ)
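For reference, the workaround above appears to follow from treating `mod` as piecewise linear. This is a sketch of the reasoning, assuming Julia's floored definition of `mod` and ignoring the points where `floor(x/y)` jumps:

$$
\operatorname{mod}(x, y) = x - y \left\lfloor \frac{x}{y} \right\rfloor
\quad\Longrightarrow\quad
\frac{\partial}{\partial x}\operatorname{mod}(x, y) = 1,
\qquad
\frac{\partial}{\partial y}\operatorname{mod}(x, y) = -\left\lfloor \frac{x}{y} \right\rfloor
$$

Away from the discontinuities the floor term is locally constant, so the pullback maps an incoming sensitivity $\Delta$ to $(\Delta,\ -\lfloor x/y \rfloor \Delta)$ for `x` and `y`, with `nothing` for the function argument itself. That matches the three-element tuple returned by the adjoint.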
I get the following:
julia> v5 = Zygote._pullback(v4, Base.broadcasted, Main.mod, v2, 1.0f0)
ERROR: GPU compilation of broadcast(CuArrays.CuKernelContext, CUDAnative.CuDeviceArray{Tuple{Float32,Zygote.var"#1611#back#614"{Zygote.var"#612#613"{Float32,Float32}}},2,CUDAnative.AS.Global}, Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64},Base.OneTo{Int64}},Zygote.var"#1666#1673"{Zygote.Context,typeof(mod)},Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,2,CUDAnative.AS.Global},Tuple{Bool,Bool},Tuple{Int64,Int64}},Float32}}) failed
KernelError: passing and using non-bitstype argument
Argument 4 to your kernel function is of type Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64},Base.OneTo{Int64}},Zygote.var"#1666#1673"{Zygote.Context,typeof(mod)},Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,2,CUDAnative.AS.Global},Tuple{Bool,Bool},Tuple{Int64,Int64}},Float32}}.
That type is not isbits, and such arguments are only allowed when they are unused by the kernel.
.f is of type Zygote.var"#1666#1673"{Zygote.Context,typeof(mod)} which is not isbits.
.__context__ is of type Zygote.Context which is not isbits.
.cache is of type Union{Nothing, IdDict{Any,Any}} which is not isbits.
I'm not sure how this gets past the validator in your case, but accessing non-isbits data like that (which is passed by pointer) will result in CPU pointers getting used on the GPU, resulting in illegal memory accesses.
I don't quite follow. I've generally seen that error when accidentally passing an `Array` instead of a `CuArray`, for instance `CuArrays.randn(5,5) .+ randn(1,1)`. But this isn't what we're doing here: the value `1.0f0` is a scalar value, not a pointer to an array. Maybe the broadcasting machinery is somehow erroneously turning it into one?
No, this is the `mod` function that gets closed over by Zygote to include a non-isbits cache: `Zygote.var"#1666#1673"{Zygote.Context,typeof(mod)}`. The isbits requirement doesn't only apply to arrays.
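The point about closures can be checked on the CPU without Zygote or a GPU. The following is a minimal sketch, not Zygote's actual code: `FakeContext` and `WithContext` are hypothetical stand-ins for `Zygote.Context` and the generated closure type, chosen only to mirror the `cache::Union{Nothing, IdDict{Any,Any}}` field named in the error message.

```julia
# Hypothetical stand-in for Zygote.Context: a mutable struct with an optional cache.
mutable struct FakeContext
    cache::Union{Nothing,IdDict{Any,Any}}
end

# Hypothetical stand-in for the closure Zygote generates: a callable that
# carries the context alongside the function being differentiated.
struct WithContext{C,F}
    ctx::C
    f::F
end
(w::WithContext)(x, y) = w.f(x, y)

plain = mod
wrapped = WithContext(FakeContext(nothing), mod)

println(isbitstype(typeof(plain)))    # true: typeof(mod) is a fieldless singleton type
println(isbitstype(typeof(wrapped)))  # false: it holds a reference to a mutable struct
println(wrapped(5.5f0, 1.0f0) == mod(5.5f0, 1.0f0))  # true: same result on the CPU
```

The wrapped callable behaves identically on the CPU, but because its type is not isbits it cannot be passed by value to a GPU kernel, which is consistent with the `KernelError` reported above.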
I can reproduce the issue on CuArrays 1.7.
> No, this is the `mod` function that gets closed over by Zygote to include a non-isbits cache: `Zygote.var"#1666#1673"{Zygote.Context,typeof(mod)}`. Doesn't only apply to arrays.
Ah I see, thanks. This also explains why I was getting different types than Zygote in my output when I was trying to reproduce this issue. On the other hand, why does Zygote need a cache in this case?
cc @MikeInnes
Going to close this as stale. Feel free to open a new issue if the problem still exists.
Describe the bug
Gradients for modular arithmetic trigger an illegal address error. No issues with any other type of broadcasting at present.
To Reproduce
Expected behavior
No error.
Build log
I ran `]build CuArrays` but this only produced output for NNlib.
Environment details
Details on Julia:
Julia packages:
The Zygote branch fixes broadcasting and is from this PR: https://github.com/FluxML/Zygote.jl/pull/565.
CUDA: toolkit and driver version:
Additional context
error in running finalizer: CUDAdrv.CuError(code=CUDAdrv.cudaError_enum(0x000002bc), meta=nothing)