Since you're modifying x in place, you also need to pass a shadow x, which the simpler gradient wrapper doesn't support.
Since the gradient wrapper you're using uses vector mode, I've also done so below:
values(only(autodiff(Forward, (x,p) -> sum(f(x, p)), BatchDuplicatedNoNeed, BatchDuplicated(x, ((zero(x) for _ in 1:length(params))...,)), BatchDuplicated(params, onehot(params)))))
You could also do the same by using a Duplicated closure.
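A rough sketch of what that closure variant could look like for a single tangent direction (make_g is a hypothetical helper so the primal and shadow closures share one concrete type; f, x, and params are the ones from above):

make_g(xbuf) = p -> sum(f(xbuf, p))
g  = make_g(x)         # closure capturing the real work buffer
dg = make_g(zero(x))   # shadow closure capturing a zeroed buffer of the same shape
# one directional derivative, seeded along the first parameter:
autodiff(Forward, Duplicated(g, dg), DuplicatedNoNeed, Duplicated(params, Enzyme.onehot(params)[1]))

The same BatchDuplicated/onehot seeding as above applies if you want all partials in one call.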
Thanks!
Since you're modifying x in place, you also need to pass a shadow x, which the simpler gradient wrapper doesn't support.
Not sure I understand this point. I understand that this:
@. x = ...
is modifying x in-place, but isn't this:
x = @. ...
not modifying x in-place? (i.e., the @. should only apply to what comes after). The function I pasted above uses the second one.
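A minimal illustration of the difference (a sketch, unrelated to the original f):

x = [1.0, 2.0]
y = [1.0, 2.0]
@. x = x + 1    # in-place: writes into the existing array bound to x
y = @. y + 1    # out-of-place: allocates a new array and rebinds the name y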
By the way, sorry to stay on the same issue for a follow-up question, but how would I modify your code to work for Reverse? It doesn't seem to work as-is:
+ return values(only(autodiff(Reverse, (x, p) -> sum(f(x, p)), BatchDuplicatedNoNeed, BatchDuplicated(x, ((zero(x) for _ in 1:length(params))...,)), BatchDuplicated(params, onehot(params)))))
- return values(only(autodiff(Forward, (x, p) -> sum(f(x, p)), BatchDuplicatedNoNeed, BatchDuplicated(x, ((zero(x) for _ in 1:length(params))...,)), BatchDuplicated(params, onehot(params)))))
gives
ERROR: Duplicated Returns not yet handled
Stacktrace:
[1] autodiff(::EnzymeCore.ReverseMode, ::var"#11#13", ::Type{BatchDuplicatedNoNeed}, ::BatchDuplicated{Vector{Float64}, 1}, ::Vararg{BatchDuplicated{Vector{Float64}, 1}})
@ Enzyme ~/.julia/packages/Enzyme/DIkTv/src/Enzyme.jl:209
[2] reverse(x::Vector{Float64}, params::Vector{Float64})
@ Main ~/forward_vs_reverse/benchmark.jl:37
[3] top-level scope
@ ~/forward_vs_reverse/benchmark.jl:42
dparams = zero(params)
autodiff(Reverse, (x, p) -> sum(f(x, p)), Active, Duplicated(x, zero(x)), Duplicated(params, dparams))
return dparams
Here, since reverse mode gives you the entire derivative in one sweep, you don't need batching. Also, regarding the "Duplicated Returns not yet handled" error you saw -- note that the return activity of this call is changed to Active.
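For completeness, here is roughly how that fragment could sit inside a wrapper like the reverse function from the stacktrace above (the name enzyme_reverse is mine; f, x, and params are the benchmark's):

function enzyme_reverse(x, params)
    dparams = zero(params)
    # Active return activity for the scalar sum; shadow buffers for both mutable arrays.
    autodiff(Reverse, (x, p) -> sum(f(x, p)), Active,
             Duplicated(x, zero(x)), Duplicated(params, dparams))
    return dparams   # the gradient of sum(f(x, params)) with respect to params
end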
That was a misreading on my part. If f is not modifying x in place, you can also get away with the following:
I'll investigate the bug now.
values(only(autodiff(Forward, (x,p) -> sum(f(x, p)), BatchDuplicatedNoNeed, Const(x), BatchDuplicated(params, onehot(params)))))
dparams = zero(params)
autodiff(Reverse, (x, p) -> sum(f(x, p)), Active, Const(x), Duplicated(params, dparams))
return dparams
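For reference, the onehot seeding used in the forward-mode calls expands to one tangent lane per parameter; a small sketch assuming a two-element params, as in the repros below:

seeds = Enzyme.onehot(params)    # ([1.0, 0.0], [0.0, 1.0]) for a 2-element params
BatchDuplicated(params, seeds)   # one forward tangent per parameter, hence one partial per lane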
Minimized error in activity analysis:
using Enzyme, ForwardDiff

function f(p, len)
    i = 1
    x = Vector{Float64}(undef, 1)
    while true
        tmp = Vector{Float64}(undef, 1)
        @inbounds tmp[1] = @inbounds x[1] + p
        x = tmp
        if i == len
            break
        end
        i += 1
    end
    return @inbounds x[1]
end

function enzyme_forward(x, params)
    return Enzyme.autodiff(Enzyme.Forward, f, DuplicatedNoNeed, Duplicated(1.0, 1.0), Const(2))
end

# Enzyme.API.printall!(true)
# Enzyme.API.printactivity!(true)
@show enzyme_forward([1.0], [1.0, 0.5])
other minimization:
using Enzyme

function f(x, params, len)
    i = 1
    # x = Vector{Float64}(undef, 1)
    while true
        tmp = Vector{Float64}(undef, 1)
        @inbounds tmp[1] = @inbounds x[1] + @inbounds params[i]
        x = tmp
        if i == len
            break
        end
        i += 1
    end
    return @inbounds x[1]
end

function enzyme_forward(x, params)
    return Enzyme.autodiff(Enzyme.Forward, f, DuplicatedNoNeed, Const(x), Duplicated(params, [1.0, 0.0]), Const(2))
end

Enzyme.API.printall!(true)
@show enzyme_forward([1.0], [1.0, 0.5])
# @show forward([1.0], [1.0, 0.5])
Hi @wsmoses, I am not sure if it is related, but I also hit a "Duplicated Returns not yet handled" error on this example (https://github.com/JuliaSmoothOptimizers/OptimizationProblems.jl/blob/main/src/ADNLPProblems/elec.jl), which might be simpler:
function f(x; n = n)
    return sum(
        sum(
            1 / sqrt((x[j] - x[i])^2 + (x[n + j] - x[n + i])^2 + (x[2n + j] - x[2n + i])^2) for
            j = (i + 1):n
        ) for i = 1:(n - 1)
    )
end

x = zeros(600)
g = similar(x)
gradient!(Reverse, g, f, x)
@tmigot that's an unrelated/separate thing where your function is type unstable in its return. Add Float64 as the forced return type and retry?
e.g.
function f(x; n = n)::Float64
    return sum(
        sum(
            1 / sqrt((x[j] - x[i])^2 + (x[n + j] - x[n + i])^2 + (x[2n + j] - x[2n + i])^2) for
            j = (i + 1):n
        ) for i = 1:(n - 1)
    )
end

x = zeros(600)
g = similar(x)
gradient!(Reverse, g, f, x)
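If it helps, one way to see the instability being described (a sketch; it assumes n is a non-const global, as in the linked file):

n = 4
x = zeros(3n)
@code_warntype f(x)   # the inferred return type shows as Any without ::Float64, concrete with it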
Also, I'm going to highly recommend using the autodiff interface:
x = rand(600)    # whatever your input array is
dx = zeros(600)  # the shadow must be zero-initialized; the gradient is accumulated into it
Enzyme.autodiff(Reverse, f, Active, Duplicated(x, dx))
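After the call, dx holds the gradient of f at x. A quick sanity check against the gradient! wrapper used above (a sketch, assuming the annotated f and a non-degenerate x such as the rand(600) above):

g2 = similar(x)
gradient!(Reverse, g2, f, x)
dx ≈ g2   # should hold, since dx was zero-initialized before the autodiff call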
Oh, I see. Thanks!
Sorry for the late reply. I feel a bit more comfortable with the syntax so I'm hoping to get to this soon.
By the way, congrats on the defense @wsmoses!
As this now successfully throws a runtime activity error, I'm going to close this (and it succeeds with runtime activity enabled).
@vchuravy I am trying to create an example for the docs as you suggested on my tweet: https://twitter.com/MilesCranmer/status/1643782144941080576?s=20
I was wondering if you could help me with this? Here is what I have so far:
However, this immediately gives the incorrect gradient:
versus
There are no warning or error messages either, it's just incorrect. Any idea why? Does it not like closures?
If I unroll f manually and insert x, it works:

Cheers,
Miles