Closed ludoro closed 4 years ago
Mh, interesting question. I am not familiar with Surrogates.jl
, but it appears it's using Zygote
. Have you tried just computing the derivative with PolyChaos
und Zygote
?
In any case, evaluate()
evaluates the orthogonal polynomial based on the three-term recurrence formula. In principle, I don't think this should pose any difficulties for Zygote
.
I'm happy to try it out myself, but I am currently on paternal leave with almost all of my time taken up by the younglings.. ;)
I gave it a very quick try:
julia> using PolyChaos, Zygote
julia> op = GaussOrthoPoly(3)
GaussOrthoPoly{Array{Float64,1},GaussMeasure,Quad{Float64,Array{Float64,1}}}(3, [0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 2.0, 3.0], GaussMeasure(PolyChaos.w_gaussian, (-Inf, Inf), true), Quad{Float64,Array{Float64,1}}("golubwelsch", 3, [-1.732050807568877, -2.6908000062985024e-16, 1.7320508075688774], [0.16666666666666677, 0.6666666666666663, 0.16666666666666657]))
julia> showbasis(op)
1
x
x^2 - 1.0
x^3 - 3.0x
julia> gradient(x -> evaluate(2, x, op), 1)
(2.0,)
julia> gradient(x -> evaluate(3, x, op), 2)[1] == 3*2^2 - 3
true
That's good!
Thanks a lot for your super fast answer.
Yep, I have no problem with in 1D. However I am struggling with replicating this taking the gradient with respect to a multivariable input, using MultiOrthoPoly.
Say for example:
op1 = GaussOrthoPoly(3)
op2 = GaussOrthoPoly(2)
op = [op1,op2]
mop = MultiOrthoPoly(op,2)
gradient(x->evaluate(x,mop)[1],collect([1.0,1.0]))
This gives me error:
ERROR: Mutating arrays is not supported
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] (::Zygote.var"#459#460")(::Array{Float64,2}) at /Users/ludovicobessi/.julia/packages/Zygote/1GXzF/src/lib/array.jl:67
[3] (::Zygote.var"#1009#back#461"{Zygote.var"#459#460"})(::Array{Float64,2}) at /Users/ludovicobessi/.julia/packages/ZygoteRules/6nssF/src/adjoint.jl:49
[4] Array at ./array.jl:541 [inlined]
[5] (::typeof(∂(Array{Float64,2})))(::Array{Float64,2}) at /Users/ludovicobessi/.julia/packages/Zygote/1GXzF/src/compiler/interface2.jl:0
[6] Array at ./boot.jl:427 [inlined]
[7] (::typeof(∂(Array{T,2} where T)))(::Array{Float64,2}) at /Users/ludovicobessi/.julia/packages/Zygote/1GXzF/src/compiler/interface2.jl:0
[8] |> at ./operators.jl:823 [inlined]
[9] (::typeof(∂(|>)))(::Array{Float64,2}) at /Users/ludovicobessi/.julia/packages/Zygote/1GXzF/src/compiler/interface2.jl:0
[10] evaluate at /Users/ludovicobessi/.julia/packages/PolyChaos/Dhyx4/src/evaluate.jl:115 [inlined]
[11] (::typeof(∂(evaluate)))(::Array{Float64,2}) at /Users/ludovicobessi/.julia/packages/Zygote/1GXzF/src/compiler/interface2.jl:0
[12] (::Zygote.var"#175#176"{typeof(∂(evaluate)),Tuple{Tuple{Nothing,Nothing},Tuple{Nothing,Nothing}}})(::Array{Float64,2}) at /Users/ludovicobessi/.julia/packages/Zygote/1GXzF/src/lib/lib.jl:182
[13] (::Zygote.var"#347#back#177"{Zygote.var"#175#176"{typeof(∂(evaluate)),Tuple{Tuple{Nothing,Nothing},Tuple{Nothing,Nothing}}}})(::Array{Float64,2}) at /Users/ludovicobessi/.julia/packages/ZygoteRules/6nssF/src/adjoint.jl:49
[14] evaluate at /Users/ludovicobessi/.julia/packages/PolyChaos/Dhyx4/src/evaluate.jl:118 [inlined]
[15] (::typeof(∂(evaluate)))(::Array{Float64,2}) at /Users/ludovicobessi/.julia/packages/Zygote/1GXzF/src/compiler/interface2.jl:0
[16] evaluate at /Users/ludovicobessi/.julia/packages/PolyChaos/Dhyx4/src/evaluate.jl:123 [inlined]
[17] (::typeof(∂(evaluate)))(::Array{Float64,2}) at /Users/ludovicobessi/.julia/packages/Zygote/1GXzF/src/compiler/interface2.jl:0
[18] #44 at ./REPL[114]:1 [inlined]
[19] (::typeof(∂(#44)))(::Float64) at /Users/ludovicobessi/.julia/packages/Zygote/1GXzF/src/compiler/interface2.jl:0
[20] (::Zygote.var"#37#38"{typeof(∂(#44))})(::Float64) at /Users/ludovicobessi/.julia/packages/Zygote/1GXzF/src/compiler/interface.jl:45
[21] gradient(::Function, ::Array{Float64,1}) at /Users/ludovicobessi/.julia/packages/Zygote/1GXzF/src/compiler/interface.jl:54
[22] top-level scope at REPL[114]:1
This will do the job:
julia> op1 = GaussOrthoPoly(3)
GaussOrthoPoly{Array{Float64,1},GaussMeasure,Quad{Float64,Array{Float64,1}}}(3, [0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 2.0, 3.0], GaussMeasure(PolyChaos.w_gaussian, (-Inf, Inf), true), Quad{Float64,Array{Float64,1}}("golubwelsch", 3, [-1.732050807568877, -2.6908000062985024e-16, 1.7320508075688774], [0.16666666666666677, 0.6666666666666663, 0.16666666666666657]))
julia> op2 = GaussOrthoPoly(2)
GaussOrthoPoly{Array{Float64,1},GaussMeasure,Quad{Float64,Array{Float64,1}}}(2, [0.0, 0.0, 0.0], [1.0, 1.0, 2.0], GaussMeasure(PolyChaos.w_gaussian, (-Inf, Inf), true), Quad{Float64,Array{Float64,1}}("golubwelsch", 2, [-0.9999999999999998, 0.9999999999999998], [0.4999999999999999, 0.4999999999999999]))
julia> op = [op1,op2]
2-element Array{GaussOrthoPoly{Array{Float64,1},GaussMeasure,Quad{Float64,Array{Float64,1}}},1}:
GaussOrthoPoly{Array{Float64,1},GaussMeasure,Quad{Float64,Array{Float64,1}}}(3, [0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 2.0, 3.0], GaussMeasure(PolyChaos.w_gaussian, (-Inf, Inf), true), Quad{Float64,Array{Float64,1}}("golubwelsch", 3, [-1.732050807568877, -2.6908000062985024e-16, 1.7320508075688774], [0.16666666666666677, 0.6666666666666663, 0.16666666666666657]))
GaussOrthoPoly{Array{Float64,1},GaussMeasure,Quad{Float64,Array{Float64,1}}}(2, [0.0, 0.0, 0.0], [1.0, 1.0, 2.0], GaussMeasure(PolyChaos.w_gaussian, (-Inf, Inf), true), Quad{Float64,Array{Float64,1}}("golubwelsch", 2, [-0.9999999999999998, 0.9999999999999998], [0.4999999999999999, 0.4999999999999999]))
julia> mop = MultiOrthoPoly(op,2)
MultiOrthoPoly{ProductMeasure,Quad{Float64,Array{Float64,1}},Array{GaussOrthoPoly{Array{Float64,1},GaussMeasure,Quad{Float64,Array{Float64,1}}},1}}(["GaussOrthoPoly{Array{Float64,1},GaussMeasure,Quad{Float64,Array{Float64,1}}}", "GaussOrthoPoly{Array{Float64,1},GaussMeasure,Quad{Float64,Array{Float64,1}}}"], 2, 6, [0 0; 1 0; … ; 1 1; 0 2], ProductMeasure(PolyChaos.var"#w#39"{Array{GaussOrthoPoly{Array{Float64,1},GaussMeasure,Quad{Float64,Array{Float64,1}}},1}}(GaussOrthoPoly{Array{Float64,1},GaussMeasure,Quad{Float64,Array{Float64,1}}}[GaussOrthoPoly{Array{Float64,1},GaussMeasure,Quad{Float64,Array{Float64,1}}}(3, [0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 2.0, 3.0], GaussMeasure(PolyChaos.w_gaussian, (-Inf, Inf), true), Quad{Float64,Array{Float64,1}}("golubwelsch", 3, [-1.732050807568877, -2.6908000062985024e-16, 1.7320508075688774], [0.16666666666666677, 0.6666666666666663, 0.16666666666666657])), GaussOrthoPoly{Array{Float64,1},GaussMeasure,Quad{Float64,Array{Float64,1}}}(2, [0.0, 0.0, 0.0], [1.0, 1.0, 2.0], GaussMeasure(PolyChaos.w_gaussian, (-Inf, Inf), true), Quad{Float64,Array{Float64,1}}("golubwelsch", 2, [-0.9999999999999998, 0.9999999999999998], [0.4999999999999999, 0.4999999999999999]))]), GaussMeasure[GaussMeasure(PolyChaos.w_gaussian, (-Inf, Inf), true), GaussMeasure(PolyChaos.w_gaussian, (-Inf, Inf), true)]), GaussOrthoPoly{Array{Float64,1},GaussMeasure,Quad{Float64,Array{Float64,1}}}[GaussOrthoPoly{Array{Float64,1},GaussMeasure,Quad{Float64,Array{Float64,1}}}(3, [0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 2.0, 3.0], GaussMeasure(PolyChaos.w_gaussian, (-Inf, Inf), true), Quad{Float64,Array{Float64,1}}("golubwelsch", 3, [-1.732050807568877, -2.6908000062985024e-16, 1.7320508075688774], [0.16666666666666677, 0.6666666666666663, 0.16666666666666657])), GaussOrthoPoly{Array{Float64,1},GaussMeasure,Quad{Float64,Array{Float64,1}}}(2, [0.0, 0.0, 0.0], [1.0, 1.0, 2.0], GaussMeasure(PolyChaos.w_gaussian, (-Inf, Inf), true), Quad{Float64,Array{Float64,1}}("golubwelsch", 2, [-0.9999999999999998, 0.9999999999999998], [0.4999999999999999, 0.4999999999999999]))])
julia> gradient(x -> first(evaluate([1, 2], x, mop)), [2, 3])
([8.0, 12.0],)
Do you see the hackiness in the solution? The problem is that x -> evaluate([1, 2], x, mop)
returns a 1-element Array{Float64,1}
, hence the first
.
In general, your problem appears to be that you call the overloaded function evaluate(x, mop)
, which evaluates all basis polynomials at x
, hence not returning a scalar; but Zygote's gradient
needs a scalar-valued output.
Oh wow, that was dumb. Long day, sorry! By the way, really cool library! Enjoy your paternal leave :)
Happy to help! :)
..and happy to see PolyChaos being used actively!
I'll definitely checkout Surrogates.jl
!
Hello everyone,
I am using PolyChaos.jl to build a new Surrogates at Surrogates.jl. Every surrogate is Automatic-differentiable at the moment, so I want to my PolynomialChaosSurrogate to be differentiable as well. However, I think evaluate() is not automatic differentiable, see this example:
PolyChaos.evaluate(collect(val), pcND.ortopolys)
Where val is a tuple like:
val = (1.0,2.0,3.0)
I get the following mistake: