baggepinnen / FluxOptTools.jl

Use Optim to train Flux models and visualize loss landscapes

on hybridizing Zygote and Optim #3

Open · vavrines opened this issue 4 years ago

vavrines commented 4 years ago

Hi Bagge,

I tried to use Zygote.jl with BFGS in Optim.jl. The simplified code reads:

push!(LOAD_PATH, "./FluxOptTools.jl/src/") # from FluxOptTools.jl on GitHub
using Flux, FluxOptTools, Zygote, Optim, Random

m = Chain(Dense(1, 20, tanh), Dense(20, 1))
X = (range(0., 1., length=100) |> collect)'

u(x) = m(x)                                        # network output
ux(x) = Zygote.pullback(u, x)[2](ones(size(x)))[1] # du/dx via a reverse-mode pullback

#resi(x) = u(x) # residual on the network output itself: works
resi(x) = ux(x) # residual on the network's derivative: doesn't work
loss() = sum(abs2, resi(X))

Zygote.refresh()
ps = Flux.params(m)

lossfun, gradfun, fg!, p0 = FluxOptTools.optfuns(loss, ps)
res = Optim.optimize(Optim.only_fg!(fg!), 
                     p0,
                     BFGS(),
                     Optim.Options(iterations=500, store_trace=true))

The optimization runs fine when the loss uses the network output itself, but fails when it uses the network's derivative. The errors reported are as follows:

Internal error: encountered unexpected error in runtime:
BoundsError(a=Array{Any, (4,)}[
  Core.Compiler.VarState(typ=Zygote.Pullback{Tuple{typeof(Base.get), Base.Dict{GlobalRef, Any}, GlobalRef, Nothing}, Any}, undef=false),
  Core.Compiler.VarState(typ=Core.Compiler.Const(val=nothing, actual=false), undef=false),
  Core.Compiler.VarState(typ=Core.Compiler.Const(val=nothing, actual=false), undef=false),
  Core.Compiler.VarState(typ=Core.Compiler.Const(val=nothing, actual=false), undef=false)], i=(5,))
rec_backtrace at /buildworker/worker/package_linux64/build/src/stackwalk.c:94
record_backtrace at /buildworker/worker/package_linux64/build/src/task.c:219 [inlined]
jl_throw at /buildworker/worker/package_linux64/build/src/task.c:429
jl_bounds_error_ints at /buildworker/worker/package_linux64/build/src/rtutils.c:183
setindex! at ./essentials.jl:455 [inlined]
stupdate! at ./compiler/typelattice.jl:243
typeinf_local at ./compiler/abstractinterpretation.jl:1203
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1230
typeinf at ./compiler/typeinfer.jl:12
typeinf_ext at ./compiler/typeinfer.jl:568
typeinf_ext at ./compiler/typeinfer.jl:599
jfptr_typeinf_ext_1.clone_1 at /home/tianbai/Software/julia-1.2.0/lib/julia/sys.so (unknown line)
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2191
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1614 [inlined]
jl_type_infer at /buildworker/worker/package_linux64/build/src/gf.c:207
jl_compile_method_internal at /buildworker/worker/package_linux64/build/src/gf.c:1773
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2196
accum_global at /home/tianbai/.julia/packages/Zygote/8dVxG/src/lib/lib.jl:54 [inlined]
Pullback at /home/tianbai/.julia/packages/Zygote/8dVxG/src/compiler/interface2.jl:0
#103 at /home/tianbai/.julia/packages/Zygote/8dVxG/src/lib/lib.jl:67 [inlined]
Pullback at /home/tianbai/.julia/packages/Zygote/8dVxG/src/compiler/interface2.jl:0
#174#back at /home/tianbai/.julia/packages/ZygoteRules/6nssF/src/adjoint.jl:49 [inlined]
Pullback at /home/tianbai/.julia/packages/Zygote/8dVxG/src/compiler/interface2.jl:0
gradtuple1 at /home/tianbai/.julia/packages/ZygoteRules/6nssF/src/adjoint.jl:12 [inlined]
#186#back at /home/tianbai/.julia/packages/ZygoteRules/6nssF/src/adjoint.jl:49 [inlined]
Pullback at /home/tianbai/.julia/packages/Zygote/8dVxG/src/compiler/interface2.jl:0
#28 at /home/tianbai/.julia/packages/Zygote/8dVxG/src/compiler/interface.jl:38 [inlined]
Pullback at /home/tianbai/.julia/packages/Zygote/8dVxG/src/compiler/interface2.jl:0
ux at ./In[3]:8 [inlined]
Pullback at /home/tianbai/.julia/packages/Zygote/8dVxG/src/compiler/interface2.jl:0
loss at ./In[3]:11 [inlined]
Pullback at /home/tianbai/.julia/packages/Zygote/8dVxG/src/compiler/interface2.jl:0
#38 at /home/tianbai/.julia/packages/Zygote/8dVxG/src/compiler/interface.jl:101
unknown function (ip: 0x7fe0d35e70b7)
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2197
gradient at /home/tianbai/.julia/packages/Zygote/8dVxG/src/compiler/interface.jl:47
optfuns at /home/tianbai/OneDrive/Programming/Neural network/julia/FluxOptTools.jl/src/FluxOptTools.jl:62
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2197
do_call at /buildworker/worker/package_linux64/build/src/interpreter.c:323
eval_value at /buildworker/worker/package_linux64/build/src/interpreter.c:411
eval_stmt_value at /buildworker/worker/package_linux64/build/src/interpreter.c:362 [inlined]
eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:772
jl_interpret_toplevel_thunk_callback at /buildworker/worker/package_linux64/build/src/interpreter.c:884
Interpreter frame (ip: 1)
Core.CodeInfo(code=Array{Any, (18,)}[
  Expr(:call, Base.getproperty, :FluxOptTools, :(:optfuns)),
  Expr(:call, SSAValue(1), :loss, :ps),
  Expr(:call, Base.indexed_iterate, SSAValue(2), 1),
  Expr(:call, Core.getfield, SSAValue(3), 1),
  :lossfun = SSAValue(4),
  Core.SlotNumber(id=1) = Expr(:call, Core.getfield, SSAValue(3), 2),
  Expr(:call, Base.indexed_iterate, SSAValue(2), 2, Core.SlotNumber(id=1)),
  Expr(:call, Core.getfield, SSAValue(7), 1),
  :gradfun = SSAValue(8),
  Core.SlotNumber(id=1) = Expr(:call, Core.getfield, SSAValue(7), 2),
  Expr(:call, Base.indexed_iterate, SSAValue(2), 3, Core.SlotNumber(id=1)),
  Expr(:call, Core.getfield, SSAValue(11), 1),
  :fg! = SSAValue(12),
  Core.SlotNumber(id=1) = Expr(:call, Core.getfield, SSAValue(11), 2),
  Expr(:call, Base.indexed_iterate, SSAValue(2), 4, Core.SlotNumber(id=1)),
  Expr(:call, Core.getfield, SSAValue(15), 1),
  :p0 = SSAValue(16),
  Expr(:return, SSAValue(2))], codelocs=Array{Int32, (18,)}[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], ssavaluetypes=18, ssaflags=Array{UInt8, (0,)}[], method_for_inference_limit_heuristics=nothing, linetable=Array{Any, (1,)}[Core.LineInfoNode(method=Symbol("top-level scope"), file=Symbol("In[3]"), line=16, inlined_at=0)], slotnames=Array{Symbol, (1,)}[Symbol("#s48")], slotflags=Array{UInt8, (1,)}[0x08], slottypes=nothing, rettype=Any, parent=nothing, min_world=1, max_world=-1, inferred=false, inlineable=false, propagate_inbounds=false, pure=false)
jl_interpret_toplevel_thunk at /buildworker/worker/package_linux64/build/src/interpreter.c:893
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:815
jl_toplevel_eval_in at /buildworker/worker/package_linux64/build/src/toplevel.c:844
eval at ./boot.jl:330 [inlined]
softscope_include_string at /home/tianbai/.julia/packages/SoftGlobalScope/cSbw5/src/SoftGlobalScope.jl:218
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2191
execute_request at /home/tianbai/.julia/packages/IJulia/F1GUo/src/execute_request.jl:67
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2197
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1614 [inlined]
jl_f__apply at /buildworker/worker/package_linux64/build/src/builtins.c:563
jl_f__apply_latest at /buildworker/worker/package_linux64/build/src/builtins.c:601
#invokelatest#1 at ./essentials.jl:790 [inlined]
invokelatest at ./essentials.jl:789 [inlined]
eventloop at /home/tianbai/.julia/packages/IJulia/F1GUo/src/eventloop.jl:8
#15 at ./task.jl:268
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2197
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1614 [inlined]
start_task at /buildworker/worker/package_linux64/build/src/task.c:596
unknown function (ip: 0xffffffffffffffff)

MethodError: no method matching length(::Zygote.Context)
Closest candidates are:
  length(!Matched::Core.SimpleVector) at essentials.jl:597
  length(!Matched::Base.MethodList) at reflection.jl:819
  length(!Matched::Core.MethodTable) at reflection.jl:893
  ...

Stacktrace:
 [1] (::getfield(FluxOptTools, Symbol("##1#2")))(::Pair{Any,Any}) at ./none:0
 [2] iterate at ./generator.jl:47 [inlined]
 [3] mapfoldl_impl(::Function, ::Function, ::NamedTuple{(),Tuple{}}, ::Base.Generator{IdDict{Any,Any},getfield(FluxOptTools, Symbol("##1#2"))}) at ./reduce.jl:55
 [4] #mapfoldl#188 at ./reduce.jl:72 [inlined]
 [5] mapfoldl at ./reduce.jl:72 [inlined]
 [6] #mapreduce#192 at ./reduce.jl:208 [inlined]
 [7] mapreduce at ./reduce.jl:208 [inlined]
 [8] sum at ./reduce.jl:403 [inlined]
 [9] sum at ./reduce.jl:420 [inlined]
 [10] veclength at /home/tianbai/OneDrive/Programming/Neural network/julia/FluxOptTools.jl/src/FluxOptTools.jl:8 [inlined]
 [11] copyto!(::Array{Float64,1}, ::Zygote.Grads) at /home/tianbai/OneDrive/Programming/Neural network/julia/FluxOptTools.jl/src/FluxOptTools.jl:17
 [12] (::getfield(FluxOptTools, Symbol("##5#8")){typeof(loss),Params})(::Float64, ::Array{Float64,1}, ::Array{Float64,1}) at /home/tianbai/OneDrive/Programming/Neural network/julia/FluxOptTools.jl/src/FluxOptTools.jl:78
 [13] (::getfield(NLSolversBase, Symbol("##61#62")){NLSolversBase.InplaceObjective{Nothing,getfield(FluxOptTools, Symbol("##5#8")){typeof(loss),Params},Nothing,Nothing,Nothing},Float64})(::Array{Float64,1}, ::Array{Float64,1}) at /home/tianbai/.julia/packages/NLSolversBase/NsXIC/src/objective_types/incomplete.jl:45
 [14] value_gradient!!(::OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1}}, ::Array{Float64,1}) at /home/tianbai/.julia/packages/NLSolversBase/NsXIC/src/interface.jl:82
 [15] initial_state(::BFGS{LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Nothing,Nothing,Flat}, ::Optim.Options{Float64,Nothing}, ::OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1}}, ::Array{Float64,1}) at /home/tianbai/.julia/packages/Optim/EhyUl/src/multivariate/solvers/first_order/bfgs.jl:66
 [16] optimize at /home/tianbai/.julia/packages/Optim/EhyUl/src/multivariate/optimize/optimize.jl:33 [inlined]
 [17] #optimize#93(::Bool, ::Symbol, ::typeof(optimize), ::NLSolversBase.InplaceObjective{Nothing,getfield(FluxOptTools, Symbol("##5#8")){typeof(loss),Params},Nothing,Nothing,Nothing}, ::Array{Float64,1}, ::BFGS{LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Nothing,Nothing,Flat}, ::Optim.Options{Float64,Nothing}) at /home/tianbai/.julia/packages/Optim/EhyUl/src/multivariate/optimize/interface.jl:116
 [18] optimize(::NLSolversBase.InplaceObjective{Nothing,getfield(FluxOptTools, Symbol("##5#8")){typeof(loss),Params},Nothing,Nothing,Nothing}, ::Array{Float64,1}, ::BFGS{LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Nothing,Nothing,Flat}, ::Optim.Options{Float64,Nothing}) at /home/tianbai/.julia/packages/Optim/EhyUl/src/multivariate/optimize/interface.jl:115
 [19] top-level scope at In[3]:18

Do you have any advice on how to fix it? Thanks :)

baggepinnen commented 4 years ago

Hi!

I believe this might be related to https://github.com/FluxML/Zygote.jl/issues/385. I don't think the problem is in this package but rather in the AD itself. Zygote is still very immature, and anything beyond toy programs is likely to present challenges when differentiating, especially for second-order differentiation, where the adjoint definitions themselves also need to be differentiable.
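
A possible workaround along these lines, sketched below and not tested against the exact package versions in the report above, is to compute the inner derivative with ForwardDiff.jl instead of a Zygote pullback. The outer gradient over the parameters then only requires Zygote to differentiate plain dual-number arithmetic (reverse over forward) rather than its own pullback. The names xs and ux here just mirror the original script; everything else follows the same optfuns/only_fg! pattern from the report.

using Flux, FluxOptTools, Zygote, Optim, ForwardDiff

m  = Chain(Dense(1, 20, tanh), Dense(20, 1))
xs = range(0.0, 1.0, length=100)

# du/dx at a scalar x via forward-mode dual numbers;
# m([z])[1] turns the network into a scalar-to-scalar function
ux(x) = ForwardDiff.derivative(z -> m([z])[1], x)

# residual on the derivative, summed over the sample points
loss() = sum(abs2, map(ux, xs))

ps = Flux.params(m)
lossfun, gradfun, fg!, p0 = FluxOptTools.optfuns(loss, ps)
res = Optim.optimize(Optim.only_fg!(fg!),
                     p0,
                     BFGS(),
                     Optim.Options(iterations=500, store_trace=true))

If Zygote also fails to differentiate through the dual numbers, a central finite difference for the inner derivative, e.g. ux(x) = (m([x + h])[1] - m([x - h])[1]) / 2h for a small h, needs only first-order AD, at the cost of some truncation error.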