Open staticfloat opened 5 years ago
`@nograd` is not overloading the kwargs version of the function (which has a different type, so it has to be done explicitly). It's easy to fix that, or until then you could just use `@adjoint ... _ -> nothing`.
I tried adding this too:
@adjoint NNlib.DenseConvDims(args...; kwargs...) = NNlib.DenseConvDims(args...; kwargs...), nothing
It doesn't seem to help. Can you give me a concrete example of what you mean by overloading the kwargs version manually?
That looks right to me, except that the `nothing` should still be a function, i.e. `_ -> nothing`. Although if that's not getting called and throwing an error, something else has gone wrong.
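For reference, here is a minimal sketch of the pattern being described: an `@adjoint` that covers the kwargs method and returns a pullback function rather than a bare `nothing`. It assumes a reasonably recent Zygote. `PadDims` and `scale` are hypothetical stand-ins so the snippet runs without NNlib; they are not part of NNlib or Zygote.

```julia
using Zygote

# Hypothetical stand-in for NNlib.DenseConvDims: a small config object
# that is built from an array plus keyword arguments.
struct PadDims
    pad::Int
end
PadDims(x::AbstractArray; pad::Int = 0) = PadDims(pad)

# Cover both the positional and the kwargs call. The second return value
# must be a function of the incoming gradient; returning `nothing` from it
# tells Zygote that none of the inputs receive a gradient.
Zygote.@adjoint PadDims(args...; kwargs...) =
    PadDims(args...; kwargs...), _ -> nothing

# Hypothetical function that uses the config object alongside the data.
scale(x, d::PadDims) = sum(x) * (1 + d.pad)

# The gradient flows through `x` only; the constructor is skipped.
g = gradient(x -> scale(x, PadDims(x; pad = 2)), [1.0, 2.0])[1]
```

The same shape should apply to the real constructor by swapping `PadDims` for `NNlib.DenseConvDims`, as in the line quoted above with `nothing` replaced by `_ -> nothing`.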
Alright, well I changed it to include the function as a second parameter, and now I'm crashing Julia. :)
[ Info: Loading MTG code library...
[ Info: Loading training set...
[ Info: Constructing network...
[ Info: Training network...
Unreachable reached at 0x7fdf32fab245
signal (4): Illegal instruction
in expression starting at /home/sabae/dsrc/Gatherer.jl/bin/train_autoencoder.jl:27
timeit at /home/sabae/.julia/packages/TimerOutputs/7zSea/src/TimerOutput.jl:256 [inlined]
_forward at /home/sabae/.julia/dev/Zygote/src/compiler/interface2.jl:0
jl_fptr_trampoline at /buildworker/worker/package_linux64/build/src/gf.c:1864
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2219
_forward at /home/sabae/.julia/packages/TimerOutputs/7zSea/src/TimerOutput.jl:179
∇conv_data at /home/sabae/.julia/packages/TimerOutputs/7zSea/src/TimerOutput.jl:198 [inlined]
_forward at /home/sabae/.julia/dev/Zygote/src/compiler/interface2.jl:0
unknown function (ip: 0x7fdf32fa4cf2)
jl_fptr_trampoline at /buildworker/worker/package_linux64/build/src/gf.c:1864
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2219
ConvTranspose at /home/sabae/.julia/packages/Flux/4HnAR/src/layers/conv.jl:119 [inlined]
_forward at /home/sabae/.julia/dev/Zygote/src/compiler/interface2.jl:0
applychain at /home/sabae/.julia/packages/Flux/4HnAR/src/layers/basic.jl:31 [inlined]
_forward at /home/sabae/.julia/dev/Zygote/src/compiler/interface2.jl:0
Chain at /home/sabae/.julia/packages/Flux/4HnAR/src/layers/basic.jl:33 [inlined]
_forward at /home/sabae/.julia/dev/Zygote/src/compiler/interface2.jl:0
applychain at /home/sabae/.julia/packages/Flux/4HnAR/src/layers/basic.jl:31 [inlined]
_forward at /home/sabae/.julia/dev/Zygote/src/compiler/interface2.jl:0
jl_fptr_trampoline at /buildworker/worker/package_linux64/build/src/gf.c:1864
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2219
applychain at /home/sabae/.julia/packages/Flux/4HnAR/src/layers/basic.jl:31 [inlined]
_forward at /home/sabae/.julia/dev/Zygote/src/compiler/interface2.jl:0
Chain at /home/sabae/.julia/packages/Flux/4HnAR/src/layers/basic.jl:33 [inlined]
_forward at /home/sabae/.julia/dev/Zygote/src/compiler/interface2.jl:0
#15 at /home/sabae/dsrc/Gatherer.jl/src/training.jl:83 [inlined]
_forward at /home/sabae/.julia/dev/Zygote/src/compiler/interface2.jl:0
_forward at /home/sabae/.julia/dev/Zygote/src/compiler/interface.jl:31 [inlined]
forward at /home/sabae/.julia/dev/Zygote/src/compiler/interface.jl:37 [inlined]
macro expansion at /home/sabae/dsrc/Gatherer.jl/src/training.jl:83 [inlined]
macro expansion at /home/sabae/.julia/packages/TimerOutputs/7zSea/src/TimerOutput.jl:216 [inlined]
#train_autoencoder#14 at /home/sabae/dsrc/Gatherer.jl/src/training.jl:80
unknown function (ip: 0x7fdf32f4046e)
jl_fptr_trampoline at /buildworker/worker/package_linux64/build/src/gf.c:1864
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2219
#train_autoencoder at ./none:0
jl_fptr_trampoline at /buildworker/worker/package_linux64/build/src/gf.c:1864
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2219
do_call at /buildworker/worker/package_linux64/build/src/interpreter.c:323
eval_value at /buildworker/worker/package_linux64/build/src/interpreter.c:411
eval_stmt_value at /buildworker/worker/package_linux64/build/src/interpreter.c:362 [inlined]
eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:773
jl_interpret_toplevel_thunk_callback at /buildworker/worker/package_linux64/build/src/interpreter.c:885
unknown function (ip: 0xfffffffffffffffe)
unknown function (ip: 0x7fdf38270e6f)
unknown function (ip: 0x5)
jl_interpret_toplevel_thunk at /buildworker/worker/package_linux64/build/src/interpreter.c:894
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:764
jl_parse_eval_all at /buildworker/worker/package_linux64/build/src/ast.c:883
jl_load at /buildworker/worker/package_linux64/build/src/toplevel.c:826
include at ./boot.jl:326 [inlined]
include_relative at ./loading.jl:1038
include at ./sysimg.jl:29
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2219
exec_options at ./client.jl:267
_start at ./client.jl:436
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2219
unknown function (ip: 0x40191d)
unknown function (ip: 0x401523)
__libc_start_main at /build/glibc-Cl5G7W/glibc-2.23/csu/../csu/libc-start.c:291
unknown function (ip: 0x4015c4)
Allocations: 142414401 (Pool: 142390894; Big: 23507); GC: 1926
Illegal instruction (core dumped)
julia --project=.. --color=yes -i train_autoencoder.jl 3stack24chan --channels=24 took 00:03:06 [132]
Just so that you don't waste time on this: I'm pretty sure this is because I haven't hooked up gradients for `∇conv_data()` yet; I'm trying that now.
While trying to hook Zygote up with my NNlib changes, I’m getting the following error:
This is because I have added these constructor calls to `DenseConvDims()`, but I don’t really want Zygote to do anything with those guys. I want it to ignore them, because these methods don’t touch their input arguments, and are only used to modify what gets called in the forward/backward pass, so I don’t think we actually need to differentiate the `DenseConvDims()` constructor itself.
I tried adding `@nograd NNlib.DenseConvDims` within Zygote, but that doesn’t seem to have helped. Does anyone else have any ideas as to what I can do to stop Zygote from looking at `DenseConvDims()` at all?