sdobber / FluxArchitectures.jl

Complex neural network examples for Flux.jl
MIT License

How to run DARNN on GPU #12

Closed · jookty closed this issue 3 years ago

jookty commented 3 years ago

I have tried to run the DARNN model on the GPU, but an error occurred. I modified some of the code as follows:

```julia
encoder_lstm = Seq(HiddenRecur(Flux.LSTMCell(inp, encodersize) |> gpu))
decoder_lstm = Seq(HiddenRecur(Flux.LSTMCell(1, decodersize) |> gpu))

#= @inbounds =# for t in 1:m.poollength  # for gpu

input = input |> gpu
target = target |> gpu
model = model |> gpu
```

The error message is:

```
┌ Warning: haskey(::TargetIterator, name::String) is deprecated, use Target(; name = name) !== nothing instead.
│   caller = llvm_compat(::VersionNumber) at compatibility.jl:176
└ @ CUDAnative ~/.julia/packages/CUDAnative/ierw8/src/compatibility.jl:176
[ Info: CUDA is on
[ Info: device = gpu
[ Info: Training
[ Info: epochs = 1
ERROR: LoadError: MethodError: no method matching (::var"#loss#30")(::CuArrays.CuArray{Float32,4,Nothing}, ::CuArrays.CuArray{Float32,1,Nothing})
Stacktrace:
 [1] macro expansion at /home/jee/.julia/packages/Zygote/1GXzF/src/compiler/interface2.jl:0 [inlined]
 [2] _pullback(::Zygote.Context, ::var"#loss#30", ::CuArrays.CuArray{Float32,4,Nothing}, ::CuArrays.CuArray{Float32,1,Nothing}) at /home/jee/.julia/packages/Zygote/1GXzF/src/compiler/interface2.jl:13
 [3] loss at /home/jee/Projects/Julia/learn/FluxArch/DARNN/e2.jl:82 [inlined]
 [4] _pullback(::Zygote.Context, ::var"#loss#30") at /home/jee/.julia/packages/Zygote/1GXzF/src/compiler/interface2.jl:0
 [5] pullback(::Function, ::Zygote.Params) at /home/jee/.julia/packages/Zygote/1GXzF/src/compiler/interface.jl:172
 [6] gradient(::Function, ::Zygote.Params) at /home/jee/.julia/packages/Zygote/1GXzF/src/compiler/interface.jl:53
 [7] train2() at /home/jee/Projects/Julia/learn/FluxArch/DARNN/e2.jl:85
 [8] top-level scope at /home/jee/Projects/Julia/learn/FluxArch/DARNN/e2.jl:92
 [9] include(::Module, ::String) at ./Base.jl:377
 [10] exec_options(::Base.JLOptions) at ./client.jl:288
 [11] _start() at ./client.jl:484
```
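A MethodError of this form usually means the loss closure's argument types only cover CPU arrays, so no method applies once the inputs have been moved with `|> gpu`. A minimal sketch of that pattern (hypothetical `model` and `loss`, not the actual code from e2.jl), assuming this is what happens here:

```julia
using Flux, CuArrays   # CuArrays.jl-era stack, matching the stack trace above

model = Dense(10, 1) |> gpu
x = rand(Float32, 10, 4) |> gpu
y = rand(Float32, 1, 4) |> gpu

# A loss restricted to plain CPU arrays has no method for CuArrays:
loss_cpu(a::Array, b::Array) = Flux.mse(model(a), b)
# loss_cpu(x, y)  # MethodError: no method matching loss_cpu(::CuArray, ::CuArray)

# Widening (or simply dropping) the annotations lets the same closure
# accept both CPU and GPU arrays:
loss(a, b) = Flux.mse(model(a), b)
loss(x, y)
```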

Would you help me fix the model?

sdobber commented 3 years ago

I would very much like to see these models run on a GPU, but unfortunately I don't have access to a suitable one currently, so I cannot do any development in that direction myself.

If I remember correctly, Flux's RNNs and their CUDA implementations are a weak point in the Flux ecosystem - for example, you cannot change the activation function in a GRU to something other than tanh, as this is the only combination CUDA supports (which is probably why LSTnet would not work on a GPU with its relu GRU cell). Also, I would expect the `Seq(...)` part in the DARNN to give problems, as I am not sure if its heavy use of `Zygote.Buffer` is actually compatible with calculations on the GPU. I also cannot guarantee that all model parameters get transferred to the GPU correctly when using `|> gpu` with the nested structs of the models - I guess this would be my starting point for an investigation.
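For anyone picking up that starting point, a minimal sketch of such a check (a hypothetical helper, not part of the package; assumes the CuArrays.jl stack from the trace above): move the model with `|> gpu` and then verify that every trainable array actually became a `CuArray`.

```julia
using Flux, CuArrays

# Hypothetical check: report any trainable parameter that was left on the CPU
# after `|> gpu`. Nested custom structs need Flux's functor machinery
# (e.g. `Flux.@functor MyStruct`) for `gpu` and `params` to reach their fields.
function check_gpu_params(model)
    model_gpu = model |> gpu
    for p in Flux.params(model_gpu)
        p isa CuArray || @warn "Parameter stayed on the CPU" typeof(p) size(p)
    end
    return model_gpu
end
```

Running the DARNN model through a check like this would at least narrow down whether the nested `Seq`/`HiddenRecur` wrappers are the part that `|> gpu` fails to reach.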

jookty commented 3 years ago

Thank you for your kind reply.