JuliaReinforcementLearning / ReinforcementLearning.jl

A reinforcement learning package for Julia
https://juliareinforcementlearning.org

"params not defined," "JuliaRL_BasicDQN_CartPole" #778

Closed RohanRajagopal closed 1 year ago

RohanRajagopal commented 1 year ago

Running the first experiment, "JuliaRL_BasicDQN_CartPole", on CPU:

using ReinforcementLearning  # exports the E`...` experiment macro
using Plots                  # needed for plot()

ex = E`JuliaRL_BasicDQN_CartPole`  # build the predefined experiment
run(ex)                            # train the agent
plot(ex.hook.rewards)              # total reward per episode

and got the following error:

BasicDQN <-> CartPole
  ≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡
Progress:   0%|█                                        |  ETA: 14:28:25
ERROR: UndefVarError: params not defined
Stacktrace:
  [1] update!(learner::WARNING: both Losses and NNlib export "ctc_loss"; uses of it in module Flux must be qualified
BasicDQNLearner{NeuralNetworkApproximator{Chain{Tuple{Dense{typeof(relu), Matrix{Float32}, Vector{Float32}}, Dense{typeof(relu), Matrix{Float32}, Vector{Float32}}, Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}}}, Adam}, typeof(huber_loss), StableRNGs.LehmerRNG}, batch::NamedTuple{(:state, :action, :reward, :terminal, :next_state), Tuple{Matrix{Float32}, Vector{Int64}, Vector{Float32}, Vector{Bool}, Matrix{Float32}}})
    @ ReinforcementLearningZoo C:\Users\\.julia\packages\ReinforcementLearningZoo\tvfq9\src\algorithms\dqns\basic_dqn.jl:78
  [2] update!(learner::BasicDQNLearner{NeuralNetworkApproximator{Chain{Tuple{Dense{typeof(relu), Matrix{Float32}, Vector{Float32}}, Dense{typeof(relu), Matrix{Float32}, Vector{Float32}}, Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}}}, Adam}, typeof(huber_loss), StableRNGs.LehmerRNG}, traj::CircularArraySARTTrajectory{NamedTuple{(:state, :action, :reward, :terminal), Tuple{CircularArrayBuffers.CircularArrayBuffer{Float32, 2, Matrix{Float32}}, CircularArrayBuffers.CircularVectorBuffer{Int64, Vector{Int64}}, CircularArrayBuffers.CircularVectorBuffer{Float32, Vector{Float32}}, CircularArrayBuffers.CircularVectorBuffer{Bool, Vector{Bool}}}}})
    @ ReinforcementLearningZoo C:\Users\\.julia\packages\ReinforcementLearningZoo\tvfq9\src\algorithms\dqns\basic_dqn.jl:65
  [3] update!
    @ C:\Users\\.julia\packages\ReinforcementLearningCore\yeRLW\src\policies\q_based_policies\learners\abstract_learner.jl:35 [inlined]
  [4] update!
    @ C:\Users\\.julia\packages\ReinforcementLearningCore\yeRLW\src\policies\q_based_policies\q_based_policy.jl:67 [inlined]
  [5] (::Agent{QBasedPolicy{BasicDQNLearner{NeuralNetworkApproximator{Chain{Tuple{Dense{typeof(relu), Matrix{Float32}, Vector{Float32}}, Dense{typeof(relu), Matrix{Float32}, Vector{Float32}}, Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}}}, Adam}, typeof(huber_loss), StableRNGs.LehmerRNG}, EpsilonGreedyExplorer{:exp, false, StableRNGs.LehmerRNG}}, CircularArraySARTTrajectory{NamedTuple{(:state, :action, :reward, :terminal), Tuple{CircularArrayBuffers.CircularArrayBuffer{Float32, 2, Matrix{Float32}}, CircularArrayBuffers.CircularVectorBuffer{Int64, Vector{Int64}}, CircularArrayBuffers.CircularVectorBuffer{Float32, Vector{Float32}}, CircularArrayBuffers.CircularVectorBuffer{Bool, Vector{Bool}}}}}})(stage::PreActStage, env::CartPoleEnv{Base.OneTo{Int64}, Float32, Int64, StableRNGs.LehmerRNG}, action::Int64)
    @ ReinforcementLearningCore C:\Users\\.julia\packages\ReinforcementLearningCore\yeRLW\src\policies\agents\agent.jl:78
  [6] _run(policy::Agent{QBasedPolicy{BasicDQNLearner{NeuralNetworkApproximator{Chain{Tuple{Dense{typeof(relu), Matrix{Float32}, Vector{Float32}}, Dense{typeof(relu), Matrix{Float32}, Vector{Float32}}, Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}}}, Adam}, typeof(huber_loss), StableRNGs.LehmerRNG}, EpsilonGreedyExplorer{:exp, false, StableRNGs.LehmerRNG}}, CircularArraySARTTrajectory{NamedTuple{(:state, :action, :reward, :terminal), Tuple{CircularArrayBuffers.CircularArrayBuffer{Float32, 2, Matrix{Float32}}, CircularArrayBuffers.CircularVectorBuffer{Int64, Vector{Int64}}, CircularArrayBuffers.CircularVectorBuffer{Float32, Vector{Float32}}, CircularArrayBuffers.CircularVectorBuffer{Bool, Vector{Bool}}}}}}, env::CartPoleEnv{Base.OneTo{Int64}, Float32, Int64, StableRNGs.LehmerRNG}, stop_condition::StopAfterStep{ProgressMeter.Progress}, hook::TotalRewardPerEpisode)
    @ ReinforcementLearningCore C:\Users\\.julia\packages\ReinforcementLearningCore\yeRLW\src\core\run.jl:29
  [7] run(policy::Agent{QBasedPolicy{BasicDQNLearner{NeuralNetworkApproximator{Chain{Tuple{Dense{typeof(relu), Matrix{Float32}, Vector{Float32}}, Dense{typeof(relu), Matrix{Float32}, Vector{Float32}}, Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}}}, Adam}, typeof(huber_loss), StableRNGs.LehmerRNG}, EpsilonGreedyExplorer{:exp, false, StableRNGs.LehmerRNG}}, CircularArraySARTTrajectory{NamedTuple{(:state, :action, :reward, :terminal), Tuple{CircularArrayBuffers.CircularArrayBuffer{Float32, 2, Matrix{Float32}}, CircularArrayBuffers.CircularVectorBuffer{Int64, Vector{Int64}}, CircularArrayBuffers.CircularVectorBuffer{Float32, Vector{Float32}}, CircularArrayBuffers.CircularVectorBuffer{Bool, Vector{Bool}}}}}}, env::CartPoleEnv{Base.OneTo{Int64}, Float32, Int64, StableRNGs.LehmerRNG}, stop_condition::StopAfterStep{ProgressMeter.Progress}, hook::TotalRewardPerEpisode)
    @ ReinforcementLearningCore C:\Users\\.julia\packages\ReinforcementLearningCore\yeRLW\src\core\run.jl:10
  [8] run(x::Experiment; describe::Bool)
    @ ReinforcementLearningCore C:\Users\\.julia\packages\ReinforcementLearningCore\yeRLW\src\core\experiment.jl:56
  [9] run(x::Experiment)
    @ ReinforcementLearningCore C:\Users\\.julia\packages\ReinforcementLearningCore\yeRLW\src\core\experiment.jl:54
 [10] top-level scope
    @ c:\Users\\Downloads\rl_julia.jl:52

Any help is deeply appreciated. Thanks very much.

kir0ul commented 1 year ago

It works for me with Julia 1.8.3 (and also with Julia 1.8.4) after installing ReinforcementLearningExperiments v0.1.4, which pulls in compatible dependency versions.
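For reference, that version can be installed directly from the Pkg REPL (a minimal sketch using standard Pkg commands; the version pin is the one reported above):

(@v1.8) pkg> add ReinforcementLearningExperiments@0.1.4

julia> using ReinforcementLearningExperiments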

RohanRajagopal commented 1 year ago

I tried deleting the environment and restarting Julia from scratch, and now it works. I had tried that before and it didn't help, so I probably missed something previously. I should have been more careful.

Apologies, and thanks very much.

willclarktech commented 1 year ago

I'm hitting the same error, and removing/reinstalling the environment doesn't fix it. It works fine if I use the ReinforcementLearningExperiments package (v0.1.4), but if I just use ReinforcementLearning directly and copy/paste the example, I get the error.

Edit: It looks like ctc_loss was moved from Flux to NNlib in Flux v0.13.5 and NNlib v0.8.9.

So pinning NNlib to v0.8.8 fixes the issue for now. I'm guessing that function just doesn't exist in Flux v0.12.
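For anyone else hitting this, the pin can be applied from the Pkg REPL (a minimal sketch using standard Pkg commands, with the version mentioned above):

(@v1.8) pkg> pin NNlib@0.8.8

Once a fixed release is out, the pin can be lifted with:

(@v1.8) pkg> free NNlib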

nickkeepfer commented 1 year ago

This error still persists; I don't believe it was fixed by the update to v0.10.2.

This is even after downgrading Flux to v0.13.4 (from v0.14.5) and CUDA to v3.13.1 (from v5.0.0), which are already substantial downgrades.

ERROR: LoadError: UndefVarError: `params` not defined
Stacktrace:
 [1] copyto!(dest::WARNING: both Losses and NNlib export "ctc_loss"; uses of it in module Flux must be qualified
NeuralNetworkApproximator{...}
(@v1.9) pkg> st
Status `~/.julia/environments/v1.9/Project.toml`
⌅ [052768ef] CUDA v3.13.1
⌅ [587475ba] Flux v0.13.4
  [158674fc] ReinforcementLearning v0.10.2
  [860ef19b] StableRNGs v1.0.0
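For anyone trying to reproduce this environment, a minimal sketch using standard Pkg version specifiers (versions exactly as in the status output above):

(@v1.9) pkg> add Flux@0.13.4 CUDA@3.13.1 ReinforcementLearning@0.10.2 StableRNGs@1.0.0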