dojo-sim / Dojo.jl

A differentiable physics engine for robotics
MIT License

Both ant_ars and halfcheetah_ars broken #29

Closed GlenHenshaw closed 1 year ago

GlenHenshaw commented 2 years ago

Julia 1.7.2 running on macOS 12.3, Apple M1 architecture. After training completes, both examples throw the same error:

    episode 98 reward_evaluation -0.7817167591651243. Took 36.599132333 seconds
    episode 99 reward_evaluation -13.705568624874973. Took 36.127634708 seconds
    episode 100 reward_evaluation -10.772545859258324. Took 38.790820875 seconds
    rewards = [44.0188651955438, 84.07397185902886, 58.151621089522905, 74.00135312248605, 61.31391504999342]
    mean(train_time_best) = 2776.0249286366666
    std(train_time_best) = 178.33750793333417
    mean(rewards) = 64.31194526331501
    std(rewards) = 15.355529928613961
    WARNING: both Dojo and Base export "close"; uses of it in module Main must be qualified
    ERROR: LoadError: UndefVarError: close not defined
    Stacktrace:
     [1] display_policy(env::Environment{Dojo.Ant, Float64, Mechanism{Float64, 35, 13, 13, 9}, BoxSpace{Float64, 8}, BoxSpace{Float64, 37}, Nothing}, policy::Policy{Float64}, normalizer::Normalizer{Float64}, hp::HyperParameters{Float64}; rendering::Bool)
       @ Main ~/.julia/packages/Dojo/frGnC/examples/reinforcement_learning/algorithms/ars.jl:189
     [2] display_policy(env::Environment{Dojo.Ant, Float64, Mechanism{Float64, 35, 13, 13, 9}, BoxSpace{Float64, 8}, BoxSpace{Float64, 37}, Nothing}, policy::Policy{Float64}, normalizer::Normalizer{Float64}, hp::HyperParameters{Float64})
       @ Main ~/.julia/packages/Dojo/frGnC/examples/reinforcement_learning/algorithms/ars.jl:173
     [3] top-level scope
       @ ~/.julia/packages/Dojo/frGnC/examples/reinforcement_learning/ant_ars.jl:124
     [4] include(fname::String)
       @ Base.MainInclude ./client.jl:451
     [5] top-level scope
       @ REPL[3]:1
    in expression starting at /Users/glenhenshaw/.julia/packages/Dojo/frGnC/examples/reinforcement_learning/ant_ars.jl:124

thowell commented 2 years ago

I made a change to use Base.close. This should fix the issue you are seeing.
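
For anyone hitting the same warning in their own scripts, here is a minimal sketch of the name clash and the qualified-call workaround. `FakeDojo` and `Viewer` are hypothetical stand-ins invented for this illustration; they are not part of Dojo, and this is not the actual change made to ars.jl.

```julia
# Hypothetical stand-in for a package that exports its own `close`,
# which is what the warning in the logs says Dojo does.
module FakeDojo
export close

struct Viewer            # hypothetical type, not part of Dojo
    name::String
end

# Defines a new function FakeDojo.close; it does not extend Base.close.
close(v::Viewer) = "closed viewer $(v.name)"
end

using .FakeDojo

io = IOBuffer()
write(io, "rollout")

# An unqualified `close(io)` at this point is ambiguous between Base and
# FakeDojo. On the Julia 1.7 setup in this report, that ambiguity surfaces
# as the warning and UndefVarError shown in the logs.

Base.close(io)                        # explicitly Base's function
msg = FakeDojo.close(Viewer("ant"))   # explicitly the package's function

println(isopen(io))
println(msg)
```

Qualifying the call at each use site avoids depending on which `close` ends up visible in `Main`.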

GlenHenshaw commented 2 years ago

I just pulled Dojo#main and the problem is still there. Running halfcheetah_ars.jl:

    episode 28 reward_evaluation 5.2598850727605795. Took 32.763520375 seconds
    episode 29 reward_evaluation 6.304627245502632. Took 31.461139625 seconds
    episode 30 reward_evaluation 4.928108782533084. Took 31.937667875 seconds
    rewards = [8.6721774823914, 37.47027757269155, 70.57463206704156, 48.87879789080837, 63.227295557744455]
    mean(train_time_best) = 173.570155853
    std(train_time_best) = 7.095665020032407
    mean(rewards) = 45.76463611413546
    std(rewards) = 24.366089389557583
    WARNING: both Dojo and Base export "close"; uses of it in module Main must be qualified
    ERROR: LoadError: UndefVarError: close not defined
    Stacktrace:
     [1] display_policy(env::Environment{Dojo.HalfCheetah, Float64, Mechanism{Float64, 23, 7, 7, 9}, BoxSpace{Float64, 6}, BoxSpace{Float64, 18}, Nothing}, policy::Policy{Float64}, normalizer::Normalizer{Float64}, hp::HyperParameters{Float64}; rendering::Bool)
       @ Main ~/.julia/packages/Dojo/6iIp0/examples/reinforcement_learning/algorithms/ars.jl:189
     [2] display_policy(env::Environment{Dojo.HalfCheetah, Float64, Mechanism{Float64, 23, 7, 7, 9}, BoxSpace{Float64, 6}, BoxSpace{Float64, 18}, Nothing}, policy::Policy{Float64}, normalizer::Normalizer{Float64}, hp::HyperParameters{Float64})
       @ Main ~/.julia/packages/Dojo/6iIp0/examples/reinforcement_learning/algorithms/ars.jl:173
     [3] top-level scope
       @ ~/.julia/packages/Dojo/6iIp0/examples/reinforcement_learning/halfcheetah_ars.jl:84
     [4] include(fname::String)
       @ Base.MainInclude ./client.jl:451
     [5] top-level scope
       @ REPL[4]:1
    in expression starting at /Users/glenhenshaw/.julia/packages/Dojo/6iIp0/examples/reinforcement_learning/halfcheetah_ars.jl:84

janbruedigam commented 1 year ago

Ant ARS should work again, although the hyperparameters might need some tuning for it to walk properly. The halfcheetah example has been removed, but the mechanism still exists, so people could create the example themselves.