JuliaReinforcementLearning / GridWorlds.jl

Help! I'm lost in the flatland!

RLBase.reward(env) is not defined #130

Closed · almostintuitive closed this 3 years ago

almostintuitive commented 3 years ago

Hi!

I'm working on recreating some basic RL policies and trying to use the common interface for environments, but it looks like RLBase.reward is not implemented for EmptyRoom.

[screenshot: error showing that RLBase.reward(env) is not defined]

I can open a pull request, but wanted to ask if it was intentional or not!

Thanks!

findmyway commented 3 years ago

Hi @itchingpixels,

Thanks for reporting. Contributions are always welcome.

I guess it's caused by the fact that the default implementation RLBase.reward(env::AbstractEnv) = env.reward was removed some time ago. But it seems there's a reward(env) call in the test case. @Sid-Bhatia-0, could you confirm the CI tests are still passing?
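
For context, this is roughly what that removed fallback looked like, and what each environment now needs to define for itself (the MyEnv type below is just a hypothetical illustration):

import ReinforcementLearningBase: RLBase

# Hypothetical toy environment, only for illustration.
mutable struct MyEnv <: RLBase.AbstractEnv
    reward::Float64
end

# The old catch-all fallback was roughly:
#     RLBase.reward(env::RLBase.AbstractEnv) = env.reward
# With it gone, each environment defines the method explicitly:
RLBase.reward(env::MyEnv) = env.reward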

Sid-Bhatia-0 commented 3 years ago

Hello @itchingpixels

Thank you for reaching out. The screenshot that you attached seems to be from the ReinforcementLearningBase package, here.

From what I understand, the file CommonRLInterface.jl that you are referring to is there for illustrative purposes. The code in it is meant to explain how one can convert environments between the RLBase interface and the CommonRLInterface (from the CommonRLInterface.jl package). These are two different interfaces, and there isn't a universal conversion between them, at least not as of this writing. Thus, in general, the conversion will need to be customized for the environment you want to convert. For more information about the two interfaces, check out this discussion.
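
To give a feel for what such a customized conversion can look like, here is a rough sketch; the wrapper type and the choice of methods are assumptions for illustration, not the actual code in that file:

import ReinforcementLearningBase: RLBase
import CommonRLInterface
const CRL = CommonRLInterface

# Hypothetical wrapper exposing an RLBase environment through the
# CommonRLInterface API (reset!, actions, observe, act!, terminated).
struct RLBaseToCommon{E<:RLBase.AbstractEnv} <: CRL.AbstractEnv
    env::E
end

CRL.reset!(w::RLBaseToCommon) = RLBase.reset!(w.env)
CRL.actions(w::RLBaseToCommon) = RLBase.action_space(w.env)
CRL.observe(w::RLBaseToCommon) = RLBase.state(w.env)
CRL.terminated(w::RLBaseToCommon) = RLBase.is_terminated(w.env)

# In CommonRLInterface, act! steps the environment and returns the reward.
function CRL.act!(w::RLBaseToCommon, a)
    w.env(a)  # RLBase environments are callable on actions
    return RLBase.reward(w.env)
end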

GridWorlds.jl uses the RLBase interface. If you want to write algorithms against the CommonRLInterface, you will need to write your own conversion. Alternatively, you can check out ReinforcementLearningZoo.jl, which already has examples of algorithms written against the RLBase interface (no conversion needed, since GridWorlds.jl uses the RLBase interface). Hope this clarifies things 😄

Here is an end-to-end example usage of GridWorlds.jl. As you can see towards the end, RLBase.reward(env) works just fine and uses this definition.

sid projects $ mkdir demo-GridWorlds
sid projects $ cd demo-GridWorlds/
sid demo-GridWorlds $ julia --project=.
┌ Warning: no Manifest.toml file found, static paths used
└ @ Revise ~/.julia/packages/Revise/nWJXk/src/packagedef.jl:1337
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.5.0 (2020-08-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

(demo-GridWorlds) pkg> st
Status `~/acads/projects/demo-GridWorlds/Project.toml` (empty project)

(demo-GridWorlds) pkg> add GridWorlds
   Updating registry at `~/.julia/registries/General`
   Updating git-repo `https://github.com/JuliaRegistries/General`
  Resolving package versions...
  Installed OrderedCollections ─ v1.4.0
  Installed GridWorlds ───────── v0.3.1
Updating `~/acads/projects/demo-GridWorlds/Project.toml`
  [e15a9946] + GridWorlds v0.3.1
Updating `~/acads/projects/demo-GridWorlds/Manifest.toml`
  [1520ce14] + AbstractTrees v0.3.3
  [d842c3ba] + CommonRLInterface v0.3.1
  [34da2185] + Compat v3.25.0
  [a8cc5b0e] + Crayons v4.0.4
  [864edb3b] + DataStructures v0.18.9
  [e15a9946] + GridWorlds v0.3.1
  [1914dd2f] + MacroTools v0.5.6
  [bac558e1] + OrderedCollections v1.4.0
  [e575027e] + ReinforcementLearningBase v0.9.5
  [ae029012] + Requires v1.1.2
  [2a0f44e3] + Base64
  [ade2ca70] + Dates
  [8bb1440f] + DelimitedFiles
  [8ba89e20] + Distributed
  [b77e0a4c] + InteractiveUtils
  [76f85450] + LibGit2
  [8f399da3] + Libdl
  [37e2e46d] + LinearAlgebra
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [a63ad114] + Mmap
  [44cfe95a] + Pkg
  [de0858da] + Printf
  [3fa0cd96] + REPL
  [9a3f8284] + Random
  [ea8e919c] + SHA
  [9e88b42a] + Serialization
  [1a1011a3] + SharedArrays
  [6462fe0b] + Sockets
  [2f01184e] + SparseArrays
  [10745b16] + Statistics
  [8dfed614] + Test
  [cf7118a7] + UUIDs
  [4ec0a83e] + Unicode

(demo-GridWorlds) pkg> add ReinforcementLearningBase
  Resolving package versions...
Updating `~/acads/projects/demo-GridWorlds/Project.toml`
  [e575027e] + ReinforcementLearningBase v0.9.5
No Changes to `~/acads/projects/demo-GridWorlds/Manifest.toml`

julia> import GridWorlds
[ Info: Precompiling GridWorlds [e15a9946-cd7f-4d03-83e2-6c30bacb0043]

julia> import ReinforcementLearningBase

julia> env = GridWorlds.EmptyRoom()
Global view:
████████
█⋅⋅⋅⋅←⋅█
█⋅⋅⋅⋅⋅⋅█
█♥⋅⋅⋅⋅⋅█
█⋅⋅⋅⋅⋅⋅█
█⋅⋅⋅⋅⋅⋅█
█⋅⋅⋅⋅⋅⋅█
████████

Local view:
~█↓⋅⋅
~█⋅⋅⋅
~█⋅⋅⋅
~█⋅⋅⋅
~█⋅⋅♥

RLBase.reward(env): 0.0
RLBase.is_terminated(env): false

julia> import ReinforcementLearningBase: RLBase

julia> RLBase.reward(env)
0.0

julia> import Debugger

julia> Debugger.@enter RLBase.reward(env)
In reward(env, #unused#) at /home/sid/.julia/packages/GridWorlds/eq4Eg/src/abstract_grid_world.jl:82
>82  RLBase.reward(env::AbstractGridWorld, ::RLBase.DefaultPlayer) = env.reward

About to run: <(getproperty)(GridWorlds.EmptyRoom{Random._GLOBAL_RNG}(Bool[0 0 0 0 0 0 0 0; 1 1 1 1 1 1 1 1; 0 0 0 0...>
1|debug> 
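
And if you want to go one step further and watch the reward change, something along these lines should work. This is a sketch assuming the RLBase v0.9 conventions that environments are callable on actions and that the action space supports rand:

a = rand(RLBase.action_space(env))  # sample a random action
env(a)                              # step the environment
RLBase.reward(env)                  # reward after the transition
RLBase.is_terminated(env)           # check for episode termination
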
almostintuitive commented 3 years ago

Thanks a lot for the detailed response, I'll look into this today!