Hi @itchingpixels,
Thanks for reporting. Contributions are always welcome.
I guess this is caused by the fact that the default implementation `RLBase.reward(env::AbstractEnv) = env.reward` was removed some time ago. But it seems there's a `reward(env)` call in the test case. @Sid-Bhatia-0 Could you confirm the CI tests are still passing?
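For context, with that generic fallback gone, each concrete environment has to provide its own `reward` method. A minimal sketch of what that looks like (`MyEnv` is a hypothetical toy environment, not a real GridWorlds type):

```julia
# Minimal sketch: defining reward for a concrete environment now that the
# generic fallback RLBase.reward(env::AbstractEnv) = env.reward is gone.
import ReinforcementLearningBase: RLBase

struct MyEnv <: RLBase.AbstractEnv  # hypothetical toy environment
    reward::Float64
end

# Without the removed fallback, each environment type must define this explicitly.
RLBase.reward(env::MyEnv) = env.reward
```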
Hello @itchingpixels
Thank you for reaching out. The screenshot you have attached seems to be from the ReinforcementLearningBase package.
From what I understand, the file `CommonRLInterface.jl` that you are referring to is for illustrative purposes. The code there is meant to explain how one can convert environments between the RLBase interface and the CommonRLInterface (from the CommonRLInterface.jl package). These are two different interfaces, and there isn't a universal conversion between them, at least not as of this writing. In general, the conversion will need to be customized for the environment you want to convert. For more information about the two interfaces, check out this discussion.
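To make the distinction concrete, here is a minimal sketch of what a hand-rolled conversion in one direction could look like. This is not an official wrapper; `RLBaseCommonWrapper` is a hypothetical name, and the sketch assumes the core CommonRLInterface functions (`reset!`, `actions`, `observe`, `act!`, `terminated`) and the usual RLBase accessors:

```julia
# Sketch: wrapping an RLBase environment so it satisfies CommonRLInterface.
# RLBaseCommonWrapper is a hypothetical illustration, not part of either package.
import ReinforcementLearningBase: RLBase
import CommonRLInterface
const CRL = CommonRLInterface

struct RLBaseCommonWrapper{E<:RLBase.AbstractEnv} <: CRL.AbstractEnv
    env::E
end

CRL.reset!(w::RLBaseCommonWrapper) = RLBase.reset!(w.env)
CRL.actions(w::RLBaseCommonWrapper) = RLBase.action_space(w.env)
CRL.observe(w::RLBaseCommonWrapper) = RLBase.state(w.env)
CRL.terminated(w::RLBaseCommonWrapper) = RLBase.is_terminated(w.env)

# In CommonRLInterface, act! applies the action and returns the reward.
function CRL.act!(w::RLBaseCommonWrapper, a)
    w.env(a)  # RLBase environments are callable on actions
    return RLBase.reward(w.env)
end
```

Even a wrapper like this only covers the basics; optional parts of either interface (legal action masks, multi-player support, etc.) would need environment-specific handling, which is why there is no universal conversion.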
GridWorlds.jl uses the RLBase interface. If you want to write algorithms against the CommonRLInterface, then you must write your own conversion. Alternatively, you can check out ReinforcementLearningZoo.jl, which already has examples of algorithms using the RLBase interface (so you don't have to do any conversion, since GridWorlds.jl uses the RLBase interface). Hope this clarifies things 😄
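If you stay on the RLBase side, a policy loop needs no conversion at all. A minimal sketch of a random-policy rollout, assuming GridWorlds environments support `RLBase.reset!` and `RLBase.action_space`, are callable on actions, and that the action space supports `rand`:

```julia
# Sketch: random-policy rollout against the RLBase interface, no conversion.
import GridWorlds
import ReinforcementLearningBase: RLBase

function random_rollout(env; max_steps = 1000)
    RLBase.reset!(env)
    total_reward = 0.0
    for _ in 1:max_steps                    # cap steps so a random walk can't loop forever
        RLBase.is_terminated(env) && break
        a = rand(RLBase.action_space(env))  # pick a random action
        env(a)                              # RLBase environments are callable on actions
        total_reward += RLBase.reward(env)
    end
    return total_reward
end

random_rollout(GridWorlds.EmptyRoom())
```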
Here is an end-to-end example usage of GridWorlds.jl. As you can see towards the end, `RLBase.reward(env)` works just fine and uses this definition.
sid projects $ mkdir demo-GridWorlds
sid projects $ cd demo-GridWorlds/
sid demo-GridWorlds $ julia --project=.
┌ Warning: no Manifest.toml file found, static paths used
└ @ Revise ~/.julia/packages/Revise/nWJXk/src/packagedef.jl:1337
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.5.0 (2020-08-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |
(demo-GridWorlds) pkg> st
Status `~/acads/projects/demo-GridWorlds/Project.toml` (empty project)
(demo-GridWorlds) pkg> add GridWorlds
Updating registry at `~/.julia/registries/General`
Updating git-repo `https://github.com/JuliaRegistries/General`
Resolving package versions...
Installed OrderedCollections ─ v1.4.0
Installed GridWorlds ───────── v0.3.1
Updating `~/acads/projects/demo-GridWorlds/Project.toml`
[e15a9946] + GridWorlds v0.3.1
Updating `~/acads/projects/demo-GridWorlds/Manifest.toml`
[1520ce14] + AbstractTrees v0.3.3
[d842c3ba] + CommonRLInterface v0.3.1
[34da2185] + Compat v3.25.0
[a8cc5b0e] + Crayons v4.0.4
[864edb3b] + DataStructures v0.18.9
[e15a9946] + GridWorlds v0.3.1
[1914dd2f] + MacroTools v0.5.6
[bac558e1] + OrderedCollections v1.4.0
[e575027e] + ReinforcementLearningBase v0.9.5
[ae029012] + Requires v1.1.2
[2a0f44e3] + Base64
[ade2ca70] + Dates
[8bb1440f] + DelimitedFiles
[8ba89e20] + Distributed
[b77e0a4c] + InteractiveUtils
[76f85450] + LibGit2
[8f399da3] + Libdl
[37e2e46d] + LinearAlgebra
[56ddb016] + Logging
[d6f4376e] + Markdown
[a63ad114] + Mmap
[44cfe95a] + Pkg
[de0858da] + Printf
[3fa0cd96] + REPL
[9a3f8284] + Random
[ea8e919c] + SHA
[9e88b42a] + Serialization
[1a1011a3] + SharedArrays
[6462fe0b] + Sockets
[2f01184e] + SparseArrays
[10745b16] + Statistics
[8dfed614] + Test
[cf7118a7] + UUIDs
[4ec0a83e] + Unicode
(demo-GridWorlds) pkg> add ReinforcementLearningBase
Resolving package versions...
Updating `~/acads/projects/demo-GridWorlds/Project.toml`
[e575027e] + ReinforcementLearningBase v0.9.5
No Changes to `~/acads/projects/demo-GridWorlds/Manifest.toml`
julia> import GridWorlds
[ Info: Precompiling GridWorlds [e15a9946-cd7f-4d03-83e2-6c30bacb0043]
julia> import ReinforcementLearningBase
julia> env = GridWorlds.EmptyRoom()
Global view:
████████
█⋅⋅⋅⋅←⋅█
█⋅⋅⋅⋅⋅⋅█
█♥⋅⋅⋅⋅⋅█
█⋅⋅⋅⋅⋅⋅█
█⋅⋅⋅⋅⋅⋅█
█⋅⋅⋅⋅⋅⋅█
████████
Local view:
~█↓⋅⋅
~█⋅⋅⋅
~█⋅⋅⋅
~█⋅⋅⋅
~█⋅⋅♥
RLBase.reward(env): 0.0
RLBase.is_terminated(env): false
julia> import ReinforcementLearningBase: RLBase
julia> RLBase.reward(env)
0.0
julia> import Debugger
julia> Debugger.@enter RLBase.reward(env)
In reward(env, #unused#) at /home/sid/.julia/packages/GridWorlds/eq4Eg/src/abstract_grid_world.jl:82
>82 RLBase.reward(env::AbstractGridWorld, ::RLBase.DefaultPlayer) = env.reward
About to run: <(getproperty)(GridWorlds.EmptyRoom{Random._GLOBAL_RNG}(Bool[0 0 0 0 0 0 0 0; 1 1 1 1 1 1 1 1; 0 0 0 0...>
1|debug>
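As the `Debugger.@enter` output above shows, the one-argument `RLBase.reward(env)` call resolves to a two-argument, player-specific method. A minimal sketch of that dispatch pattern, assuming RLBase forwards `reward(env)` to `reward(env, current_player(env))` with `RLBase.DefaultPlayer` for single-player environments (`ToyGridEnv` is a hypothetical stand-in):

```julia
# Sketch of the dispatch pattern seen in the debugger session above.
# Assumption: RLBase forwards reward(env) to reward(env, current_player(env)).
import ReinforcementLearningBase: RLBase

struct ToyGridEnv <: RLBase.AbstractEnv  # hypothetical stand-in for EmptyRoom
    reward::Float64
end

RLBase.current_player(::ToyGridEnv) = RLBase.DefaultPlayer()

# The player-specific method holds the implementation, mirroring
# abstract_grid_world.jl:82 in GridWorlds.
RLBase.reward(env::ToyGridEnv, ::RLBase.DefaultPlayer) = env.reward

RLBase.reward(ToyGridEnv(0.5))  # returns 0.5 via the player-dispatch fallback
```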
Thanks a lot for the detailed response, I'll look into this today!
Hi!
I'm working on recreating some basic RL policies and trying to use the common environment interface, and it looks like `RLBase.reward` is not implemented for `EmptyRoom`.
I can open a pull request, but I wanted to ask whether this was intentional or not!
Thanks!