JuliaReinforcementLearning / GridWorlds.jl

Help! I'm lost in the flatland!

add SingleRoomUndirected & SingleRoomDirected #153

Closed Sid-Bhatia-0 closed 3 years ago

Sid-Bhatia-0 commented 3 years ago

@findmyway I have rethought the design of this package to achieve our various goals, including decoupling with respect to the RLBase API. Here's what I think:

RL environments are games first before they are environments. For GridWorlds.jl we don't need to create yet another powerful RL API like RLBase, we just need to be able to support RLBase (or CommonRLInterface or anything else for that matter) on top of a lightweight API that only covers running the core logic of all the grid-world games.

It turns out that all the game logic fits nicely into two methods: reset! and act!. Beyond this core logic, the GridWorlds.jl API doesn't need to explicitly provide methods to get the state, for example, because states are only defined in the context of RL, and getting a state via a method is not part of the core logic of a grid-world game. The package just needs to run the game logic, and all the RL-related methods will be implemented against the relevant RL API, such as RLBase. We will provide the implementations for RLBase, but they will be decoupled from the core game logic, which will be governed by the GridWorlds.jl-specific methods GW.reset! and GW.act!.

In this way, we can later support any API apart from RLBase, like CommonRLInterface as well.
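As a minimal sketch of what a two-method core could look like (not the code in this PR): the toy `CorridorGame`, its fields, and the `MOVE_LEFT`/`MOVE_RIGHT` constants are made up for illustration. Only the names `reset!`, `act!`, and `AbstractGridWorldGame` come from the description above.

```julia
# Hypothetical sketch: everything except the names reset!, act! and
# AbstractGridWorldGame is invented for illustration.
abstract type AbstractGridWorldGame end

# A toy one-dimensional "grid": the agent walks along a corridor to a goal cell.
mutable struct CorridorGame <: AbstractGridWorldGame
    num_cells::Int
    agent_position::Int
    goal_position::Int
    done::Bool
end

# Start at the first cell, with the goal at the last cell.
CorridorGame(num_cells::Int) = CorridorGame(num_cells, 1, num_cells, false)

const MOVE_LEFT = 1
const MOVE_RIGHT = 2

# Core logic 1: put the game back into its initial configuration.
function reset!(game::CorridorGame)
    game.agent_position = 1
    game.done = false
    return nothing
end

# Core logic 2: apply one action and update the game in place.
function act!(game::CorridorGame, action::Int)
    if action == MOVE_LEFT
        game.agent_position = max(1, game.agent_position - 1)
    elseif action == MOVE_RIGHT
        game.agent_position = min(game.num_cells, game.agent_position + 1)
    else
        throw(ArgumentError("unknown action: $action"))
    end
    game.done = game.agent_position == game.goal_position
    return nothing
end
```

Note that nothing here mentions states, rewards, or termination as RL concepts. The game only mutates its own fields, and any RL layer can read them afterwards.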

For this, I am adding a new abstract type AbstractGridWorldGame <: Any instead of the current AbstractGridWorld <: RLBase.AbstractEnv. All the RLBase API methods for all the games will live in a separate module called RLBaseGridWorldModule.
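A rough sketch of this split, under stated assumptions: `ToyGridWorlds` and `ToyRLAdapter` are hypothetical stand-ins for the core package and for RLBaseGridWorldModule, and the functions `state`, `reward`, `is_terminated`, and `step!` are defined locally so the example stays self-contained. They are not the real RLBase methods.

```julia
module ToyGridWorlds  # hypothetical stand-in for the core game package

abstract type AbstractGridWorldGame end

mutable struct TinyGame <: AbstractGridWorldGame
    position::Int
    goal::Int
end

function reset!(game::TinyGame)
    game.position = 1
    return nothing
end

# `step` is a signed number of cells to move; the result is kept inside 1:goal.
function act!(game::TinyGame, step::Int)
    game.position = clamp(game.position + step, 1, game.goal)
    return nothing
end

end # module ToyGridWorlds

module ToyRLAdapter  # hypothetical stand-in for RLBaseGridWorldModule

import ..ToyGridWorlds
const GW = ToyGridWorlds

# The RL-facing wrapper holds a game; the game knows nothing about the wrapper.
struct Env{G <: GW.AbstractGridWorldGame}
    game::G
end

# RL concepts are defined here, on top of the game's fields.
state(env::Env{GW.TinyGame}) = env.game.position
is_terminated(env::Env{GW.TinyGame}) = env.game.position == env.game.goal
reward(env::Env{GW.TinyGame}) = is_terminated(env) ? 1.0 : 0.0

# The adapter forwards to the core logic instead of reimplementing it.
reset!(env::Env) = GW.reset!(env.game)
step!(env::Env, action) = GW.act!(env.game, action)

end # module ToyRLAdapter
```

Supporting a second RL API would then mean adding another adapter module of the same shape, with no change to `ToyGridWorlds`.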

On separation of environments, each grid-world game will be in a separate module. For example, SingleRoomUndirected is present in SingleRoomUndirectedModule and SingleRoomDirected is in SingleRoomDirectedModule. This allows for each environment to define its own consts that are isolated from other environments.
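To illustrate the isolation of constants, here is a small sketch. The module names are prefixed with `Toy` to make clear that their contents (`NUM_ACTIONS`, `ACTION_NAMES`, and the specific action names) are assumptions and not taken from the PR.

```julia
# Hypothetical contents: the constants below are invented for illustration.
module ToySingleRoomUndirectedModule

const NUM_ACTIONS = 4
const ACTION_NAMES = ("MOVE_UP", "MOVE_DOWN", "MOVE_LEFT", "MOVE_RIGHT")

end # module

module ToySingleRoomDirectedModule

# Same constant names as above, but no clash: each module is its own namespace.
const NUM_ACTIONS = 3
const ACTION_NAMES = ("MOVE_FORWARD", "TURN_LEFT", "TURN_RIGHT")

end # module
```

Callers then qualify the name with the module they mean, for example `ToySingleRoomDirectedModule.NUM_ACTIONS`.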

Another interesting thing to notice is that directed and undirected environments differ only in how navigation works; the rest of the logic is mostly the same within one type of environment (SingleRoom, for example). Undirected navigation is also the more primitive of the two, since it doesn't need to keep track of direction, so directed environments can reuse significant functionality from undirected ones. SingleRoomUndirected and SingleRoomDirected in this PR are an example.
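One possible way to get this reuse is composition, sketched below. This is an assumption about how it could be done, not a description of the code in this PR; `ToyUndirected`, `ToyDirected`, `OFFSETS`, and the action constants are all hypothetical names.

```julia
# (row, column) offsets for the four compass directions, ordered clockwise:
# 1 = up, 2 = right, 3 = down, 4 = left.
const OFFSETS = ((-1, 0), (0, 1), (1, 0), (0, -1))

mutable struct ToyUndirected
    height::Int
    width::Int
    agent_position::Tuple{Int, Int}
end

# Undirected navigation: the action directly names a compass direction (1 to 4).
# Moves that would leave the room are clamped to the boundary.
function act!(game::ToyUndirected, direction::Int)
    d_row, d_col = OFFSETS[direction]
    row, col = game.agent_position
    game.agent_position = (clamp(row + d_row, 1, game.height),
                           clamp(col + d_col, 1, game.width))
    return nothing
end

# The directed game wraps an undirected one and only adds a facing direction.
mutable struct ToyDirected
    inner::ToyUndirected
    agent_direction::Int  # index into OFFSETS
end

const MOVE_FORWARD = 1
const TURN_LEFT = 2
const TURN_RIGHT = 3

function act!(game::ToyDirected, action::Int)
    if action == MOVE_FORWARD
        # Reuse the undirected movement logic along the current facing direction.
        act!(game.inner, game.agent_direction)
    elseif action == TURN_LEFT
        game.agent_direction = mod1(game.agent_direction - 1, 4)
    elseif action == TURN_RIGHT
        game.agent_direction = mod1(game.agent_direction + 1, 4)
    else
        throw(ArgumentError("unknown action: $action"))
    end
    return nothing
end
```

In this sketch the directed game contains no movement or boundary logic of its own. It only translates "forward" into a compass direction and delegates.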

codecov-commenter commented 3 years ago

Codecov Report

Merging #153 (8bd9159) into master (6d87e18) will decrease coverage by 2.32%. The diff coverage is 54.54%.


@@            Coverage Diff             @@
##           master     #153      +/-   ##
==========================================
- Coverage   78.21%   75.89%   -2.33%     
==========================================
  Files          21       25       +4     
  Lines        2226     2468     +242     
==========================================
+ Hits         1741     1873     +132     
- Misses        485      595     +110     
| Impacted Files | Coverage Δ |
| --- | --- |
| src/GridWorlds.jl | 100.00% <ø> (ø) |
| src/abstract_grid_world.jl | 14.28% <0.00%> (-60.72%) :arrow_down: |
| src/play.jl | 0.00% <0.00%> (ø) |
| src/rlbase.jl | 64.70% <64.70%> (ø) |
| src/envs/single_room_directed.jl | 71.42% <71.42%> (ø) |
| src/envs/single_room_undirected.jl | 79.10% <79.10%> (ø) |
| src/actions.jl | 83.33% <86.95%> (+4.38%) :arrow_up: |
| src/directions.jl | 100.00% <100.00%> (ø) |
| src/envs/envs.jl | 100.00% <100.00%> (ø) |

... and 3 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 6d87e18...8bd9159.

findmyway commented 3 years ago

Cool, glad we reached an agreement on it ;)

Sid-Bhatia-0 commented 3 years ago

Here is an example of SingleRoomUndirected.

[Image: single_room_undirected]

Here is an example of SingleRoomDirected.

[Image: single_room_directed]

There is no need for colors in this environment since all objects are unique.

Sid-Bhatia-0 commented 3 years ago

Here is an example of replaying animations at a given frame_rate, as well as stepping through the animation manually: going to the next frame, the previous frame, or the first frame, as many times as the user likes.

[Image: replay]
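The replay behaviour described above can be sketched roughly as follows. This is only an illustration of the idea, assuming frames are stored as pre-rendered strings; `ToyReplay` and its helper functions are hypothetical names, and only the `frame_rate` parameter name comes from the comment.

```julia
# Hypothetical sketch: a list of pre-rendered frames plus a cursor.
mutable struct ToyReplay
    frames::Vector{String}
    current::Int
end

function ToyReplay(frames::Vector{String})
    isempty(frames) && throw(ArgumentError("a replay needs at least one frame"))
    return ToyReplay(frames, 1)
end

current_frame(replay::ToyReplay) = replay.frames[replay.current]

# Manual stepping: the cursor is clamped, so stepping past either end is a no-op.
function next_frame!(replay::ToyReplay)
    replay.current = min(replay.current + 1, length(replay.frames))
    return nothing
end

function previous_frame!(replay::ToyReplay)
    replay.current = max(replay.current - 1, 1)
    return nothing
end

function first_frame!(replay::ToyReplay)
    replay.current = 1
    return nothing
end

# Automatic playback from the first frame at `frame_rate` frames per second.
function play!(replay::ToyReplay; frame_rate::Real = 2, io::IO = stdout)
    first_frame!(replay)
    while true
        println(io, current_frame(replay))
        replay.current == length(replay.frames) && break
        sleep(1 / frame_rate)
        next_frame!(replay)
    end
    return nothing
end
```

For example, `play!(ToyReplay(["A..", ".A.", "..A"]); frame_rate = 4)` prints the three frames a quarter of a second apart, and the stepping functions can be called any number of times afterwards.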