JuliaReinforcementLearning / GridWorlds.jl

Help! I'm lost in the flatland!
MIT License

add CollectGemsUndirectedMultiAgent environment #143

Closed Sid-Bhatia-0 closed 3 years ago

Sid-Bhatia-0 commented 3 years ago
  1. Add the REPL package to play the environment inside the Julia REPL.
  2. Create the CollectGemsUndirectedMultiAgent environment.
  3. Add the ability to play this environment in the terminal.
  4. Fix some issues related to Makie rendering. (It barely works and needs to be replaced with a REPL-based workflow.)

Here's how it looks:

[GIF: collect_gems_undirected_multi_agent]

Sid-Bhatia-0 commented 3 years ago

Needs some cleaning up, and the recording feature still has to be added.

findmyway commented 3 years ago

Based on the GIF, the players seem to be playing alternately. I think simultaneous play would be more intuitive here. What do you think?

Sid-Bhatia-0 commented 3 years ago

Based on the GIF, the players seem to be playing alternately.

Yes

I think simultaneous play would be more intuitive here. What do you think?

Yes. I also feel that simultaneous play would be more intuitive if I imagine this kind of situation happening in the real world. I implemented turn-based play for the following reasons:

  1. It is simpler in terms of environment logic. Simultaneous play is more involved: for example, what happens if multiple agents try to enter the same cell to collect a gem? Some cases would inevitably require either some form of implicit tie-breaking or more complex logic (for instance, not moving any of the agents that are trying to get to the same cell, although that isn't a satisfactory solution). So I thought it better to have sequential play for now, since it has an explicit ordering.
  2. It is easier to test the environment logic (there are fewer cases to check). It is also easier to play and test interactively, as in the REPL game, where you select and execute one action for one agent at a time (rather than selecting 4 actions and executing them simultaneously).
  3. My guess is that it would be easier to build and test an algorithm for sequential play later down the line. (I am not totally certain of this one; it is just my best guess.)

If sequential play is good enough as a use case for demonstrating a multi-agent environment, I would prefer it to remain this way for now. Is that okay?
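
To make the trade-off above concrete, here is a rough sketch of the two stepping schemes. It is written in Python purely for illustration and is not the code in this PR; the names (`MOVES`, `step_sequential`, `step_simultaneous`), the absence of walls, and the rule that a gem is collected only by entering its cell are all assumptions. The simultaneous version uses the "nobody moves on a conflict" rule mentioned in point 1.

```python
from collections import Counter

# Hypothetical action set: agents are undirected, so they move straight
# up/down/left/right. Walls and grid bounds are ignored in this sketch.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def target(pos, action):
    """Cell an agent would reach by taking `action` from `pos`."""
    dr, dc = MOVES[action]
    return (pos[0] + dr, pos[1] + dc)


def step_sequential(positions, agent, action, gems):
    """Turn-based play: exactly one agent acts, so ordering is explicit."""
    positions = list(positions)
    remaining = set(gems)
    new_pos = target(positions[agent], action)
    if new_pos in positions:
        # Assumed rule: an occupied cell blocks the move.
        return positions, remaining, 0
    positions[agent] = new_pos
    reward = 0
    if new_pos in remaining:
        remaining.remove(new_pos)
        reward = 1
    return positions, remaining, reward


def step_simultaneous(positions, actions, gems):
    """Simultaneous play: agents that would end up sharing a cell stay put.

    Stopping one agent can block another that wanted its cell, so the
    check is repeated until no moving agent shares a cell with anyone.
    Assumes the starting positions are all distinct.
    """
    targets = [target(p, a) for p, a in zip(positions, actions)]
    moving = [True] * len(positions)
    while True:
        final = [t if m else p for p, t, m in zip(positions, targets, moving)]
        counts = Counter(final)
        blocked = [i for i, m in enumerate(moving) if m and counts[final[i]] > 1]
        if not blocked:
            break
        for i in blocked:
            moving[i] = False
    remaining = set(gems)
    rewards = [0] * len(positions)
    for i, cell in enumerate(final):
        if moving[i] and cell in remaining:
            remaining.remove(cell)
            rewards[i] = 1
    return final, remaining, rewards
```

With two agents on either side of a gem, `step_simultaneous` leaves both in place and nobody collects it, whereas `step_sequential` simply gives the gem to whichever agent moves first. Note that even this small sketch still lets two adjacent agents swap cells, which is one more case the simultaneous logic would have to decide on.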

codecov-commenter commented 3 years ago

Codecov Report

Merging #143 (887618d) into master (c0e86bb) will decrease coverage by 2.67%. The diff coverage is 44.57%.


@@            Coverage Diff             @@
##           master     #143      +/-   ##
==========================================
- Coverage   80.88%   78.21%   -2.68%     
==========================================
  Files          20       21       +1     
  Lines        2061     2226     +165     
==========================================
+ Hits         1667     1741      +74     
- Misses        394      485      +91     

Impacted Files                                     Coverage Δ
src/envs/envs.jl                                   100.00% <ø> (ø)
src/graphical_rendering.jl                         0.00% <0.00%> (ø)
src/envs/collect_gems_undirected_multi_agent.jl    44.51% <44.51%> (ø)
src/GridWorlds.jl                                  100.00% <100.00%> (ø)
src/grid_world_base.jl                             26.92% <0.00%> (+0.94%)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update c0e86bb...887618d.

Sid-Bhatia-0 commented 3 years ago

[Image: recording]

Added recording feature :) Recordings can be saved in plain text files and played back at an arbitrary frame rate using a function call.
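
As a rough illustration of the idea (not the code added in this PR), a plain-text recording can be as simple as the rendered frames joined by a separator and replayed with a delay derived from the frame rate. The sketch below is in Python; the function names and the form-feed separator are assumptions made for the example.

```python
import time
from pathlib import Path

# Hypothetical on-disk format: rendered text frames separated by a form feed.
FRAME_SEPARATOR = "\f"


def save_recording(frames, path):
    """Write a list of text frames to a single plain-text file."""
    Path(path).write_text(FRAME_SEPARATOR.join(frames), encoding="utf-8")


def load_recording(path):
    """Read the frames back in the order they were recorded."""
    return Path(path).read_text(encoding="utf-8").split(FRAME_SEPARATOR)


def replay(frames, frame_rate, write=print, sleep=time.sleep):
    """Show each frame in turn, pausing 1 / frame_rate seconds between them."""
    delay = 1.0 / frame_rate
    for frame in frames:
        # ANSI escape codes: clear the terminal and move the cursor home.
        write("\x1b[2J\x1b[H" + frame)
        sleep(delay)
```

Passing `write` and `sleep` as parameters keeps the playback loop testable without a real terminal or real waiting.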

Right now, REPL playing and recording are only available for this environment. I will generalize this and add the ability to all the environments soon.

There is another useful feature that I would like to add: the ability to step through the frames one by one, so that we can pause and navigate frames as we like. Often, when I want to visualize the behavior of an agent, I also need time to think and analyze what it is doing, and that is hard to do while a GIF is constantly moving. There might be external tools out there that help navigate the frames of a GIF, but it would be more convenient if we could do this inside the Julia REPL itself. This feature is low-hanging fruit. I'll add it later when I extend the REPL-based interactivity to all environments.
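
A minimal sketch of what such frame stepping could look like, again in Python and purely illustrative: the `FrameCursor` class, the `navigate` function, and the `n`/`p`/`q` key bindings are hypothetical and are not part of this PR.

```python
class FrameCursor:
    """Keeps track of which recorded frame is currently shown."""

    def __init__(self, frames):
        if not frames:
            raise ValueError("a recording needs at least one frame")
        self.frames = list(frames)
        self.index = 0

    def current(self):
        return self.frames[self.index]

    def step(self, delta):
        """Move by `delta` frames, clamped to the first and last frame."""
        self.index = max(0, min(len(self.frames) - 1, self.index + delta))
        return self.current()


# Hypothetical key bindings: n = next frame, p = previous frame, q = quit.
KEYS = {"n": 1, "p": -1}


def navigate(frames, read_key=input, write=print):
    """Show the first frame, then step on each key until 'q' is read."""
    cursor = FrameCursor(frames)
    write(cursor.current())
    while True:
        key = read_key()
        if key == "q":
            break
        if key in KEYS:
            write(cursor.step(KEYS[key]))
```

Because the cursor only holds an index into an already recorded list of frames, it would work on the same plain-text recordings without touching the environment itself.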

findmyway commented 3 years ago

If sequential play is good enough as a use case for demonstrating a multi-agent environment, I would prefer it to remain this way for now. Is that okay?

Yes, it is OK for now.