Bam4d / Griddly

A grid-world game engine for game AI research
https://griddly.readthedocs.io
MIT License

GDY game functions are triggering in an undesirable order #222

Open aadharna opened 2 years ago

aadharna commented 2 years ago

This started with me trying to figure out whether RANGE_BOX_AREA is actually a box. I was seeing odd behaviour: the spider was nearby (within a box of size 2 of the gnome) while the gnome's light was on, but the reward I received was as if the spider was not nearby (see picture).

I have three functions:

a: exposed to the user; flips a local agent boolean.
b: internal; fired by a Trigger when the spider is nearby; flips a local agent boolean.
c: internal; returns a reward to the user based on the variables above; called every frame.

I eventually noticed that the functions were being called in the wrong order. I need them to run a -> b -> c, but Griddly runs them as a -> c -> b.
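To make the effect of the ordering concrete, here is a toy Python model of the three functions. It is only a sketch and not Griddly code: `run_step`, `fresh_state` and the dict-based state are made up for illustration, and the reward rules are copied from the GDY further down. Starting with the light off and the spider in range, the wanted order rewards the gnome, while the observed order punishes it because c reads `spider_counter` before b has updated it.

```python
def run_step(order, state, spider_nearby):
    """Apply handlers a, b, c in the given order; return the reward c emits."""
    reward = None
    for name in order:
        if name == "a":
            # a: user action, toggles the spotlight
            state["spotlight"] = 1 - state["spotlight"]
        elif name == "b" and spider_nearby:
            # b: trigger, only fires while the spider is in range
            state["spider_counter"] = 1 if state["spotlight"] == 1 else 0
        elif name == "c":
            # c: per-frame feedback based on the two variables
            if state["spotlight"] == 1:
                reward = 1 if state["spider_counter"] == 1 else -1
            else:
                reward = 0
    return reward


def fresh_state():
    # light off, no spider seen yet (the initial values in the GDY)
    return {"spotlight": 0, "spider_counter": 0}


wanted = run_step(["a", "b", "c"], fresh_state(), spider_nearby=True)
observed = run_step(["a", "c", "b"], fresh_state(), spider_nearby=True)
print(wanted, observed)  # 1 -1
```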

If I call function c from b, the ordering is correct. But because the trigger only fires when the object in question is in range, I cannot provide a reward when the object is out of range (a necessary case, since it carries a negative reward). With c called from b, c never runs while the spider is not nearby, so the gnome can freely turn on or leave on its light without punishment.

I also tried calling the trigger function directly from function a, but that didn't do anything to change the behaviour.

Screenshots

spider_IS_nearby_noGriddly <-- this should never have been possible as the spider is always within a box of size 2 in this gridworld.
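The "always within a box of size 2" claim can be checked by hand for the 5x5 level, where the gnome sits in the centre cell. The sketch below assumes that RANGE_BOX_AREA with Range 2 means "Chebyshev distance of at most 2"; that reading is an assumption and is exactly what this issue set out to verify. The helper `in_box` is hypothetical and not part of Griddly.

```python
def in_box(src, dst, box_range):
    """True if dst lies in the square of half-width box_range centred on src."""
    return max(abs(src[0] - dst[0]), abs(src[1] - dst[1])) <= box_range


gnome = (2, 2)  # centre of the 5x5 level
cells = [(x, y) for x in range(5) for y in range(5)]

# Every cell of the 5x5 grid is within range 2 of the centre.
all_in_range = all(in_box(gnome, cell, 2) for cell in cells)
print(all_in_range)  # True
```

Under that assumption, function b should be eligible to fire on every step of the 5x5 level, wherever the spider moves.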

bad_ordering_of_fns


yaml:

Version: "0.1"
Environment:
  Name: Particle Sensor Game
  Description: Multiple bang-bang sensors track a particle
  Observers:
    Block2D:
      TileSize: 24
    Isometric:
      TileSize: [ 32, 48 ]
      IsoTileHeight: 16
      IsoTileDepth: 4
      BackgroundTile: oryx/oryx_iso_dungeon/grass-1.png
    Vector:
      IncludePlayerId: true
  Player:
    AvatarObject: gnome
    Count: 1
    Observer:
      TrackAvatar: true
      Height: 5
      Width: 5
      OffsetX: 0
      OffsetY: 0
  Levels:
    - |
      s  .  .   .  .  
      .  .  .   .  . 
      .  .  g   .  .  
      .  .  .   .  .  
      .  .  .   .  .

    - |
      .  .  .  .  .  .  .  .  .  . 
      .  .  .  .  .  .  .  .  .  . 
      .  .  .  .  .  s  .  g  .  . 
      .  .  .  .  .  .  .  .  .  . 
      .  .  .  .  .  .  .  .  .  .
  Termination:
    Win:
      - eq: [spider:count, 0]

Actions:

  - Name: spider_random_movement
    InputMapping:
      Internal: true
    Behaviours:
      - Src:
          Object: spider
          Commands:
            - mov: _dest
            - exec:
                Action: spider_random_movement
                Randomize: true
                Delay: 1
        Dst:
          Object: [_empty, gnome]
      - Src:
          Object: spider
          Commands:
            - exec:
                Action: spider_random_movement
                Randomize: true
                Delay: 1
        Dst:
          Object: _boundary
      - Src:
          Object: spider
          Commands:
            - remove: true
            - reward: 1
        Dst:
          Object: right_exit

#  - Name: move
#    Behaviours:
#      - Src:
#          Object: gnome
#          Commands:
#            - mov: _dest
#        Dst:
#          Object: _empty

  - Name: switch
    InputMapping:
      Inputs:
        1:
          Description: flip switch
          VectorToDest: [ 0, 0 ]
      Relative: true
    Behaviours:
      # turn on spotlight
      - Src:
          Object: gnome
          Preconditions:
            - eq: [spotlight, 0]
          Commands:
            - set: [spotlight, 1]
            - set_tile: 1
            - print: "a: spotlight on"
#             - exec:
#                 Action: count_nearby_spider
        Dst:
          Object: gnome

      # turn off spotlight
      - Src:
          Object: gnome
          Preconditions:
            - eq: [spotlight, 1]
          Commands:
            - set: [spotlight, 0]
            - set_tile: 0
            - print: "a: spotlight off"
#             - exec:
#                 Action: count_nearby_spider
        Dst:
          Object: gnome

    # this does not capture if the spider is not nearby which is important
  - Name: count_nearby_spider
    Probability: 1.0
    Trigger:
      Type: RANGE_BOX_AREA
      Range: 2
    Behaviours:
      # If the spider is within 2 of the gnome and the gnome is on, give point
      - Src:
          Object: gnome
#          Preconditions:
#            - eq: [spotlight, 1]
          Commands:
            - if:
                Conditions:
                  eq:
                    - spotlight
                    - 1
                OnTrue:
                  - set: [spider_counter, 1]
                  - print: "b: spider nearby"
                OnFalse:
                  - set: [spider_counter, 0]
                  - print: "b: nearby spider not seen"
#            - exec:
#                Action: give_feedback
#                ActionId: 1
        Dst:
          Object: spider

  - Name: give_feedback
    InputMapping:
      Inputs:
        '1':
          Description: provide feedback to the agent(s)
          VectorToDest:
            - 0
            - 0
      Internal: true
    Behaviours:
      - Src:
          Object: gnome
          Preconditions:
            - eq:
                - spotlight
                - 1
          Commands:
            - if:
                Conditions:
                  eq:
                    - spider_counter
                    - 1
                OnTrue:
                  - reward: 1
                  - print: "c: spotlight on and spider nearby"
                OnFalse:
                  - reward: -1
                  - print: "c: spotlight on and spider not nearby"
            - exec:
                Action: give_feedback
                ActionId: 1
                Delay: 1
        Dst:
          Object: gnome
      - Src:
          Object: gnome
          Preconditions:
            - eq:
                - spotlight
                - 0
          Commands:
            - if:
                Conditions:
                  eq:
                    - spider_counter
                    - 1
                OnTrue:
                  - reward: 0
                  - print: "c: spotlight off and spider nearby"
                OnFalse:
                  - reward: 0
                  - print: "c: spotlight off and spider not nearby"
            - exec:
                Action: give_feedback
                ActionId: 1
                Delay: 1
        Dst:
          Object: gnome
Objects:
  - Name: gnome
    Z: 2
    MapCharacter: g
    InitialActions:
      - Action: give_feedback
        ActionId: 1
        Delay: 2
    Variables:
      - Name: spotlight
        InitialValue: 0
      - Name: spider_counter
        InitialValue: 0
    Observers:
      Isometric:
        - Image: oryx/oryx_iso_dungeon/avatars/gnome-1.png
        - Image: oryx/oryx_iso_dungeon/avatars/spider-fire-1.png
      Block2D:
        - Shape: square
          Color: [ 0.0, 0.8, 0.0 ]
          Scale: 0.5
        - Shape: triangle
          Color: [0.0, 0.5, 0.2]
          Scale: 0.8

  - Name: spider
    Z: 1
    InitialActions:
      - Action: spider_random_movement
        Randomize: true
    MapCharacter: s
    Observers:
      Isometric:
        - Image: oryx/oryx_iso_dungeon/avatars/spider-1.png
      Block2D:
        - Shape: triangle
          Color: [ 0.2, 0.2, 0.9 ]
          Scale: 0.5

  - Name: water
    MapCharacter: w
    Observers:
      Isometric:
        - Image: oryx/oryx_iso_dungeon/water-1.png
          Offset: [0, 4]
          TilingMode: ISO_FLOOR
      Block2D:
        - Color: [ 0.0, 0.0, 0.8 ]
          Shape: square

  - Name: right_exit
    MapCharacter: e
    Observers:
      Isometric:
        - Image: oryx/oryx_iso_dungeon/water-1.png
          Offset: [0, 4]
          TilingMode: ISO_FLOOR
      Block2D:
        - Color: [ 0.0, 0.0, 0.8 ]
          Shape: square

runner script

import os

from griddly import GymWrapperFactory, gd, GymWrapper
from griddly.RenderTools import VideoRecorder

if __name__ == "__main__":
    wrapper = GymWrapperFactory()

    name = "random_spiders_env"

    current_path = os.path.dirname(os.path.realpath(__file__))

    env = GymWrapper(
        "sensor_game_single_agent.yaml",
        player_observer_type=gd.ObserverType.VECTOR,
        global_observer_type=gd.ObserverType.ISOMETRIC,
        level=1,
        max_steps=1000,
    )
    env.enable_history(True)

    env.reset()

    global_recorder = VideoRecorder()
    global_visualization = env.render(observer="global", mode="rgb_array")
    global_recorder.start("global_video_test.mp4", global_visualization.shape)

    for i in range(1000):
        a = env.action_space.sample()
        obs, reward, done, info = env.step(a)

        env.render(observer="global")
        frame = env.render(observer="global", mode="rgb_array")

        global_recorder.add_frame(frame)

        if done:
            env.reset()

    global_recorder.close()
    env.close()
aadharna commented 2 years ago

I have only delved slightly into the C++, but I did find this class for queuing actions. Would it be possible to expose the priority variable in the YAML, so that functions in the GDY can be given different priorities? That would let us enforce the order in which the game-mechanics functions run.
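As a rough illustration of the idea, here is a minimal Python sketch of a queue that runs due actions in priority order. It is not based on Griddly's C++ internals: the `ActionQueue` class, its methods, and the per-action `priority` value are all hypothetical, and only the action names come from the GDY above.

```python
import heapq
import itertools


class ActionQueue:
    """Toy queue: due actions come out ordered by (tick, priority, insertion)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def push(self, name, tick=0, priority=0):
        heapq.heappush(self._heap, (tick, priority, next(self._counter), name))

    def pop_due(self, tick):
        """Remove and return the names of all actions scheduled at or before tick."""
        due = []
        while self._heap and self._heap[0][0] <= tick:
            due.append(heapq.heappop(self._heap)[3])
        return due


queue = ActionQueue()
# Queued in the "bad" order, but a lower priority number runs first.
queue.push("switch", priority=0)               # a
queue.push("give_feedback", priority=2)        # c
queue.push("count_nearby_spider", priority=1)  # b
order = queue.pop_due(tick=0)
print(order)  # ['switch', 'count_nearby_spider', 'give_feedback']
```

With something like this exposed in the GDY, giving give_feedback the largest priority number would make it run after the trigger on every tick, whatever order the actions were queued in.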

aadharna commented 2 years ago

Created a band-aid fix: a wrapper that takes two steps for every one controlled step, where the second step is a no-op. The particle/spider now also moves only every second step to make this (probably) work. I haven't come across a bug with it yet.

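import gym


class ExtraStepHackSingleAgent(gym.Wrapper):
    """Take a second, no-op step after every controlled step (band-aid for the ordering issue)."""

    def __init__(self, env, env_config):
        gym.Wrapper.__init__(self, env)
        self.action_space = env.action_space
        self.observation_space = env.observation_space
        # this doesn't seem to do anything...
        self.env.gdy.set_max_steps(2 * env_config.get('max_steps', 250))

    def step(self, action):
        # first step: the agent's chosen action
        ns, r, d, i = self.env.step(action)
        if not d:
            # second step: no-op (action 0); its results are what the caller sees
            nns, rr, dd, ii = self.env.step(0)
            # keep the first step's transition in the info dict under key -1
            ii[-1] = {
                'ns': ns,
                'r': r,
                'd': d,
                'i': i
            }
            return nns, rr, dd, ii
        return ns, r, d, i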

    def reset(self, **kwargs):
        return self.env.reset(**kwargs)