Open aadharna opened 2 years ago
I have only delved slightly into the C++ code, but I did find this class on queuing actions. Would it be possible to expose the priority
variable in the YAML and give functions in the GDY different priorities, which would then allow us to enforce the order in which the game-mechanics functions are executed?
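To make the request concrete, here is a minimal Python sketch of priority-ordered execution. It is not Griddly's queue or API: `PriorityActionQueue`, the function names, the reward values, and the convention that a lower number runs first are all assumptions made for illustration.

```python
import heapq
import itertools


class PriorityActionQueue:
    """Toy queue (hypothetical): lower priority value runs first, ties keep insertion order."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal priorities

    def push(self, priority, name, fn):
        heapq.heappush(self._heap, (priority, next(self._counter), name, fn))

    def run_all(self):
        """Pop and run every queued function; return the names in execution order."""
        order = []
        while self._heap:
            _, _, name, fn = heapq.heappop(self._heap)
            fn()
            order.append(name)
        return order


# Assumed stand-ins for the three game-mechanics functions.
state = {"light_on": False, "spider_near": False, "reward": 0}


def toggle_light():  # "a": user-exposed action
    state["light_on"] = not state["light_on"]


def proximity_trigger():  # "b": fires when the spider is in range
    state["spider_near"] = True


def compute_reward():  # "c": runs every frame; reward values are assumed
    state["reward"] = 1 if (state["light_on"] and state["spider_near"]) else -1


queue = PriorityActionQueue()
# Queued in the problematic order a, c, b, but with explicit priorities.
queue.push(0, "a", toggle_light)
queue.push(2, "c", compute_reward)
queue.push(1, "b", proximity_trigger)
order = queue.run_all()
print(order, state["reward"])
```

With priorities attached, the insertion order no longer matters: the reward function runs last and sees the flags set by the other two.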
I created a band-aid fix: a wrapper that takes two environment steps for every controlled step, where the second step is a no-op. Also, the particle/spider now only moves every second step to make this (probably) work. I haven't come across a bug with it yet.
import gym


class ExtraStepHackSingleAgent(gym.Wrapper):
    def __init__(self, env, env_config):
        gym.Wrapper.__init__(self, env)
        self.action_space = env.action_space
        self.observation_space = env.observation_space
        # this doesn't seem to do anything...
        self.env.gdy.set_max_steps(2 * env_config.get('max_steps', 250))

    def step(self, action):
        # First step: apply the agent's chosen action.
        ns, r, d, i = self.env.step(action)
        if not d:
            # Second step: no-op (action 0) so the remaining mechanics can run.
            nns, rr, dd, ii = self.env.step(0)
            # Keep the first step's results in the info dict under key -1.
            ii[-1] = {
                'ns': ns,
                'r': r,
                'd': d,
                'i': i
            }
            return nns, rr, dd, ii
        return ns, r, d, i

    def reset(self, **kwargs):
        return self.env.reset(**kwargs)
This started with me trying to figure out whether RANGE_BOX_AREA was actually a box. I was noticing some weird behavior: the spider was nearby (within a box of size 2 of the gnome) while the gnome's light was on, but the reward I was getting was as if the spider was not nearby (see picture).
I have three functions:
a: Exposed to the user; flips a local agent boolean.
b: Internal; checks a TRIGGER call and flips a local agent boolean. Called when the spider is nearby.
c: Internal; returns reward to the user based on the above variables. Called every frame.
I eventually noticed that the functions were being called in the wrong order. I need them to run a -> b -> c, but Griddly runs them as a -> c -> b.
If I call function c from b, the ordering is correct. However, because the trigger only fires when the object in question is in range, I cannot provide reward when the object is out of range (a necessary case with negative rewards): while the spider is not nearby, function c never gets called, which means the gnome can freely turn on or leave on its light without punishment.
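The gap described above can be reproduced with a small stand-alone Python sketch. It does not use Griddly at all: `run_frame`, its parameters, and the reward values (+1, -1, 0) are assumptions chosen only to illustrate why chaining c onto b loses the out-of-range case.

```python
def run_frame(light_on, spider_in_range, reward_from_trigger_only):
    """Simulate one frame; return the reward, or None if the reward function never ran."""
    reward = None
    spider_near = False

    def compute_reward():  # stand-in for function c; values are assumed
        if light_on and spider_near:
            return 1
        if light_on:
            return -1
        return 0

    if spider_in_range:
        spider_near = True  # stand-in for function b (the trigger)
        if reward_from_trigger_only:
            reward = compute_reward()  # c chained onto b

    if not reward_from_trigger_only:
        reward = compute_reward()  # c runs every frame, after b

    return reward


# Light on, spider out of range: chaining c onto b yields no reward at all.
chained_out_of_range = run_frame(True, False, reward_from_trigger_only=True)
# The same frame with c running every frame after b gives the penalty.
every_frame_out_of_range = run_frame(True, False, reward_from_trigger_only=False)
print(chained_out_of_range, every_frame_out_of_range)
```

In this toy model the penalty is only delivered when c runs on every frame after b, which is why an explicit ordering is needed rather than nesting the calls.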
I also tried calling the trigger function directly from function a, but that didn't do anything to change the behaviour.
Screenshots
<-- this should never have been possible as the spider is always within a box of size 2 in this gridworld.
yaml:
runner script