I set up a custom environment with a discrete observation space, e.g.,
self.observation_space = Dict({'A': Discrete(2), 'B': Discrete(3)})
In my action space, each value can be +1, 0, or -1. If the next state falls outside the designated observation space, I apply a penalty of -10 as the reward.
But when I use the check_env function to check my environment, I get an error: "AssertionError: Error while checking key=A: The observation returned by the step() method does not match the given observation space Discrete(2)".
I know the problem occurs when the next state exceeds the designated observation space, but I still want to use the reward function to penalize the agent in such cases. How do I solve this? In other words, how do I keep the next state within the space? Thank you very much.
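One common way to satisfy check_env while keeping the penalty is to clip the candidate next state back into the declared space inside step(), and assign the -10 reward whenever clipping was needed. Below is a minimal sketch of that idea; the bound sizes, the apply_action helper, and the penalty constant are all hypothetical names chosen for illustration, not part of any library API:

```python
import numpy as np

# Hypothetical bounds mirroring Dict({'A': Discrete(2), 'B': Discrete(3)}):
# valid values are A in {0, 1} and B in {0, 1, 2}.
SPACE_SIZES = {"A": 2, "B": 3}
PENALTY = -10.0

def apply_action(state, action):
    """Apply a +1/0/-1 action per key, then clip the result back into
    the observation space; penalize if the raw result left the space."""
    next_state = {k: state[k] + action[k] for k in state}
    out_of_bounds = any(
        not (0 <= v < SPACE_SIZES[k]) for k, v in next_state.items()
    )
    # Clip so the returned observation always lies in the declared space,
    # which is what check_env verifies against step()'s return value.
    clipped = {k: int(np.clip(v, 0, SPACE_SIZES[k] - 1))
               for k, v in next_state.items()}
    reward = PENALTY if out_of_bounds else 0.0
    return clipped, reward

# Example: A would become 2 and B would become -1, both out of bounds,
# so the state is clipped and the penalty applied.
obs, reward = apply_action({"A": 1, "B": 0}, {"A": 1, "B": -1})
```

With this pattern the agent still experiences the -10 penalty for attempting to leave the space, but the observation returned by step() always matches the declared observation_space, so check_env passes.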