It may make sense to add wrappers for action selection, e.g. sticky actions or action repeats. Such wrappers would rely on the information provided in the action space specification and may require storing the previous action in the state dictionary. Alternatively, this could be left to the user. With NumPy, a sticky-action step could look like this:
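import numpy as np

def act(a, last_action, sticky_action_prob):
    # `env` is assumed to be an environment object already in scope.
    # With probability sticky_action_prob, repeat the previous action.
    if np.random.rand() < sticky_action_prob:
        a = last_action
    # Return the result along with the action taken, so the caller can
    # pass it back in as last_action on the next step.
    return env.act(a), a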
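If the wrapper route is taken, the previous action could be kept in a state dictionary rather than threaded through by the caller. The following is only a minimal sketch under assumptions: `StickyActionWrapper`, `DummyEnv`, the `"last_action"` key, and the `seed` parameter are hypothetical names introduced for illustration, and the environment is assumed to expose nothing more than an `act(a)` method.

```python
import numpy as np


class DummyEnv:
    """Hypothetical stand-in environment; act() simply echoes the action."""

    def act(self, a):
        return a


class StickyActionWrapper:
    """Hypothetical wrapper that repeats the previous action with some probability."""

    def __init__(self, env, sticky_action_prob, seed=None):
        self.env = env
        self.sticky_action_prob = sticky_action_prob
        self.rng = np.random.default_rng(seed)
        # The previous action is stored in a state dictionary (assumed layout).
        self.state = {"last_action": None}

    def act(self, a):
        last = self.state["last_action"]
        # Only stick once a previous action exists.
        if last is not None and self.rng.random() < self.sticky_action_prob:
            a = last
        # Record the action that was actually executed.
        self.state["last_action"] = a
        return self.env.act(a)
```

With `sticky_action_prob=0.0` the wrapper passes every action through unchanged, and with `1.0` it keeps repeating the first action it received. Whether the executed action or the requested action should be stored is a design choice; this sketch stores the executed one.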