Checklist
[X] I have checked that there is no similar issue in the repo
❓ Question
I am running PPO on a custom Gymnasium environment where I define the action space as follows:
self.action_space = spaces.Box(low=-1, high=1, shape=(3,), dtype=np.float32)
action[0] and action[1] are continuous. I would like action[2] to be discrete, so I split the domain [-1, 1] into 5 equally sized bins.
Because of action clipping, my intuition is that the first and last bins are favored. Is my intuition correct? Should I use a one-hot encoding or something similar to prevent the issue?
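To make the concern concrete, here is a minimal sketch of the binning I have in mind, together with a quick check of how often each bin is hit. It assumes the raw action[2] is sampled from a Gaussian with mean 0 and standard deviation 1 and then clipped to [-1, 1]; that distribution is an assumption for illustration, not necessarily what the policy produces during training. The names `N_BINS`, `to_bin`, and `bin_frequencies` are hypothetical helpers, not part of any library.

```python
import numpy as np

N_BINS = 5  # number of equal-width bins over [-1, 1]


def to_bin(values, n_bins=N_BINS):
    """Clip values to [-1, 1] and map them to bin indices 0..n_bins-1."""
    clipped = np.clip(np.asarray(values, dtype=np.float64), -1.0, 1.0)
    # Rescale [-1, 1] to [0, n_bins] and truncate to an integer index.
    idx = ((clipped + 1.0) / 2.0 * n_bins).astype(int)
    # The upper edge (value == 1.0) would give n_bins, so fold it into the last bin.
    return np.minimum(idx, n_bins - 1)


def bin_frequencies(mean=0.0, std=1.0, n_samples=100_000, seed=0):
    """Empirical share of samples per bin for an assumed Gaussian raw action."""
    rng = np.random.default_rng(seed)
    raw = rng.normal(mean, std, size=n_samples)
    return np.bincount(to_bin(raw), minlength=N_BINS) / n_samples


freqs = bin_frequencies()
print(freqs)
```

Under these assumed settings, every sample outside [-1, 1] is clipped to the boundary and lands in the first or last bin, so those two bins collect more samples than the middle one. With a smaller standard deviation the effect shrinks, which is why I am unsure how much it matters in practice.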