kzl / decision-transformer

Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.
MIT License

MultiDiscrete Action Space #71

Closed hamzaali98 closed 4 months ago

hamzaali98 commented 7 months ago

I am working with the gym environment mobile-env, which has a MultiDiscrete action space. Is there a way to change the gym implementation, which produces an array of continuous outputs, so that it produces a MultiDiscrete output instead?

Any sort of help/suggestion would be appreciated! :)

kzl commented 4 months ago

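Hi, the simplest way to do this is probably to expand the output dimension of the action prediction (for example, to one logit per choice in each action dimension), and then reshape the output predictions into the shape you need. You would also replace the training loss function to match, for example with a cross-entropy loss per action dimension.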
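A minimal sketch of this first suggestion, assuming PyTorch and a made-up MultiDiscrete layout. `NVEC`, `HIDDEN`, `MultiDiscreteHead`, `multidiscrete_loss` and `greedy_action` are hypothetical names for illustration, not part of this repository. The head emits one logit per choice, the logits are split per action dimension, and training uses a summed cross-entropy:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical layout: 3 action dimensions with 4, 4 and 2 choices each.
NVEC = [4, 4, 2]
HIDDEN = 128  # assumed transformer hidden size


class MultiDiscreteHead(nn.Module):
    """Maps hidden states to one block of logits per action dimension."""

    def __init__(self, hidden_size, nvec):
        super().__init__()
        self.nvec = list(nvec)
        # "Expanded" output: one logit for every choice of every dimension.
        self.proj = nn.Linear(hidden_size, sum(self.nvec))

    def forward(self, hidden):
        # hidden: (batch, seq, hidden_size)
        # returns a tuple of tensors, each of shape (batch, seq, n_i)
        return torch.split(self.proj(hidden), self.nvec, dim=-1)


def multidiscrete_loss(logits_per_dim, target):
    """Sum of cross-entropy losses; target is (batch, seq, num_dims), dtype long."""
    losses = [
        F.cross_entropy(
            logits.reshape(-1, logits.shape[-1]),  # (batch * seq, n_i)
            target[..., i].reshape(-1),            # (batch * seq,)
        )
        for i, logits in enumerate(logits_per_dim)
    ]
    return torch.stack(losses).sum()


def greedy_action(logits_per_dim):
    """Pick the most likely choice in each dimension: (batch, seq, num_dims)."""
    return torch.stack([l.argmax(dim=-1) for l in logits_per_dim], dim=-1)
```

At evaluation time, the last timestep of `greedy_action(...)` would be the integer vector to pass to the environment. Sampling from the per-dimension softmax instead of taking the argmax is an equally reasonable choice.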

Alternatively, you could add multiple action tokens, one for each dimension of your action space.
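A rough sketch of what the multiple-token alternative could look like, again assuming PyTorch. `PerDimensionActionEmbedding` and `interleave_tokens` are hypothetical helpers, and the (return-to-go, state, action) ordering within each timestep is an assumption about how the sequence is laid out. Each action dimension gets its own embedding table, so a timestep contributes `2 + D` tokens instead of 3:

```python
import torch
import torch.nn as nn

# Hypothetical layout: 3 action dimensions with 4, 4 and 2 choices each.
NVEC = [4, 4, 2]
HIDDEN = 128  # assumed transformer hidden size


class PerDimensionActionEmbedding(nn.Module):
    """Embeds each action dimension as its own token."""

    def __init__(self, nvec, hidden_size):
        super().__init__()
        self.embeds = nn.ModuleList([nn.Embedding(n, hidden_size) for n in nvec])

    def forward(self, actions):
        # actions: (batch, seq, num_dims), dtype long
        # returns: (batch, seq, num_dims, hidden_size)
        return torch.stack(
            [emb(actions[..., i]) for i, emb in enumerate(self.embeds)], dim=2
        )


def interleave_tokens(rtg_emb, state_emb, action_embs):
    """Build the sequence (R_1, s_1, a_1^1..a_1^D, R_2, s_2, ...).

    rtg_emb, state_emb: (batch, seq, hidden); action_embs: (batch, seq, D, hidden).
    Returns a tensor of shape (batch, seq * (2 + D), hidden).
    """
    B, T, D, H = action_embs.shape
    per_step = torch.cat(
        [rtg_emb.unsqueeze(2), state_emb.unsqueeze(2), action_embs], dim=2
    )  # (batch, seq, 2 + D, hidden)
    return per_step.reshape(B, T * (2 + D), H)
```

With this layout, the attention mask and timestep embeddings would also need to be repeated `2 + D` times per step, and each action dimension would be predicted from the token that precedes it, so later dimensions can condition on earlier ones.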