Open returnZeroMan opened 2 years ago
Hey, thanks for the feedback, I'm glad it's helpful. You can use a Gym `Dict` space for both actions and observations. It acts just like a Python dictionary in that it lets you compose a more complex representation out of other Gym space objects. For example you could:
```python
class MultiDiscreteActionsD2DEnv(D2DEnv):
    def __init__(self, env_config=None) -> None:
        super().__init__(env_config)
        self.action_space = spaces.Dict({
            # D2D users pick a resource block and a power level as two separate choices
            'due': spaces.MultiDiscrete([self.simulator.config.num_rbs, self.num_pwr_actions['due']]),
            # Cellular users and the base station pick one flattened (RB, power) index
            'cue': spaces.Discrete(self.simulator.config.num_rbs * self.num_pwr_actions['cue']),
            'mbs': spaces.Discrete(self.simulator.config.num_rbs * self.num_pwr_actions['mbs']),
        })
```
which you can access like:
```python
space = env.action_space['due']
```
to configure your neural nets in DRL or whatever optimisation method you are using.
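Since the `'cue'` and `'mbs'` subspaces above flatten the (resource block, power level) pair into a single `Discrete` index, the agent's chosen index has to be decoded back into two values before it can be applied. A minimal pure-Python sketch, assuming the common `rb * num_pwr + pwr` encoding (check how your env actually encodes it; the sizes below are illustrative, not taken from the repo):

```python
def decode_flat_action(action: int, num_pwr: int) -> tuple[int, int]:
    """Split a flattened Discrete action back into (resource_block, power_level).

    Assumes the flat index was built as rb * num_pwr + pwr, which is one
    common convention; verify against your environment's encoding.
    """
    rb, pwr = divmod(action, num_pwr)
    return rb, pwr

# Example: num_rbs=25, num_pwr=5 gives a Discrete(25 * 5) = Discrete(125) space.
# Flat action 37 decodes to resource block 7, power level 2 (7 * 5 + 2 == 37).
rb, pwr = decode_flat_action(37, num_pwr=5)
```

The same `divmod` trick generalises to any pair of discrete choices collapsed into one `Discrete` space, which is handy when your DRL library only supports flat action spaces.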
Hopefully I'll soon get some time to put together an example repo showing how to use RLlib with GymD2D, but for now this should be enough to get you started.
After reading your paper, I haven't been able to reproduce it with the SAC algorithm. Could you please share your SAC implementation? It is only for personal learning.
Thank you for your contribution, which is very helpful for my study, but I don't know how to use a dictionary action space. Could you provide a DRL example? Thank you very much!