pranz24 / pytorch-soft-actor-critic

PyTorch implementation of soft actor critic
MIT License
822 stars 182 forks

Why do you need to use NormalizedActions()? #6

Closed JingJerry closed 5 years ago

JingJerry commented 5 years ago

Excuse me, I don't understand why you need to use NormalizedActions(). Can you explain it? Thank you!

Environment

env = NormalizedActions(gym.make(args.env_name))

class NormalizedActions(gym.ActionWrapper):

    def action(self, action):
        action = (action + 1) / 2  # [-1, 1] => [0, 1]
        action *= (self.action_space.high - self.action_space.low)
        action += self.action_space.low
        return action

    def _reverse_action(self, action):
        action -= self.action_space.low
        action /= (self.action_space.high - self.action_space.low)
        action = action * 2 - 1
        return action

pranz24 commented 5 years ago

It rescales actions from the [-1, 1] range to the range (action_space.low, action_space.high).

JingJerry commented 5 years ago

Why do you need to normalize actions?

pranz24 commented 5 years ago

For example, the Humanoid-v2 env has action_space.high, action_space.low = 0.4, -0.4. The neural network outputs actions in the range -1 to 1, i.e. tanh(actions). If the network outputs [1, 0.8, -0.7, -1], then NormalizedActions rescales it to the range -0.4 to 0.4, i.e. [0.4, 0.32, -0.28, -0.4]. So we use NormalizedActions() to scale the actions between action_space.low and action_space.high.
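
A minimal sketch of that arithmetic, using plain NumPy rather than the gym wrapper so it runs on its own. The helper names `rescale_action` and `reverse_action` are hypothetical and are not part of this repository; the bounds are the Humanoid-v2 values quoted above.

```python
import numpy as np


def rescale_action(action, low, high):
    """Map an action from [-1, 1] to [low, high], as NormalizedActions.action does."""
    action = np.asarray(action, dtype=np.float64)
    unit = (action + 1.0) / 2.0        # [-1, 1] -> [0, 1]
    return low + unit * (high - low)   # [0, 1]  -> [low, high]


def reverse_action(env_action, low, high):
    """Inverse mapping from [low, high] back to [-1, 1]."""
    env_action = np.asarray(env_action, dtype=np.float64)
    unit = (env_action - low) / (high - low)   # [low, high] -> [0, 1]
    return unit * 2.0 - 1.0                    # [0, 1]      -> [-1, 1]


# Bounds quoted in the thread for Humanoid-v2 (assumed here, not read from gym).
low, high = -0.4, 0.4
policy_out = np.array([1.0, 0.8, -0.7, -1.0])   # tanh-squashed network output

env_action = rescale_action(policy_out, low, high)
print(env_action)                               # approximately [0.4, 0.32, -0.28, -0.4]
print(reverse_action(env_action, low, high))    # recovers policy_out up to rounding
```

Both helpers return new arrays instead of modifying their input, so the policy output can be reused afterwards.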

JingJerry commented 5 years ago

I see. Thanks!