philtabor / Youtube-Code-Repository

Repository for most of the code from my YouTube channel

noise size equal to number of actions #33

Open SarodYatawatta opened 3 years ago

SarodYatawatta commented 3 years ago

https://github.com/philtabor/Youtube-Code-Repository/blob/733e4526f9920e5b710e29077fb85a457eec1ea9/ReinforcementLearning/PolicyGradient/TD3/td3_torch.py#L163

Instead of scalar noise, this should be a vector whose size equals the number of actions:
mu_prime = mu + T.tensor(np.random.normal(scale=self.noise, size=(self.n_actions,)),

philtabor commented 3 years ago

We're allowed to add a scalar quantity to a vector. Is there a reason why each component of the mu tensor should have a different random number added to it?

SarodYatawatta commented 3 years ago

True, but perturbing each component of the mu tensor with a different random number increases exploration, compared with adding the same random number to every component.
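
To make the difference concrete, here is a minimal sketch of the two noise schemes discussed above. It is not the repository's code: it uses plain Python lists and the standard-library `random.gauss` in place of `T.tensor` and `np.random.normal` so it runs without dependencies, and the names `perturb_shared`, `perturb_independent`, `noise_std`, and the example `mu` values are hypothetical.

```python
import random


def perturb_shared(mu, noise_std, rng):
    """Scalar noise: draw one Gaussian sample and add it to every component."""
    eps = rng.gauss(0.0, noise_std)
    return [m + eps for m in mu]


def perturb_independent(mu, noise_std, rng):
    """Vector noise: draw a separate Gaussian sample for each component."""
    return [m + rng.gauss(0.0, noise_std) for m in mu]


rng = random.Random(0)   # fixed seed so the run is repeatable
mu = [0.1, -0.3, 0.5]    # hypothetical actor output for n_actions = 3

shared = perturb_shared(mu, 0.1, rng)
independent = perturb_independent(mu, 0.1, rng)

# Offsets actually applied to each action component.
shared_offsets = [s - m for s, m in zip(shared, mu)]
independent_offsets = [p - m for p, m in zip(independent, mu)]

print("shared offsets:     ", shared_offsets)
print("independent offsets:", independent_offsets)
```

With shared noise, every component moves by the same amount, so the perturbed action only ever shifts along the direction (1, 1, ..., 1). With independent noise, each component moves separately, so the perturbation can point in any direction of the action space.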