hardmaru / estool

Evolution Strategies Tool

In the PEPG module, change_mu = np.dot(rT, epsilon): the left operand has dimension pop_size and the right has dimension batch_size (half of pop_size) #24

Closed: geotyper closed this issue 5 years ago

geotyper commented 5 years ago
      rT = (reward[:self.batch_size] - reward[self.batch_size:])
      change_mu = np.dot(rT, epsilon)
      self.optimizer.stepsize = self.learning_rate
      update_ratio = self.optimizer.update(-change_mu) # adam, rmsprop, momentum, etc.
      #self.mu += (change_mu * self.learning_rate) # normal SGD method

So change_mu will be only half as long as pop_size requires.

geotyper commented 5 years ago

I found my error
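
The thread does not say what the error was. As a rough sanity check of the shapes in the snippet from the first comment, here is a minimal sketch. It assumes that epsilon is a 2-D array of shape (batch_size, num_params), one perturbation row per mirrored pair; that shape, the sizes, and the random data are illustrative assumptions, not taken from the issue.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
pop_size = 8
batch_size = pop_size // 2
num_params = 3

rng = np.random.default_rng(0)
reward = rng.standard_normal(pop_size)

# Assumption: one perturbation row per mirrored pair of solutions.
epsilon = rng.standard_normal((batch_size, num_params))

# Difference between the first and second half of the rewards.
rT = reward[:batch_size] - reward[batch_size:]

# (batch_size,) dot (batch_size, num_params) -> (num_params,)
change_mu = np.dot(rT, epsilon)

print(rT.shape, epsilon.shape, change_mu.shape)
```

Under that assumption both operands share a leading dimension of batch_size, and change_mu has one entry per parameter rather than one per population member.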