Add tests for pathmind simulations

slinlee commented 2 years ago

Add a test and an example for a simulation with two reward terms
Add tests for alphas

slinlee commented 2 years ago

@brettskymind @maxpumperla I'm getting this error during training. I'm looking at a different PR, but I'm wondering if you know what to fix off the bat.

    obs, r, done, info = self.envs[i].step(actions[i])
  File "/home/runner/work/nativerl/nativerl/nativerl/python/pathmind_training/environments.py", line 303, in step
    reward = np.sum(reward_array * self.alphas * self.betas)
TypeError: unsupported operand type(s) for *: 'dict_values' and 'float'

slinlee commented 2 years ago

nvm, I'm going to revert my switch from np.asarray() to np.fromstring()
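
For anyone hitting the same TypeError: the message suggests that `reward_array` was still a raw `dict_values` view when it reached the multiplication, and a dict view does not support arithmetic. The thread does not show the code around line 303, so the following is only a minimal sketch under that assumption. The dict contents, the weights, and the `weighted_reward` helper are hypothetical, and `np.fromiter` is just one way to materialize the view; it is not necessarily what the revert restores.

```python
import numpy as np

# Hypothetical reward terms and weights; names and values are illustrative only.
reward_terms = {"term_a": 1.0, "term_b": 2.0}
alphas = np.array([0.5, 0.25])
betas = np.array([1.0, 1.0])


def weighted_reward(terms, alphas, betas):
    """Sum the reward terms after scaling each one by its alpha and beta."""
    # Materialize the dict view as a float array before doing arithmetic on it.
    reward_array = np.fromiter(terms.values(), dtype=float)
    return float(np.sum(reward_array * alphas * betas))


# Multiplying the raw view by a float fails, matching the error type in the log.
try:
    reward_terms.values() * 0.5
    raised = False
except TypeError:
    raised = True

total = weighted_reward(reward_terms, alphas, betas)
```

The order of the array follows the dict's insertion order, so the weights have to be listed in the same order as the reward terms.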

PathmindAI / nativerl

Add tests for pathmind simulations #486