jsphon / reinforcement_learning

Python Package For Reinforcement Learning
Update generate_experience #7

Closed jsphon closed 7 years ago

jsphon commented 7 years ago

Currently, generate_experience selects the next action randomly, but in Q-learning it should select actions using the Q-value function.

However, we probably don't want to call the Q-value function too often: it is a neural network call, so we can't optimise it with numba.

We need to be able to specify which policy generate_experience uses.
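
A minimal sketch of what a pluggable policy could look like, not the package's actual API. The signature of `generate_experience` here, and the names `step`, `q_values`, `random_policy` and `make_epsilon_greedy_policy`, are assumptions made for illustration. The policy is just a callable that maps a state and the available actions to an action, so random selection stays the default and a Q-value-based policy can be passed in when needed.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical types for illustration; the real package may represent these differently.
State = int
Action = int
Policy = Callable[[State, Sequence[Action]], Action]
Transition = Tuple[State, Action, float, State]


def random_policy(state: State, actions: Sequence[Action]) -> Action:
    """Pick an action uniformly at random (the current behaviour described above)."""
    return random.choice(list(actions))


def make_epsilon_greedy_policy(
    q_values: Callable[[State], Sequence[float]],
    epsilon: float,
) -> Policy:
    """Build a policy that explores with probability epsilon, else acts greedily.

    q_values stands in for the (expensive) neural network call. It is only
    invoked on the greedy branch, so a larger epsilon means fewer calls.
    """

    def policy(state: State, actions: Sequence[Action]) -> Action:
        if random.random() < epsilon:
            return random.choice(list(actions))
        qs = q_values(state)  # assumed to be indexable by action
        return max(actions, key=lambda a: qs[a])

    return policy


def generate_experience(
    step: Callable[[State, Action], Tuple[State, float]],
    start_state: State,
    actions: Sequence[Action],
    policy: Policy = random_policy,
    num_steps: int = 10,
) -> List[Transition]:
    """Roll out num_steps transitions, choosing each action with the given policy."""
    experience: List[Transition] = []
    state = start_state
    for _ in range(num_steps):
        action = policy(state, actions)
        next_state, reward = step(state, action)
        experience.append((state, action, reward, next_state))
        state = next_state
    return experience
```

With this shape, the numba-friendly random rollout and the Q-value-driven rollout share one code path, and the number of network calls is controlled by the policy rather than by generate_experience itself.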