implement generalized policy iteration in `ValueFunction`

sritchie / scala-rl

Functional Reinforcement Learning in Scala.

https://www.scalarl.com

Apache License 2.0

implement generalized policy iteration in `ValueFunction` #40

Open sritchie opened 5 years ago

sritchie commented 5 years ago

Currently, `ValueFunction` implements value iteration, but we don't have a way to decide what to do within a sweep, at the end of each sweep, and across sweeps.

One idea would be to hard-code specific optimizations. Another would be to write a set of functions that specify what happens at each level (within a sweep, at the end of a sweep, and across sweeps).
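To make the second idea concrete, here is a rough sketch of sweep control passed in as functions. Every name in it (`GpiSketch`, `sweep`, `iterate`, `shouldStop`, `afterUpdate`, `chainBackup`) is hypothetical and not part of the current scala-rl API, and the three-state chain is a toy problem made up for illustration. The backup used is a plain value-iteration backup; a full generalized policy iteration would also swap in a policy-evaluation backup and an improvement step at the same hook points.

```scala
object GpiSketch {
  // Hypothetical stand-in for a value function: state -> estimated value.
  type Values[S] = Map[S, Double]

  // One sweep: apply `backup` to every state in order. Updates are "in place",
  // so later states see the new values of earlier ones.
  // `afterUpdate(state, oldValue, newValue)` is the within-sweep hook.
  def sweep[S](
      states: Seq[S],
      values: Values[S],
      backup: (S, Values[S]) => Double,
      afterUpdate: (S, Double, Double) => Unit
  ): Values[S] =
    states.foldLeft(values) { (vs, s) =>
      val updated = backup(s, vs)
      afterUpdate(s, vs.getOrElse(s, 0.0), updated)
      vs.updated(s, updated)
    }

  // Repeat sweeps until `shouldStop(previous, current, sweepCount)` is true.
  // This is the end-of-sweep / across-sweeps decision point.
  @annotation.tailrec
  def iterate[S](
      states: Seq[S],
      values: Values[S],
      backup: (S, Values[S]) => Double,
      shouldStop: (Values[S], Values[S], Int) => Boolean,
      afterUpdate: (S, Double, Double) => Unit,
      sweeps: Int = 0
  ): (Values[S], Int) = {
    val next = sweep(states, values, backup, afterUpdate)
    if (shouldStop(values, next, sweeps + 1)) (next, sweeps + 1)
    else iterate(states, next, backup, shouldStop, afterUpdate, sweeps + 1)
  }

  // Largest absolute change between two value maps.
  def maxDelta[S](a: Values[S], b: Values[S]): Double =
    b.foldLeft(0.0) { case (acc, (s, v)) =>
      math.max(acc, math.abs(v - a.getOrElse(s, 0.0)))
    }

  // Toy problem (an assumption for this sketch): states 0, 1, 2 on a line,
  // state 3 is terminal with value 0, every move costs 1, no discounting.
  val states: Seq[Int] = Seq(0, 1, 2)
  val zero: Values[Int] = states.map(_ -> 0.0).toMap

  // Value-iteration backup: best of moving left (clamped at 0) or right.
  def chainBackup(s: Int, vs: Values[Int]): Double =
    Seq(math.max(s - 1, 0), s + 1).map(n => -1.0 + vs.getOrElse(n, 0.0)).max

  // Stop once a full sweep changes nothing (beyond a small tolerance).
  def runToConvergence(): (Values[Int], Int) =
    iterate[Int](states, zero, chainBackup,
      (prev, cur, _) => maxDelta(prev, cur) < 1e-9,
      (_, _, _) => ())

  // Stop after a fixed number of sweeps, regardless of convergence.
  def runSweeps(n: Int): (Values[Int], Int) =
    iterate[Int](states, zero, chainBackup,
      (_, _, count) => count >= n,
      (_, _, _) => ())
}
```

With this shape, "run to convergence" and "stop after k sweeps" differ only in the `shouldStop` function supplied, and `afterUpdate` gives a place to log or react within a sweep. Whether the hooks should live on `ValueFunction` itself or in a separate driver is left open here.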