sentenai / reinforce

Reinforcement learning in haskell
https://sentenai.github.io/reinforce/
BSD 3-Clause "New" or "Revised" License

add a testable "convergence" criteria #11

Open stites opened 6 years ago

stites commented 6 years ago

This would allow us to start doing some convergence testing for #10

msaroufim commented 5 years ago

Could you elaborate on this? You mean checking if the diff in training error between two successive runs is less than some epsilon? Or do you have something else in mind?
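For concreteness, the epsilon check described in this question could look roughly like the sketch below. It is only an illustration: `convergedBetween` and `firstConverged` are hypothetical names, not part of reinforce, and it assumes the per-run training error is available as a plain list of `Double`s.

```haskell
-- Hypothetical helpers; nothing here is part of the reinforce API.

-- | True when the training error moved by less than eps between two
-- successive runs.
convergedBetween :: Double -> Double -> Double -> Bool
convergedBetween eps prev cur = abs (cur - prev) < eps

-- | Index (0-based, into the input list) of the first run whose error
-- differs from the previous run's error by less than eps, if any.
firstConverged :: Double -> [Double] -> Maybe Int
firstConverged eps errs =
  case hits of
    (i : _) -> Just i
    []      -> Nothing
  where
    -- Pair each run (starting from the second) with its predecessor.
    pairs = zip [1 ..] (zip errs (drop 1 errs))
    hits  = [ i | (i, (prev, cur)) <- pairs, convergedBetween eps prev cur ]
```

A single small difference can happen by chance on a noisy learning curve, so a check like this is probably best treated as a first approximation.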

stites commented 5 years ago

Yeah, that was the rough idea -- maybe looking at the last N runs (or doing some sort of significance testing, as in "Deep Reinforcement Learning that Matters").
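One way to read the "last N runs" idea is as a window check: call the agent converged once the most recent N scores all lie within some tolerance of each other. The sketch below assumes that reading. `convergedOverLast` is a hypothetical name, the scores are assumed to be a plain list of `Double`s in chronological order, and the significance-testing variant is not covered.

```haskell
-- Hypothetical helper; nothing here is part of the reinforce API.

-- | True when the last n recorded scores all lie within eps of each
-- other, i.e. (maximum - minimum) over the window is below eps.
-- Returns False for n < 2 or when fewer than n scores are recorded.
convergedOverLast :: Int -> Double -> [Double] -> Bool
convergedOverLast n eps scores
  | n < 2 || length scores < n = False
  | otherwise                  = maximum window - minimum window < eps
  where
    -- Keep only the n most recent scores.
    window = drop (length scores - n) scores
```

Because this is a pure predicate over recorded scores, it can be asserted directly in a test suite, for example `convergedOverLast 3 2 [10, 50, 199, 200, 200.01]` on a recorded learning curve.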