rcognita is a flexibly configurable framework for agent-environment simulation with a menu of predictive and safe reinforcement learning controllers
Implement Monte-Carlo method and pipeline #59
Open
osinenkop opened 2 years ago
Need:
Visualizer: same layout as usual (as for 3wrobot), but the upper-left panel should show the pendulum and its trajectory (drawn as a dotted line, as for 3wrobot)
Monte-Carlo scenario:
The policy must be a PDF (probability distribution function). For useful policy parametrizations, see S&B (Sutton & Barto), p. 322. The REINFORCE algorithm can also be found there.
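To make the Monte-Carlo requirement concrete, here is a minimal sketch of a REINFORCE-style update for a stochastic policy. It is not rcognita code. It assumes a Gaussian policy whose mean is linear in a feature vector and whose standard deviation `sigma` is fixed; the function names, the `(features, action, reward)` episode format, and all parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def sample_action(theta, features, sigma, rng):
    """Draw an action from the assumed Gaussian policy N(theta . features, sigma^2)."""
    mean = float(theta @ features)
    return rng.normal(mean, sigma)


def grad_log_policy(theta, features, action, sigma):
    """Gradient of log N(action; theta . features, sigma^2) with respect to theta."""
    mean = float(theta @ features)
    return (action - mean) / sigma**2 * features


def discounted_returns(rewards, gamma):
    """Return G_t = sum_{k >= t} gamma^(k - t) * r_k for every step t of one episode."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_update(theta, episode, sigma, gamma, learning_rate):
    """Apply one Monte-Carlo policy-gradient pass over a finished episode.

    `episode` is a list of (features, action, reward) tuples (hypothetical format).
    The parameters are updated step by step, so later gradients use the
    already-updated parameters.
    """
    rewards = [reward for _, _, reward in episode]
    returns = discounted_returns(rewards, gamma)
    new_theta = np.array(theta, dtype=float)
    for t, (features, action, _) in enumerate(episode):
        grad = grad_log_policy(new_theta, features, action, sigma)
        new_theta += learning_rate * gamma**t * returns[t] * grad
    return new_theta
```

A full pipeline would roll out an episode with `sample_action`, record the tuples, and call `reinforce_update` once the episode ends. How features are built from the pendulum state and how rewards are defined are left open here.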