hpi-sam / rl-4-self-repair

Reinforcement Learning Models for Online Learning of Self-Repair and Self-Optimization
MIT License

Rewards x Episodes chart #8

Open christianadriano opened 4 years ago

christianadriano commented 4 years ago

@2start @MrBanhBao Could you please think about the Rewards x Episodes chart? What should we expect to see in terms of reward as the algorithm converges toward the optimal policy? Does the chart show the cumulative reward for each episode?
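To make the question concrete, here is a minimal sketch of one common convention for such a chart: one point per episode, where the y-value is the undiscounted sum of that episode's step rewards, optionally smoothed with a trailing moving average. Whether the chart in this repo is built this way is exactly what is being asked. The names `episode_returns`, `moving_average`, and the `logged` reward values are hypothetical and not taken from the codebase.

```python
from typing import List, Sequence


def episode_returns(rewards_per_episode: Sequence[Sequence[float]]) -> List[float]:
    """Sum the per-step rewards of each episode (undiscounted return)."""
    return [float(sum(step_rewards)) for step_rewards in rewards_per_episode]


def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Trailing moving average; early entries average over fewer values."""
    if window < 1:
        raise ValueError("window must be >= 1")
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just left the trailing window.
            running -= values[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed


# Hypothetical reward logs: one inner list of step rewards per episode.
logged = [
    [-1.0, -1.0, 0.0],
    [-1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
]

returns = episode_returns(logged)              # y-values, one per episode
smoothed = moving_average(returns, window=2)   # less noisy trend line
```

Plotting `returns` (and `smoothed`) against the episode index would give the Rewards x Episodes chart under this assumption. If the agent is learning, the smoothed curve would typically rise and then flatten, though exploration can keep the raw per-episode values noisy.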