datawhalechina / easy-rl

Reinforcement learning tutorial in Chinese (the "Mushroom Book" 🍄). Read online at: https://datawhalechina.github.io/easy-rl/

Spelling mistake #122

Open d3ac opened 1 year ago

d3ac commented 1 year ago

There is a spelling mistake in the notebook "MonteCarlo.ipynb": the class name "FisrtVisitMC" should be "FirstVisitMC".

d3ac commented 1 year ago

Thanks for your code. I have a small suggestion: move "agent.update(one_ep_transition)" out of the loop. In my tests this made training at least 60 times faster. I don't think the agent needs to be updated inside the loop, because each call reprocesses every transition collected so far, which makes the cost of one episode quadratic in its length ($O(n^2)$). With the call moved out of the loop, I got a noticeably better convergence value as well as the speedup. Is this change feasible? I would appreciate it if you could look into it.
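To illustrate the cost difference, here is a minimal sketch and not the notebook's actual code. The `FirstVisitMC` class below is a hypothetical stand-in that assumes `update` walks the whole transition list it is given; the `transitions_processed` counter, the `run_episode` helper, and the dummy 100-step episode are all invented for this example.

```python
from collections import defaultdict


class FirstVisitMC:
    """Hypothetical stand-in agent; only the cost of update() matters here."""

    def __init__(self, gamma=0.9):
        self.gamma = gamma
        self.returns_sum = defaultdict(float)
        self.returns_count = defaultdict(int)
        self.Q = defaultdict(float)
        self.transitions_processed = 0  # work counter, for comparison only

    def update(self, one_ep_transition):
        # Record the first time step at which each (state, action) pair occurs.
        first_visit = {}
        for t, (state, action, _) in enumerate(one_ep_transition):
            first_visit.setdefault((state, action), t)

        # Walk the episode backwards, accumulating the discounted return G.
        G = 0.0
        for t in reversed(range(len(one_ep_transition))):
            state, action, reward = one_ep_transition[t]
            G = self.gamma * G + reward
            self.transitions_processed += 1
            if first_visit[(state, action)] == t:
                key = (state, action)
                self.returns_sum[key] += G
                self.returns_count[key] += 1
                self.Q[key] = self.returns_sum[key] / self.returns_count[key]


def run_episode(agent, episode, update_in_loop):
    """Replay a fixed episode, updating either every step or once at the end."""
    one_ep_transition = []
    for transition in episode:
        one_ep_transition.append(transition)
        if update_in_loop:
            agent.update(one_ep_transition)  # reprocesses the whole prefix
    if not update_in_loop:
        agent.update(one_ep_transition)  # single pass over the full episode


# Dummy episode of n = 100 steps: (state, action, reward).
episode = [(t % 5, 0, -1.0) for t in range(100)]

inside = FirstVisitMC()
run_episode(inside, episode, update_in_loop=True)

outside = FirstVisitMC()
run_episode(outside, episode, update_in_loop=False)

print(inside.transitions_processed)   # 1 + 2 + ... + 100 = 5050
print(outside.transitions_processed)  # 100
```

In this sketch, updating inside the loop also averages returns computed from truncated episodes into `Q`, which might explain a difference in the converged values, though whether the same holds for the notebook depends on how its `update` is written.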