RL course 2 Week 1 Blog

sezan92 / sezan92.github.io

RL course 2 Week 1 Blog #41

Open sezan92 opened 3 months ago

sezan92 commented 3 months ago

Objective

This issue tracks the write-up for the RL course 2, week 1 blog.

sezan92 commented 3 months ago

Update 2024/07/30

[ ] started from Week 2. Definition of Monte Carlo method done.

TODO

[ ] https://www.coursera.org/learn/sample-based-learning-methods/lecture/7F85i/using-monte-carlo-for-prediction

sezan92 commented 3 months ago

Update 2024/08/09

[x] #42 started

TODO

[ ] complete it.

sezan92 commented 2 months ago

Update 2024/08/22

[x] video 2 done. could not follow it well; check the book

TODO

[x] check the book on Monte Carlo blackjack

sezan92 commented 2 months ago

Update 2024/08/28

[x] wrote intuition

TODO

[x] Start video 3

sezan92 commented 2 months ago

Update 2024/09/02

[x] started videos 3 and 4, which cover the Monte Carlo action-value update and generalized policy iteration

TODO

[x] update the intuition on penalty (slightly wrong)
[ ] write up intuition

sezan92 commented 1 month ago

Update 2024/09/11

[x] updated the penalty example.
[x] started the video on exploring starts, named "solving blackjack example".

TODO

[ ] complete it and write the intuition on exploring starts and epsilon-soft policies in one go

sezan92 commented 1 month ago

Update 2024/09/24

[x] watched video https://www.coursera.org/learn/sample-based-learning-methods/lecture/zlI9l/epsilon-soft-policies

TODO

[x] write the intuition on exploring starts vs. epsilon-soft policies

sezan92 commented 1 month ago

Update 2024/10/01

[x] Started the intuition with a goalkeeping example

TODO

[x] complete it

sezan92 commented 2 weeks ago

Update 2024/10/18

[x] GK intuition for exploration done.

TODO

[ ] complete and start next video (after exploration)

sezan92 commented 2 weeks ago

Update 2024/10/22

[ ] need to explain exploring starts and epsilon-soft policies

TODO

[ ] complete the intuition

sezan92 commented 1 week ago

Update 2024/10/31

[ ] started exploring starts. need a better intuition

TODO

[ ] explain