mbhushan / ml

Udacity: Deep Learning: Reinforcement Learning Framework #22

Open mbhushan opened 6 years ago

mbhushan commented 6 years ago

Value iteration:

screen shot 2018-05-05 at 5 22 29 pm
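
Since the screenshot content is not reproduced in text, here is a minimal sketch of value iteration on a toy problem. The transition layout `P[s][a] = [(prob, next_state, reward, done), ...]`, the 3-state chain, and the values of `gamma` and `theta` are all assumptions made for illustration; they are not taken from the course material.

```python
import numpy as np

def value_iteration(P, n_states, n_actions, gamma=0.9, theta=1e-8):
    """Sweep all states, backing each one up with the best one-step lookahead,
    until the largest change in a sweep falls below theta."""
    V = np.zeros(n_states)
    while True:
        delta = 0.0
        for s in range(n_states):
            # Expected return of each action from s under the current estimate V.
            q = [
                sum(p * (r + gamma * V[s2] * (not done)) for p, s2, r, done in P[s][a])
                for a in range(n_actions)
            ]
            best = max(q)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            return V

# Hypothetical chain: action 0 stays, action 1 moves right; state 2 is terminal.
P = {
    0: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 0.0, False)]},
    1: {0: [(1.0, 1, 0.0, False)], 1: [(1.0, 2, 1.0, True)]},
    2: {0: [(1.0, 2, 0.0, True)], 1: [(1.0, 2, 0.0, True)]},
}
V = value_iteration(P, n_states=3, n_actions=2)
```

In this toy chain the only reward is the +1 for stepping from state 1 into the terminal state, so the value of state 0 is that reward discounted once.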

Policy Iteration:

screen shot 2018-05-05 at 5 04 19 pm screen shot 2018-05-05 at 5 05 25 pm screen shot 2018-05-05 at 5 12 59 pm screen shot 2018-05-05 at 5 20 39 pm screen shot 2018-05-05 at 5 20 07 pm screen shot 2018-05-05 at 5 19 22 pm
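
As a rough companion to the screenshots, policy iteration can be sketched as alternating policy evaluation and greedy policy improvement until the policy stops changing. The helper names (`q_from_v`, `policy_evaluation`, `policy_iteration`), the transition layout `P[s][a] = [(prob, next_state, reward, done), ...]`, and the toy chain are hypothetical and chosen only for this example.

```python
import numpy as np

def q_from_v(P, V, s, n_actions, gamma):
    """One-step lookahead: action values for state s given state values V."""
    return np.array([
        sum(p * (r + gamma * V[s2] * (not done)) for p, s2, r, done in P[s][a])
        for a in range(n_actions)
    ])

def policy_evaluation(P, policy, n_states, n_actions, gamma, theta=1e-8):
    """Iteratively estimate the value of a fixed deterministic policy."""
    V = np.zeros(n_states)
    while True:
        delta = 0.0
        for s in range(n_states):
            v = q_from_v(P, V, s, n_actions, gamma)[policy[s]]
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < theta:
            return V

def policy_iteration(P, n_states, n_actions, gamma=0.9):
    policy = np.zeros(n_states, dtype=int)  # start with action 0 everywhere
    while True:
        V = policy_evaluation(P, policy, n_states, n_actions, gamma)
        # Improvement step: act greedily with respect to the evaluated values.
        new_policy = np.array([
            int(np.argmax(q_from_v(P, V, s, n_actions, gamma)))
            for s in range(n_states)
        ])
        if np.array_equal(new_policy, policy):
            return policy, V
        policy = new_policy

# Hypothetical chain: action 0 stays, action 1 moves right; state 2 is terminal.
P = {
    0: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 0.0, False)]},
    1: {0: [(1.0, 1, 0.0, False)], 1: [(1.0, 2, 1.0, True)]},
    2: {0: [(1.0, 2, 0.0, True)], 1: [(1.0, 2, 0.0, True)]},
}
policy, V = policy_iteration(P, n_states=3, n_actions=2)
```

On this chain the loop settles on "move right" in states 0 and 1; the action in the terminal state is arbitrary because both actions have value 0 there.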

Policy improvement:

screen shot 2018-05-05 at 10 50 26 am screen shot 2018-05-05 at 10 52 27 am screen shot 2018-05-05 at 10 54 17 am
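
For reference, the improvement step is usually written as follows. The notation follows the common textbook convention and is an assumption here, since the screenshots are not available as text: $p(s', r \mid s, a)$ is the one-step dynamics and $\gamma$ the discount factor.

$$
q_\pi(s, a) = \sum_{s', r} p(s', r \mid s, a)\,\bigl[r + \gamma\, v_\pi(s')\bigr],
\qquad
\pi'(s) = \arg\max_a q_\pi(s, a)
$$

Acting greedily with respect to $q_\pi$ gives a policy that is at least as good: $v_{\pi'}(s) \ge v_\pi(s)$ for every state $s$.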

Reinforcement Learning: Dynamic Programming

screen shot 2018-05-01 at 8 19 55 am screen shot 2018-05-01 at 8 18 30 am screen shot 2018-05-01 at 8 18 24 am screen shot 2018-04-30 at 10 14 02 am screen shot 2018-04-30 at 10 13 17 am screen shot 2018-04-30 at 9 52 16 am screen shot 2018-04-30 at 9 51 40 am screen shot 2018-04-30 at 9 49 24 am screen shot 2018-04-30 at 9 49 15 am

Bellman expectation equation:

screen shot 2018-04-29 at 11 07 37 am screen shot 2018-04-29 at 11 13 31 am
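
In the usual textbook notation (assumed here, since the screenshots are not available as text), the Bellman expectation equation for the state-value function of a policy $\pi$ reads:

$$
v_\pi(s) = \mathbb{E}_\pi\bigl[R_{t+1} + \gamma\, v_\pi(S_{t+1}) \mid S_t = s\bigr]
= \sum_a \pi(a \mid s) \sum_{s', r} p(s', r \mid s, a)\,\bigl[r + \gamma\, v_\pi(s')\bigr]
$$

It expresses the value of a state as the expected immediate reward plus the discounted value of the successor state, averaged over the policy's action choices and the environment's dynamics.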

State Value Functions:

screen shot 2018-04-29 at 10 59 49 am

Gridworld Example:

screen shot 2018-04-29 at 10 06 02 am

Stochastic Policy:

screen shot 2018-04-29 at 9 57 23 am screen shot 2018-04-29 at 9 58 07 am screen shot 2018-04-29 at 9 59 00 am

Deterministic Policy:

screen shot 2018-04-29 at 9 56 13 am
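
To make the contrast between the two policy types concrete, here is a small sketch. The state and action names, the probabilities, and the helper `sample_action` are made up for illustration. A deterministic policy maps each state to one action, while a stochastic policy maps each state to a probability distribution over actions.

```python
import random

# Deterministic policy: one action per state (hypothetical states and actions).
deterministic_policy = {"low": "recharge", "high": "search"}

# Stochastic policy: pi(a | s), a probability for every action in every state.
stochastic_policy = {
    "low": {"recharge": 0.7, "wait": 0.3},
    "high": {"search": 0.9, "wait": 0.1},
}

def sample_action(policy, state, rng):
    """Draw an action from pi(. | state)."""
    actions = list(policy[state])
    weights = [policy[state][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]

rng = random.Random(0)
action_det = deterministic_policy["low"]
action_sto = sample_action(stochastic_policy, "low", rng)
```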

Markov Decision Process:

screen shot 2018-04-28 at 5 46 33 am

Markov Decision Process:

screen shot 2018-04-28 at 5 43 20 am screen shot 2018-04-28 at 5 04 27 am

Actions and States:

screen shot 2018-04-28 at 4 58 44 am screen shot 2018-04-28 at 4 58 32 am

Goals and Rewards for humanoids:

screen shot 2018-04-27 at 10 05 44 am screen shot 2018-04-27 at 10 03 49 am screen shot 2018-04-27 at 10 00 40 am

What are the states:

screen shot 2018-04-27 at 9 57 24 am

Reward Hypothesis:

screen shot 2018-04-27 at 9 54 19 am

Continuing Task:

screen shot 2018-04-27 at 8 15 07 am

Episodic Task:

screen shot 2018-04-27 at 8 13 58 am screen shot 2018-04-27 at 8 12 53 am

Goal of the agent: maximize expected cumulative reward:

screen shot 2018-04-27 at 8 10 39 am
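
As a small illustration of "cumulative reward", the return from a time step can be computed by summing the rewards that follow it. The discount factor `gamma` is an assumption added here; with `gamma=1.0` the function reduces to the plain sum of rewards. The function name and the sample rewards are hypothetical.

```python
def discounted_return(rewards, gamma=1.0):
    """Return G = r_1 + gamma * r_2 + gamma**2 * r_3 + ... for a finite reward list."""
    g = 0.0
    # Work backwards so each step applies G_t = r_{t+1} + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g

undiscounted = discounted_return([1.0, 1.0, 1.0])           # plain sum
discounted = discounted_return([1.0, 1.0, 1.0], gamma=0.5)  # 1 + 0.5 + 0.25
```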

Reward, State, Action:

screen shot 2018-04-27 at 8 09 38 am screen shot 2018-04-27 at 8 08 13 am screen shot 2018-04-27 at 8 06 25 am