Open mbhushan opened 6 years ago
Value iteration:
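A minimal sketch of value iteration, assuming a hypothetical three-state chain MDP that is not taken from the original notes. The transition table `P`, the discount `GAMMA`, and the helper names are all illustrative assumptions. Each sweep applies the Bellman optimality backup until the values stop changing, then a greedy policy is read off the result.

```python
# Hypothetical chain MDP (an assumption for illustration): states 0, 1, 2; state 2 is terminal.
# P[s][a] = [(probability, next_state, reward, done), ...]; action 0 = left, 1 = right.
GAMMA = 0.9
P = {
    0: {0: [(1.0, 0, -1.0, False)], 1: [(1.0, 1, -1.0, False)]},
    1: {0: [(1.0, 0, -1.0, False)], 1: [(1.0, 2, -1.0, True)]},
    2: {0: [(1.0, 2, 0.0, True)], 1: [(1.0, 2, 0.0, True)]},
}


def q_value(V, s, a):
    """Expected one-step return of taking action a in state s, then following V."""
    return sum(p * (r + (0.0 if done else GAMMA * V[s2])) for p, s2, r, done in P[s][a])


def value_iteration(theta=1e-10):
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(q_value(V, s, a) for a in P[s])  # Bellman optimality backup
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break
    # Greedy policy with respect to the converged values.
    policy = {s: max(P[s], key=lambda a: q_value(V, s, a)) for s in P}
    return V, policy


V, policy = value_iteration()
print(V, policy)
```

On this toy chain the greedy policy moves right in both non-terminal states.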
Policy Iteration:
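A rough sketch of policy iteration, which alternates policy evaluation and greedy improvement until the policy stops changing. The three-state chain MDP, the discount `GAMMA`, and the function names are hypothetical and chosen only for illustration.

```python
# Hypothetical chain MDP (an assumption for illustration): states 0, 1, 2; state 2 is terminal.
# P[s][a] = [(probability, next_state, reward, done), ...]; action 0 = left, 1 = right.
GAMMA = 0.9
P = {
    0: {0: [(1.0, 0, -1.0, False)], 1: [(1.0, 1, -1.0, False)]},
    1: {0: [(1.0, 0, -1.0, False)], 1: [(1.0, 2, -1.0, True)]},
    2: {0: [(1.0, 2, 0.0, True)], 1: [(1.0, 2, 0.0, True)]},
}


def q_value(V, s, a):
    return sum(p * (r + (0.0 if done else GAMMA * V[s2])) for p, s2, r, done in P[s][a])


def evaluate(policy, theta=1e-10):
    """Iterative policy evaluation for a deterministic policy {state: action}."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v = q_value(V, s, policy[s])
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < theta:
            return V


def improve(V):
    """Greedy policy with respect to V."""
    return {s: max(P[s], key=lambda a: q_value(V, s, a)) for s in P}


def policy_iteration():
    policy = {s: 0 for s in P}  # start from "always left"
    while True:
        V = evaluate(policy)
        new_policy = improve(V)
        if new_policy == policy:  # policy is stable, so stop
            return policy, V
        policy = new_policy


policy, V = policy_iteration()
print(policy, V)
```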
Policy improvement:
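As an illustration of the improvement step on its own, the sketch below makes a policy greedy with respect to a given value function. The chain MDP and the values in `V_old` are assumptions: `V_old` is what evaluating the hypothetical policy "left in state 0, right in state 1" would give on this toy problem with a discount of 0.9.

```python
# Hypothetical chain MDP (an assumption for illustration): states 0, 1, 2; state 2 is terminal.
# P[s][a] = [(probability, next_state, reward, done), ...]; action 0 = left, 1 = right.
GAMMA = 0.9
P = {
    0: {0: [(1.0, 0, -1.0, False)], 1: [(1.0, 1, -1.0, False)]},
    1: {0: [(1.0, 0, -1.0, False)], 1: [(1.0, 2, -1.0, True)]},
    2: {0: [(1.0, 2, 0.0, True)], 1: [(1.0, 2, 0.0, True)]},
}

old_policy = {0: 0, 1: 1, 2: 0}
V_old = {0: -10.0, 1: -1.0, 2: 0.0}  # assumed result of evaluating old_policy


def q_value(V, s, a):
    return sum(p * (r + (0.0 if done else GAMMA * V[s2])) for p, s2, r, done in P[s][a])


def improve(V):
    """Pick, in every state, the action with the highest one-step lookahead value."""
    return {s: max(P[s], key=lambda a: q_value(V, s, a)) for s in P}


new_policy = improve(V_old)
print(new_policy)  # state 0 switches from left (0) to right (1)
```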
Reinforcement Learning: Dynamic Programming
Bellman expectation equation:
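For reference, the equation can be written in a common textbook notation. The symbols here are assumptions, not taken from the original image: $\pi(a \mid s)$ is the policy, $p(s', r \mid s, a)$ the environment dynamics, and $\gamma$ the discount factor.

$$
v_\pi(s) = \sum_{a} \pi(a \mid s) \sum_{s', r} p(s', r \mid s, a) \left[ r + \gamma \, v_\pi(s') \right]
$$

Read left to right: the value of a state under $\pi$ is the expected immediate reward plus the discounted value of the successor state, averaged over the policy's action choices and the environment's transitions.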
State Value Functions:
Gridworld Example:
Stochastic Policy:
Deterministic Policy:
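A small sketch contrasting the two kinds of policy: a deterministic policy maps each state to one action, while a stochastic policy maps each state to a probability distribution over actions. The state names, action names, and probabilities below are hypothetical.

```python
import random

# Deterministic policy: state -> action (names are illustrative assumptions).
deterministic_policy = {"s0": "right", "s1": "right"}

# Stochastic policy: state -> {action: probability}; each row sums to 1.
stochastic_policy = {
    "s0": {"left": 0.5, "right": 0.5},
    "s1": {"left": 0.0, "right": 1.0},
}


def act_deterministic(state):
    return deterministic_policy[state]


def act_stochastic(state, rng):
    """Sample an action according to pi(a | state)."""
    actions, probs = zip(*stochastic_policy[state].items())
    return rng.choices(actions, weights=probs, k=1)[0]


rng = random.Random(0)
print(act_deterministic("s0"))
print(act_stochastic("s0", rng))
```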
Markov Decision Process:
Action and States:
Goals and Rewards for humanoids:
What are the states:
Reward Hypothesis:
Continuing Task:
Episodic Task:
Goal of the agent: Maximize cumulative expected Reward:
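One common way to make "cumulative reward" concrete is the discounted return $G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \dots$. The sketch below computes it for a finite list of rewards; the function name and the discount values are assumptions for illustration.

```python
def discounted_return(rewards, gamma=0.9):
    """Return sum_k gamma**k * rewards[k], accumulated from the last reward backwards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1.75
```

With `gamma=1.0` this reduces to the plain sum of rewards, which is well defined for episodic tasks. For continuing tasks a discount below 1 keeps the sum finite when rewards are bounded.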
Reward, State, Action: