gutfeeling / practical_rl_for_coders

Learn reinforcement learning with Python, Gym and Keras.

Course Plan #21

Open gutfeeling opened 4 years ago

gutfeeling commented 4 years ago

Chapter 1

Lesson: OpenAI Gym Installation

Lesson: Jupyter Installation

Lesson: Setting up an RL problem

Exercise: Set up the MountainCar-v0 problem

Lesson: The Agent and its Environment

Exercise: In the MountainCar-v0 environment, print out the observation space. How many elements are there in the observation? Look up the GitHub Wiki and find out what the elements mean.

Lesson: Actions

Exercise: In the MountainCar-v0 environment, print out the action space. How many actions are possible in the environment? For each action, write a loop where that action is taken repeatedly for 30 steps, visualize what happens and try to guess what the action means. Confirm your guess by looking up the environment details in the GitHub Wiki.
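One possible shape for the "repeat one action for 30 steps" loop is sketched below. The helper name `repeat_action` is hypothetical and not part of Gym. It assumes an environment object that follows the Gym `reset()` / `step()` interface, and it accepts both the classic 4-tuple return from `step()` and the 5-tuple return used by newer Gym releases.

```python
def repeat_action(env, action, n_steps=30, render=False):
    """Reset env, take the same action up to n_steps times,
    and return the list of observations that were visited.

    Hypothetical helper: `env` is assumed to follow the Gym API.
    """
    reset_result = env.reset()
    # Newer Gym returns (observation, info) from reset(); classic Gym
    # returns only the observation.
    if (isinstance(reset_result, tuple) and len(reset_result) == 2
            and isinstance(reset_result[1], dict)):
        obs = reset_result[0]
    else:
        obs = reset_result

    trajectory = [obs]
    for _ in range(n_steps):
        if render:
            env.render()  # draw the current state so the effect is visible
        step_result = env.step(action)
        obs = step_result[0]
        if len(step_result) == 5:
            # Newer API: (obs, reward, terminated, truncated, info)
            done = step_result[2] or step_result[3]
        else:
            # Classic API: (obs, reward, done, info)
            done = step_result[2]
        trajectory.append(obs)
        if done:
            break  # stop early if the episode ends before n_steps
    return trajectory

# Usage, assuming gym is installed:
#   import gym
#   env = gym.make("MountainCar-v0")
#   for action in range(env.action_space.n):
#       print(action, repeat_action(env, action)[-1])
```

Comparing the last observation for each action gives a first hint about what the action does, before checking the Wiki.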

Lesson: Rewards and Episodes

Exercise: Calculate the average total reward per episode in CartPole-v0 over 100 episodes when the agent takes a random action at every step.
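A minimal sketch of this calculation follows. The function name `average_random_return` is hypothetical. It assumes the environment exposes `action_space.sample()`, `reset()` and `step()` as Gym environments do, and it handles both the classic 4-tuple and the newer 5-tuple return from `step()`.

```python
def average_random_return(env, n_episodes=100):
    """Average total reward per episode under a uniformly random policy.

    Hypothetical helper: `env` is assumed to follow the Gym API.
    """
    total_reward = 0.0
    for _ in range(n_episodes):
        env.reset()
        done = False
        while not done:
            action = env.action_space.sample()  # pick a random action
            step_result = env.step(action)
            reward = step_result[1]
            if len(step_result) == 5:
                # Newer API: (obs, reward, terminated, truncated, info)
                done = step_result[2] or step_result[3]
            else:
                # Classic API: (obs, reward, done, info)
                done = step_result[2]
            total_reward += reward
    return total_reward / n_episodes

# Usage, assuming gym is installed:
#   import gym
#   env = gym.make("CartPole-v0")
#   print(average_random_return(env, n_episodes=100))
```

Because the actions are random, the result varies from run to run; more episodes give a steadier estimate.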

Lesson: Other reward functions

Lesson: Episodes

gutfeeling commented 4 years ago

Chapter 2

Lesson: Markov Decision Processes

Lesson: Policy

Exercise: Compute the average total reward per episode for the epsilon-opposite policy with epsilon = 0.9. How does it rank against the random policy and the opposite policy in terms of average reward?

Lesson: Value and Q-value functions

Exercise: Calculate the value function and the action-value function of states while following an epsilon-opposite policy.

Lesson: Optimal Policy

Lesson: How human learning gives us intuition for reaching the optimal policy

Lesson: GLIE Monte Carlo

gutfeeling commented 4 years ago

Next chapters

Chapter 3: GLIE Monte Carlo implementation in Python

Chapter 4: SARSA

Chapter 5: Function approximation: Fourier transform

Chapter 6: Neural Network Crash Course

Chapter 7: Function approximation: Neural Network

Chapter 8: Vanilla Policy Gradient

Chapter 9: PPO

Chapter 10: RL on Google Cloud

Chapter 11: DQN