nickumia / cap6629

A summary of Reinforcement Learning techniques explored in Dr. Lee's class
GNU General Public License v3.0

HW 2 - Q learning vs Sarsa #7

Closed nickumia closed 1 year ago

nickumia commented 1 year ago

Notes:

  1. Use the same Gridworld environment. Show a picture of your own Gridworld, including the start/final states and blocks.
    • [x] Complete the program code (refer to hw2-code.py file).
    • [x] Complete GridEnv class
      • [x] Insert all information of your Grid environment in class GridEnv
      • [x] constructor (init), returns start state of your grid.
      • [x] implement step function in class GridEnv. ‘step’ function reads action and returns next-state, reward, done, info
      • [x] implement reset function in class GridEnv. ‘reset’ function moves agent back to start state. ‘reset’ function returns start state.
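Since hw2-code.py itself isn't shown here, a minimal sketch of the GridEnv class described above might look like the following. The grid size, start/goal positions, block cells, and reward values are all illustrative assumptions, not the assignment's actual layout.

```python
class GridEnv:
    """Toy Gridworld: 4x4 grid, start at (0, 0), goal at (3, 3),
    one blocked cell at (1, 1) -- an assumed layout for illustration."""

    ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

    def __init__(self, rows=4, cols=4, start=(0, 0), goal=(3, 3),
                 blocks=((1, 1),)):
        self.rows, self.cols = rows, cols
        self.start, self.goal = start, goal
        self.blocks = set(blocks)
        self.state = start  # constructor leaves the agent at the start state

    def reset(self):
        """Move the agent back to the start state and return it."""
        self.state = self.start
        return self.state

    def step(self, action):
        """Read an action index and return (next_state, reward, done, info)."""
        dr, dc = self.ACTIONS[action]
        r, c = self.state
        nr, nc = r + dr, c + dc
        # Stay in place if the move leaves the grid or hits a block.
        if not (0 <= nr < self.rows and 0 <= nc < self.cols) \
                or (nr, nc) in self.blocks:
            nr, nc = r, c
        self.state = (nr, nc)
        done = self.state == self.goal
        reward = 1.0 if done else -0.04  # small step cost, goal reward (assumed)
        return self.state, reward, done, {}
```

This mirrors the gym-style `step`/`reset` interface the checklist asks for, so the same class can be reused unchanged by both the Q-learning and Sarsa loops.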
  2. Implement Q learning - You have to repeat the loop part of pseudo code in slide p. 33 using many episodes.
    • [x] for each episode
      • [x] For each step in episode
        • [x] Update Q value using q learning formula in slide p. 33
      • [x] show the results of Q values
      • [x] show the results of policy
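The episode/step loop in item 2 can be sketched as below. The slide p. 33 pseudocode isn't reproduced in this issue, so this follows the standard tabular Q-learning update with epsilon-greedy exploration; the hyperparameters and the `env` interface (`reset()`, `step(a)` returning `(next_state, reward, done, info)`) are assumptions.

```python
import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.9, eps=0.1, n_actions=4):
    """Tabular Q-learning (off-policy TD control): for each episode,
    for each step, update Q toward r + gamma * max_a' Q(s', a')."""
    Q = defaultdict(float)  # Q[(state, action)], defaults to 0.0
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # epsilon-greedy action selection
            if random.random() < eps:
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda x: Q[(s, x)])
            s2, r, done, _ = env.step(a)
            # Q-learning update: bootstrap on the greedy next action
            best_next = 0.0 if done else max(Q[(s2, x)] for x in range(n_actions))
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q
```

The results asked for in the checklist fall out of the table: print `Q` directly for the Q values, and take the arg-max action per state for the policy.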
  3. Implement Sarsa - You have to repeat the loop part of pseudo code in slide p. 22 using many episodes.
    • [x] for each episode
      • [x] For each step in episode
        • [x] Update Q value using Sarsa formula in slide p. 22
      • [x] show the results of Q values
      • [x] show the results of policy
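For contrast with item 2, the Sarsa loop can be sketched as below. Again, the slide p. 22 pseudocode isn't shown in this issue, so this is the standard on-policy tabular Sarsa with the same assumed `env` interface and placeholder hyperparameters. The key difference from Q-learning is that the update bootstraps on the action actually chosen next (`a2`), not the greedy one.

```python
import random
from collections import defaultdict

def eps_greedy(Q, s, n_actions, eps):
    """Pick a random action with probability eps, else the greedy one."""
    if random.random() < eps:
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: Q[(s, a)])

def sarsa(env, episodes=500, alpha=0.1, gamma=0.9, eps=0.1, n_actions=4):
    """Tabular Sarsa (on-policy TD control): update Q toward
    r + gamma * Q(s', a') where a' is the action actually taken next."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s = env.reset()
        a = eps_greedy(Q, s, n_actions, eps)
        done = False
        while not done:
            s2, r, done, _ = env.step(a)
            a2 = eps_greedy(Q, s2, n_actions, eps)
            # Sarsa target: zero future value at terminal states
            target = r + (0.0 if done else gamma * Q[(s2, a2)])
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s, a = s2, a2
    return Q
```

Running both learners on the same GridEnv and printing the two Q tables and greedy policies side by side is a direct way to produce the comparison the homework title asks for.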
nickumia commented 1 year ago

The code for this works (and, presumably, is complete). However, the logic should be documented, because I can't say that I fully understand how I got it to work, haha.

nickumia commented 1 year ago

Initial report is uploaded, but further documentation will be created in the future. 😮‍💨