nickumia / cap6629

A summary of Reinforcement Learning techniques explored in Dr. Lee's class
GNU General Public License v3.0

HW 2 - Q learning vs Sarsa #7

Closed nickumia closed 1 year ago

nickumia commented 1 year ago

Notes:

  1. Use the same Gridworld environment. Show a picture of your own Gridworld, including the start/final states and blocks.
    • [x] Complete the program code (refer to hw2-code.py file).
    • [x] Complete GridEnv class
      • [x] Insert all information of your Grid environment in class GridEnv
      • [x] constructor (init), returns start state of your grid.
      • [x] implement step function in class GridEnv. ‘step’ function reads action and returns next-state, reward, done, info
      • [x] implement reset function in class GridEnv. ‘reset’ function moves agent back to start state. ‘reset’ function returns start state.
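Since hw2-code.py itself isn't shown here, a minimal sketch of the GridEnv class described above might look like the following. The grid size, start/goal positions, block cells, and reward values are all illustrative assumptions, not the assignment's actual layout.

```python
class GridEnv:
    """Toy Gridworld: 4x4 grid, start at (0, 0), goal at (3, 3),
    one blocked cell at (1, 1) -- an assumed layout for illustration."""

    ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

    def __init__(self, rows=4, cols=4, start=(0, 0), goal=(3, 3),
                 blocks=((1, 1),)):
        self.rows, self.cols = rows, cols
        self.start, self.goal = start, goal
        self.blocks = set(blocks)
        self.state = start  # constructor leaves the agent at the start state

    def reset(self):
        """Move the agent back to the start state and return it."""
        self.state = self.start
        return self.state

    def step(self, action):
        """Read an action index and return (next_state, reward, done, info)."""
        dr, dc = self.ACTIONS[action]
        r, c = self.state
        nr, nc = r + dr, c + dc
        # Stay in place if the move leaves the grid or hits a block.
        if not (0 <= nr < self.rows and 0 <= nc < self.cols) \
                or (nr, nc) in self.blocks:
            nr, nc = r, c
        self.state = (nr, nc)
        done = self.state == self.goal
        reward = 1.0 if done else -0.04  # small step cost, goal reward (assumed)
        return self.state, reward, done, {}
```

This mirrors the gym-style `step`/`reset` interface the checklist asks for, so the same class can be reused unchanged by both the Q-learning and Sarsa loops.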
  2. Implement Q learning - You have to repeat the loop part of pseudo code in slide p. 33 using many episodes.
    • [x] for each episode
      • [x] For each step in episode
        • [x] Update Q value using q learning formula in slide p. 33
      • [x] show the results of Q values
      • [x] show the results of policy
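The episode/step loop in item 2 can be sketched as below. The slide p. 33 pseudocode isn't reproduced in this issue, so this follows the standard tabular Q-learning update with epsilon-greedy exploration; the hyperparameters and the `env` interface (`reset()`, `step(a)` returning `(next_state, reward, done, info)`) are assumptions.

```python
import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.9, eps=0.1, n_actions=4):
    """Tabular Q-learning (off-policy TD control): for each episode,
    for each step, update Q toward r + gamma * max_a' Q(s', a')."""
    Q = defaultdict(float)  # Q[(state, action)], defaults to 0.0
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # epsilon-greedy action selection
            if random.random() < eps:
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda x: Q[(s, x)])
            s2, r, done, _ = env.step(a)
            # Q-learning update: bootstrap on the greedy next action
            best_next = 0.0 if done else max(Q[(s2, x)] for x in range(n_actions))
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q
```

The results asked for in the checklist fall out of the table: print `Q` directly for the Q values, and take the arg-max action per state for the policy.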
  3. Implement Sarsa - You have to repeat the loop part of pseudo code in slide p. 22 using many episodes.
    • [x] for each episode
      • [x] For each step in episode
        • [x] Update Q value using Sarsa formula in slide p. 22
      • [x] show the results of Q values
      • [x] show the results of policy
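For contrast with item 2, the Sarsa loop can be sketched as below. Again, the slide p. 22 pseudocode isn't shown in this issue, so this is the standard on-policy tabular Sarsa with the same assumed `env` interface and placeholder hyperparameters. The key difference from Q-learning is that the update bootstraps on the action actually chosen next (`a2`), not the greedy one.

```python
import random
from collections import defaultdict

def eps_greedy(Q, s, n_actions, eps):
    """Pick a random action with probability eps, else the greedy one."""
    if random.random() < eps:
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: Q[(s, a)])

def sarsa(env, episodes=500, alpha=0.1, gamma=0.9, eps=0.1, n_actions=4):
    """Tabular Sarsa (on-policy TD control): update Q toward
    r + gamma * Q(s', a') where a' is the action actually taken next."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s = env.reset()
        a = eps_greedy(Q, s, n_actions, eps)
        done = False
        while not done:
            s2, r, done, _ = env.step(a)
            a2 = eps_greedy(Q, s2, n_actions, eps)
            # Sarsa target: zero future value at terminal states
            target = r + (0.0 if done else gamma * Q[(s2, a2)])
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s, a = s2, a2
    return Q
```

Running both learners on the same GridEnv and printing the two Q tables and greedy policies side by side is a direct way to produce the comparison the homework title asks for.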
nickumia commented 1 year ago

The code for this works (and, presumably, is complete). However, the logic should be documented, because I can't say that I fully understand how I got it to work, haha.

nickumia commented 1 year ago

Initial report is uploaded, but further documentation will be created in the future. 😮‍💨