nickumia / cap6629

A summary of Reinforcement Learning techniques explored in Dr. Lee's class
GNU General Public License v3.0

HW3 - DQN & REINFORCE #11

Closed nickumia closed 10 months ago

nickumia commented 1 year ago

1-1. Sample Code Using TensorFlow

  1. If you want to use TensorFlow:
    • [ ] Make sure TensorFlow is installed on your computer.
    • [ ] Choose one of the sample codes.
    • [x] Install required packages in the sample code.
    • [x] [10pts] Run the program and check whether it works correctly.

1-2. Sample Code Using PyTorch

  1. If you are familiar with PyTorch, you can use PyTorch program code.
    • [ ] You will use the PyTorch library. To get started, follow the instructions to install PyTorch, then go through a tutorial on the basics of PyTorch. Refer to https://pytorch.org/
    • [ ] Choose one of the sample codes.
    • [ ] Install required packages in the sample code.
    • [ ] [10pts] Run the program and check whether it works correctly.
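
Before running either sample, it can help to confirm which framework is importable in the current environment. The snippet below is a minimal sketch using only the Python standard library; the helper name `installed` is hypothetical and not part of the course material.

```python
import importlib.util


def installed(packages):
    """Map each top-level package name to True if it can be found for import."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    # "tensorflow" and "torch" are the import names of TensorFlow and PyTorch.
    for name, ok in installed(["tensorflow", "torch"]).items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

This only checks that the package can be located; it does not verify the version or GPU support.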

2-1. Run the program with different parameters

  1. After you make sure your program works, do the following. Show the performance for each parameter value (e.g., mean accumulated episode reward, size of error, etc.).
    • [x] [5pts] run the program 5 times with different sizes of replay memory.
    • [x] [4pts] run the program using different optimizers (gradient update rules), including Adadelta, Adagrad, Adam (or AdamW), and RMSProp.
    • [x] [6pts] change the way target network parameters are updated: from the Polyak method to periodic updates, or vice versa.
    • [x] change the Q network configuration (make sure you change the target network as well)
      • [x] [6pts] add additional layers
      • [x] [6pts] change the activation function from ReLU to others
    • [ ] [6pts] change the error function: with and without an entropy term
    • [ ] [8pts] change the CNN to a neural network, or vice versa
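
For the replay memory item above, a fixed-capacity buffer can be sketched as follows. This is an illustrative sketch rather than the sample code's actual implementation: the class name `ReplayMemory`, its methods, and the capacities in the loop are assumptions.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer; the oldest transition is dropped once capacity is reached."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement from the stored transitions.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


if __name__ == "__main__":
    # Hypothetical capacities for the five runs; the assignment does not fix them.
    for capacity in (1_000, 5_000, 10_000, 50_000, 100_000):
        memory = ReplayMemory(capacity)
        print(capacity, len(memory))
```

Only `capacity` changes between runs, so differences in mean episode reward can be attributed to the memory size.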

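The two target-network update schemes named in 2-1 differ only in how the online parameters reach the target network. The sketch below shows both on plain lists of floats; a real implementation would apply the same rule to each network tensor. The function names and the default values for `tau` and `period` are assumptions, not values taken from the sample code.

```python
def polyak_update(target, online, tau=0.005):
    """Soft update: move each target parameter a fraction tau toward the online one."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target, online)]


def periodic_update(target, online, step, period=1000):
    """Hard update: copy the online parameters every `period` steps, else keep the target."""
    return list(online) if step % period == 0 else target


if __name__ == "__main__":
    target, online = [0.0, 0.0], [1.0, -1.0]
    print(polyak_update(target, online, tau=0.1))
    print(periodic_update(target, online, step=1000))
```

Polyak averaging changes the target a little at every step, while the periodic scheme leaves it frozen between copies.
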
3-1. In your homework report:

4-1. (extra points, max: 8pts) Policy gradient method (REINFORCE algorithm)

  1. This problem is about implementing the REINFORCE policy gradient algorithm.
    • [ ] Use one of the following sample codes:
      • [ ] listing7_1_reinforce_pytorch.ipynb OR
      • [ ] listing7_1_reinforce_tensorflow.ipynb
    • [ ] [2pts] run the program. Show it works.
    • [ ] [2pts/each] perform 4) and 5) of Q. 2
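
As a reference point for the REINFORCE task, the core computation is the discounted return of each step and a loss that weights each action's log-probability by that return. The following is a framework-free sketch, not the contents of the listed notebooks; the function names and the default `gamma` are assumptions.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def reinforce_loss(log_probs, returns):
    """REINFORCE objective as a loss: the negative sum of log pi(a_t | s_t) * G_t."""
    return -sum(lp * g for lp, g in zip(log_probs, returns))


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0]
    print(discounted_returns(rewards, gamma=0.5))
```

In PyTorch or TensorFlow the log-probabilities would be tensors produced by the policy network, so minimizing this loss follows the policy gradient.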

nickumia commented 12 months ago

Homework started here.

nickumia commented 11 months ago

More time is required for this 😞

nickumia commented 10 months ago

This class failed hard 😵‍💫 I still got an A, but I didn't get as much out of it as I wanted to. And with my current level of motivation, I won't be making much progress here.