higgsfield / RL-Adventure

PyTorch implementation of DQN / DDQN / prioritized replay / noisy networks / distributional values / Rainbow / hierarchical RL

2.99k stars · 587 forks

Updated 1.dqn for compatibility with PyTorch 0.4 and 1.0 #24

Open joleeson opened 5 years ago

joleeson commented 5 years ago
  1. Updated for compatibility with the latest PyTorch versions (a more thorough update than the recommendations in #20):

    • no longer uses the deprecated "Variable" class
    • uses appropriate dtypes
    • CPU/GPU-agnostic code
    • uses tensor.item() to convert 0-dimensional tensors to ordinary Python numbers
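As an illustration of these points (a minimal sketch, not the PR's exact diff), the pre-0.4 pattern and its modern equivalent look roughly like:

```python
import torch

# CPU/GPU-agnostic setup: the same code runs unchanged on either device.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Old (pre-0.4) style wrapped tensors in the now-deprecated Variable class:
#   state = Variable(torch.FloatTensor(state), volatile=True)
# Modern style builds tensors with an explicit dtype directly on the device:
state = torch.tensor([0.1, 0.2, 0.3, 0.4], dtype=torch.float32, device=device)

with torch.no_grad():       # replaces the old volatile=True flag
    q_values = state * 2.0  # stand-in for a network forward pass

# .item() converts a 0-dimensional tensor to an ordinary Python number.
loss = q_values.sum()
print(loss.item())
```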
  2. Made changes such that the algorithm more closely matches that in Mnih et al. (2015) and other DQN literature:

    • linear epsilon decay
    • frame stacking
    • the network is now trained once every 4 environment steps for Atari environments
    • option of using Huber loss instead of MSE loss in compute_td_loss()
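The linear epsilon schedule and the Huber loss can be sketched as follows (a hedged illustration; the constants are the Mnih et al. (2015) values and the function names are hypothetical, not necessarily those in the PR):

```python
def epsilon_by_frame(frame_idx, eps_start=1.0, eps_final=0.1,
                     decay_frames=1_000_000):
    """Linearly anneal epsilon from eps_start to eps_final over
    decay_frames environment steps, then hold it constant
    (the schedule used in Mnih et al., 2015)."""
    fraction = min(frame_idx / decay_frames, 1.0)
    return eps_final + (eps_start - eps_final) * (1.0 - fraction)

def huber(td_error, k=1.0):
    """Huber loss on a scalar TD error: quadratic near zero, linear in
    the tails, so outlier transitions do not produce the huge gradients
    a squared (MSE) loss would."""
    a = abs(td_error)
    return 0.5 * a * a if a <= k else k * (a - 0.5 * k)
```

In PyTorch the built-in equivalent of `huber` is `F.smooth_l1_loss`, which is what a `compute_td_loss()` implementation would typically call.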
  3. Borrowed the monitoring wrapper from OpenAI Baselines to log training progress.

  4. Modified the wrappers so that they now accommodate stacked frames (#9) and output them as a LazyFrames object. The data axes are swapped appropriately for PyTorch, i.e. (channels) x (height) x (width).
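A minimal sketch of the LazyFrames idea for four stacked Atari frames (class shape modeled on Baselines; details here are illustrative, not the PR's exact wrapper code):

```python
import numpy as np

class LazyFrames:
    """Hold a list of frames without copying them until an array is needed.

    Baselines uses this trick so that consecutive observations in the
    replay buffer can share memory for their overlapping frames.
    """
    def __init__(self, frames):
        self._frames = frames

    def __array__(self, dtype=None):
        # Stack only when someone actually asks for an ndarray; stacking
        # on axis 0 yields the channels-first layout PyTorch expects.
        out = np.stack(self._frames, axis=0)
        if dtype is not None:
            out = out.astype(dtype)
        return out

# Four stacked 84x84 grayscale Atari frames.
frames = [np.zeros((84, 84), dtype=np.uint8) for _ in range(4)]
obs = np.asarray(LazyFrames(frames))
print(obs.shape)  # (channels, height, width) = (4, 84, 84)
```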