Borrowed the monitoring wrapper from OpenAI's Baselines to log training progress.
Modified the wrappers so that they now accommodate stacked frames (#9) and output them as a LazyFrames object. The axes of the data are swapped appropriately for PyTorch, i.e. (no. of channels)x(breadth)x(height).
Updated for compatibility with the latest PyTorch versions (more thorough than the recommendations in #20).
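The frame stacking and axis swap mentioned above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual wrapper code: `LazyFramesSketch` and `stack_last_frames` are hypothetical names, the frames are assumed to arrive channel-last as 84x84 single-channel arrays, and the sketch only moves the channel axis to the front without saying anything about how the real wrapper orders the two spatial axes.

```python
from collections import deque

import numpy as np


class LazyFramesSketch:
    """Hypothetical stand-in for a LazyFrames-style container.

    It keeps references to the individual frames and only builds the
    stacked array when asked, so consecutive observations can share frames.
    """

    def __init__(self, frames):
        self._frames = list(frames)  # stores references, not copies

    def to_array(self):
        # Assumed input layout per frame: (spatial, spatial, channels).
        stacked = np.concatenate(self._frames, axis=-1)
        # Move the channel axis to the front, as PyTorch conv layers expect.
        return np.transpose(stacked, (2, 0, 1))


def stack_last_frames(frames, k=4):
    """Keep only the k most recent frames and wrap them lazily."""
    window = deque(maxlen=k)
    for frame in frames:
        window.append(frame)
    return LazyFramesSketch(window)


# Six dummy frames, each filled with its own index so they can be told apart.
frames = [np.full((84, 84, 1), i, dtype=np.uint8) for i in range(6)]
obs = stack_last_frames(frames, k=4).to_array()
print(obs.shape)  # (4, 84, 84)
```

An array in this layout could then be handed to PyTorch with something like `torch.from_numpy(obs)`, assuming the network expects channel-first input.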
Changed the algorithm to match Mnih et al. (2015) and other DQN literature more closely: