rlbeaverton closed this issue 4 years ago
Reported results are in low-level DMC environment steps (1k per episode)
Quick follow-up after reading "Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels" by Kostrikov et al., in which they state:
In contrast to prior work, CURL [42] plots returns as a function of modified environment steps, i.e. true environment steps divided by the action-repeat hyper-parameter.
Is their assertion then wrong? Thanks!
We count environment steps (100k env steps = 25k agent steps with an action repeat of 4). Please refer to Section 5.1 of https://arxiv.org/pdf/2004.04136.pdf
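For anyone comparing numbers across papers, the conversion in this reply can be sketched as below. This is only illustrative arithmetic; the function names are hypothetical and not taken from the CURL codebase.

```python
def env_to_agent_steps(env_steps: int, action_repeat: int) -> int:
    """Number of agent decisions that consume `env_steps` low-level steps."""
    return env_steps // action_repeat


def agent_to_env_steps(agent_steps: int, action_repeat: int) -> int:
    """Number of low-level environment steps behind `agent_steps` decisions."""
    return agent_steps * action_repeat


print(env_to_agent_steps(100_000, 4))  # 25000
print(agent_to_env_steps(25_000, 4))   # 100000
```

When reading a learning curve, it helps to check which of the two quantities is on the x-axis before comparing it with another paper's plot.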
Great work and thanks a lot for releasing the code! It’s awesome to see this simple contrastive loss term performing so well without the need for reconstruction.
Quick question regarding the environment step count: if we consider a DMC episode of standard length 1000 steps and we use a frameskip of 4, do the reported results consider the episode to have 1000 steps or 250 steps? Put differently, do the 100k step results mean 100k “low-level DMC” steps or 100k “agent-applying-an-action” steps?
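To make the two readings in the question concrete, here is a rough sketch of an action-repeat loop that tracks both counters. It assumes nothing about dm_control: the episode is modeled as a plain step counter, and `run_episode` is a hypothetical name.

```python
def run_episode(episode_env_steps: int, action_repeat: int) -> tuple[int, int]:
    """Return (agent_steps, env_steps) for one episode under action repeat."""
    agent_steps = 0
    env_steps = 0
    remaining = episode_env_steps
    while remaining > 0:
        # The agent picks one action, which is applied for up to
        # `action_repeat` low-level steps (fewer if the episode ends first).
        repeats = min(action_repeat, remaining)
        env_steps += repeats
        remaining -= repeats
        agent_steps += 1
    return agent_steps, env_steps


agent_steps, env_steps = run_episode(episode_env_steps=1000, action_repeat=4)
print(agent_steps, env_steps)  # 250 1000
```

Under this accounting, the same episode is 250 steps from the agent's point of view and 1000 steps from the simulator's point of view.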