Why is start_of_episode = False when env is reset?

google-deepmind / dqn_zoo

DQN Zoo is a collection of reference implementations of reinforcement learning agents developed at DeepMind based on the Deep Q-Network (DQN) agent.

Apache License 2.0

Why is start_of_episode = False when env is reset? #26

Closed: raymondchua closed this issue 10 months ago

raymondchua commented 12 months ago

Hi, can someone explain why the start-of-episode flag is set to False when the GymAtari environment is reset? Shouldn't it be True?

https://github.com/google-deepmind/dqn_zoo/blob/8728c674420725f49f6dec9b893f9420fd5cd2ef/dqn_zoo/gym_atari.py#L77C1-L77C1

jqdm commented 12 months ago

The line

self._start_of_episode = False

appears at the end of reset() to tell the next call to step() that it is no longer the start of the episode: the first timestep has already been returned. The comments here explain in more detail why this internal variable exists.
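To make the flag's role concrete, here is a minimal sketch of the same pattern in a toy environment. It is not the dqn_zoo code: `ToyEnv`, `TimeStep`, `StepType`, and `episode_length` are hypothetical names made up for illustration, and the only thing carried over from the discussion above is the idea that the flag means "the next step() call must begin a new episode".

```python
import enum
from typing import NamedTuple, Optional


class StepType(enum.Enum):
    FIRST = 0
    MID = 1
    LAST = 2


class TimeStep(NamedTuple):
    step_type: StepType
    reward: Optional[float]
    observation: int


class ToyEnv:
    """Hypothetical counter environment; an episode lasts `episode_length` transitions."""

    def __init__(self, episode_length: int = 2):
        self._episode_length = episode_length
        self._t = 0
        # True means "the next step() call must begin a new episode".
        self._start_of_episode = True

    def reset(self) -> TimeStep:
        self._t = 0
        # The FIRST timestep is returned right here, so the next step()
        # call should be an ordinary transition, not another restart.
        self._start_of_episode = False
        return TimeStep(StepType.FIRST, None, self._t)

    def step(self, action: int) -> TimeStep:
        if self._start_of_episode:
            # The previous timestep was LAST (or reset() was never called):
            # ignore the action and begin a new episode.
            return self.reset()
        self._t += 1
        done = self._t >= self._episode_length
        # After a LAST timestep, the next step() starts a new episode.
        self._start_of_episode = done
        return TimeStep(StepType.LAST if done else StepType.MID, 1.0, self._t)


env = ToyEnv(episode_length=2)
trace = [env.reset().step_type.name]
for _ in range(4):
    trace.append(env.step(action=0).step_type.name)
print(trace)  # ['FIRST', 'MID', 'LAST', 'FIRST', 'MID']
```

If reset() left the flag as True in this sketch, the step() call right after it would restart the episode again and return a second FIRST timestep instead of advancing the environment.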