Closed cx441000319 closed 1 year ago
Hello, this is expected: the preprocessing for Atari games sets the action repeat (aka frameskip) to 4 by default, so each training timestep covers 4 game frames.
Related: https://github.com/DLR-RM/stable-baselines3/issues/181
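To make the arithmetic behind this concrete, here is a minimal sketch. The function name `emulator_frames` is made up for illustration and is not part of any library; it only assumes the default frameskip of 4 mentioned above.

```python
def emulator_frames(agent_steps: int, frame_skip: int = 4) -> int:
    """Game frames consumed by `agent_steps` training timesteps.

    Hypothetical helper: with action repeat, every agent step
    replays the chosen action for `frame_skip` consecutive frames.
    """
    return agent_steps * frame_skip


n_timesteps = 10_000_000  # 1e7, the n_timesteps value from the question
total_frames = emulator_frames(n_timesteps)
print(f"{n_timesteps:.0e} agent steps -> {total_frames:.0e} game frames")
```

With `n_timesteps` set to 1e7, this gives 4e7 frames, which matches the figure reported in the question.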
Oh, that totally makes sense. I tried my best to check whether I had overlooked any details, but I hadn't noticed this one. It's all clear now. Thank you so much for your quick reply!
❓ Question
Hi,
As the title says, it seems the issue only occurs in Atari. Here are some commands and images for reference:
Experiment Command: python train.py --algo ppo --env PongNoFrameskip-v4
Training Plotting Command: python scripts/plot_train.py -a ppo -e PongNoFrameskip-v4 -f logs
Evaluation Plotting Command: python scripts/all_plots.py -a ppo -e PongNoFrameskip-v4 -f logs --no-million -max 10000000
We can tell that the number of training timesteps is about 4e7 instead of 1e7 (n_timesteps in the hyperparameters). Based on my experiment results, the issue does not occur in environments other than Atari. To reproduce it, simply replace the hyperparameter n_timesteps with a small number such as 1e4; according to the episode lengths in the logs, there will be many more than 1e4 samples.
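One way to do that check is to sum the logged episode lengths. The sketch below assumes the logs follow the Stable-Baselines3 Monitor CSV layout (a `#`-prefixed JSON header line, then `r,l,t` columns, with `l` the episode length); the helper name `total_logged_steps` and the sample rows are invented for illustration and are not real results.

```python
import csv


def total_logged_steps(monitor_csv_text: str) -> int:
    """Sum the episode-length column `l` of a Monitor-style CSV.

    Hypothetical helper; assumes a '#'-prefixed metadata line
    followed by a header row containing an `l` column.
    """
    # Drop empty lines and the '#'-prefixed metadata line.
    rows = [ln for ln in monitor_csv_text.splitlines()
            if ln and not ln.startswith("#")]
    reader = csv.DictReader(rows)
    return sum(int(row["l"]) for row in reader)


# Made-up sample content, not taken from an actual run.
sample = (
    '#{"t_start": 0.0, "env_id": "PongNoFrameskip-v4"}\n'
    "r,l,t\n"
    "-21.0,3056,10.5\n"
    "-20.0,3400,21.2\n"
)
logged = total_logged_steps(sample)
print(logged)
```

Comparing this sum with n_timesteps shows the size of the mismatch for a given run.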
Thank you so much in advance!
Checklist