Hi,
In your code (envs.py), I saw that you first wrap the environment with MaxAndSkipEnv() and then apply the sticky action. In the RND authors' code, however, the env is wrapped with StickyActionEnv() first and then with MaxAndSkipEnv(). With your ordering, the sticky decision is made once per agent step instead of once per emulator frame, so a repeated action persists for the whole skipped block and your agent ends up with more "sticky" actions. I think this makes the setup slightly different from the original.
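To make the difference concrete, here is a small standalone simulation of the two wrapper orderings. It is only a sketch and does not use either repository's code: the function names, the sticky probability p = 0.25, and the frame skip of 4 are my assumptions for illustration. It counts how often the action the agent requested is never executed during its own step.

```python
import random


def sticky_inside_skip(requested, p, skip, rng):
    """Sticky decision per emulator frame (sticky wrapper inside the frame skip)."""
    executed, last = [], 0
    for action in requested:
        for _ in range(skip):
            # With probability p the previous action is repeated for this frame.
            if rng.random() >= p:
                last = action
            executed.append(last)
    return executed


def sticky_outside_skip(requested, p, skip, rng):
    """Sticky decision per agent step (sticky applied around the frame skip)."""
    executed, last = [], 0
    for action in requested:
        # One draw decides the action for the whole block of skipped frames.
        if rng.random() >= p:
            last = action
        executed.extend([last] * skip)
    return executed


def ignored_rate(requested, executed, skip):
    """Fraction of agent steps whose requested action was never executed."""
    ignored = 0
    for i, action in enumerate(requested):
        block = executed[i * skip:(i + 1) * skip]
        if action not in block:
            ignored += 1
    return ignored / len(requested)


if __name__ == "__main__":
    p, skip, n = 0.25, 4, 20000  # assumed values, not taken from either repo
    # Unique action ids so that a repeated action is always distinguishable.
    requested = list(range(1, n + 1))
    inside = sticky_inside_skip(requested, p, skip, random.Random(0))
    outside = sticky_outside_skip(requested, p, skip, random.Random(0))
    print("ignored steps, sticky inside skip: ", ignored_rate(requested, inside, skip))
    print("ignored steps, sticky outside skip:", ignored_rate(requested, outside, skip))
```

In expectation, a requested action is dropped for an entire step with probability p when the sticky logic sits outside the frame skip, but only with probability p**skip when it sits inside, which is why the outer ordering behaves as noticeably "stickier".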