Do the V5 environments perform frame pooling?

danijar commented 2 years ago

The README says the V5 environments follow Machado et al. 2018, who use max-pooling over the last two frames. I couldn't find any code or config option of this in the repository, so I'm wondering do the V5 environments automatically perform frame pooling?

JesseFarebro commented 2 years ago

Hey @danijar, max-pooling isn't performed in the environment itself. We've gone back and forth whether these type of preprocessing steps should be in the environment or not. We ended up deciding that although doing preprocessing in the emulator would result in improved performance we don't want users to think that these preprocessing steps are the benchmark. Over time they've been confounded with the benchmark but they are primarily a result of design decisions made in 2015 specifically for DQN.

The v5 environments define the necessary environment specific parameters, e.g., environment stochasticity, episode truncation, etc. It could be said that max-pooling is necessary to have environment side (to solve flickering issues in some games) but all of the different forms of max-pooling have their own issues which I didn't want bake into the environment.

A couple of my thoughts specifically on max-pooling:

Historically, people have done max-pooling very differently. The official DeepMind implementations max-pool over RGB frames (channel-wise). From what I understand, the original DQN project had tried all variation of this (e.g., max-pool over raw frames, grayscale, and RGB). RGB performed best for them at the time so this is what stuck. Most open source implementations max-pool over grayscale frames.
I still strongly believe color averaging is better than max-pooling, there were some concerns about averaging and memory (max-pooling keeps things in 8-bit) but the way averaging is performed in the emulator actually keeps things in 8-bit regardless. If we were designing things from scratch today I'd push for this to be the preferred method.

Finally, if you are in need of a good preprocessing stack I'm a big fan of the abstractions made in DQN Zoo. Preprocessing is entirely implemented in agent space and is the most faithful to the original DQN preprocessing. It does use DeepMind Environment (which I prefer) but it would be fairly easy to adapt to Gym: https://github.com/deepmind/dqn_zoo/blob/807379c19f8819e407329ac1b95dcaccb9d536c3/dqn_zoo/processors.py#L405-L577.

I think going forward it might make sense to have an official preprocessing wrapper in ale-py for Gym and DeepMind environment so this whole preprocessing issue isn't so confusing.

Hope that helps!

danijar commented 2 years ago

Thanks for the detailed comment! I agree that averaging is nicer than taking the maximum, but at this point it's more important to be compatible with the vast existing literature than to have nicer preprocessing.

The problem I'm running into is that I think you can't efficiently perform pooling on the agent side because the maximum should be taken over two consecutive frames. With action repeat, the environment doesn't give me access to the actual last frame. Without action repeat, the environment renders at every step and thus 2x more than necessary. This makes the env step almost twice as slow and thus limits throughput. Do you have an idea?

Btw if it would be possible to add max pooling as a kwarg to the v5 envs, that would be amazing! It's basically guaranteed that a decent number of researchers will continue to use max pooling for comparability to the prior literature despite it not being implemented in the v5 envs, so providing an implementation as part of the env will consolidate more than if everybody writes their own.

danijar commented 2 years ago

Hi @JesseFarebro, do you have an idea for a workaround here? It's the only issue holding me back from switching to ale-py and the new V5 envs. To implement max pooling, I need to set action repeat to 1 and that causes a 2x slowdown due to unnecessary rendering of the frames 1 and 2 whereas max pooling under action repeat 4 only needs to look at the frames 3 and 4. Thanks!

(That said, I think having a standard implementation that everybody can use would still be valuable! It could also easily avoid the slowdown since it can be implemented inside the env.)

JesseFarebro commented 2 years ago

Hey @danijar, you make a good point about the excess renders. I'm guessing you want to stick to frame pooling and not switch over to averaging? If so, are you currently pooling over Grayscale or RGB?

I think pooling does make sense to have in the emulator and it's probably the lowest hanging fruit to improving emulation speed. I'm about to make a release of ale-py so I'll see if it can be done this release.

If you want to discuss further I'll be more responsive via email (see my profile).

danijar commented 2 years ago

Awesome, would be great if you can add it in the release! Both grayscale and RGB would be nice but grayscale is more important because that's the most common evaluation protocol.

danijar commented 1 year ago

Hi Jesse, do you have any update on whether frame pooling is implemented now? I would really like to upgrade to using ale_py and along that allow unpinning the old Gym version I'm stuck at because of this. Thanks!

JesseFarebro commented 1 year ago

Having looked into this a little more it should be feasible but will require reworking how we render frames. Right now every emulator step we perform a memory copy of the current frame buffer to the ALEScreen object. This isn't necessary as we can lazily retrieve the frame buffer when it's requested via ALEInterface::getScreen{Grayscale,RGB}. This should a) improve performance (going from frameskip memcpy operations per step to 0) and b) allow us to perform "lazy" frame maxing as the emulator stores both the current and previous frame.

Also, the only way this makes sense is if we make frame maxing and frame averaging mutually exclusive which should be fine.

danijar commented 1 year ago

That would be great! My implementation is currently 4x slower than the old one on top of atari_py and this is likely the reason. Adding this natively would allow me to switch over and use the ale_py version :)

JesseFarebro commented 1 year ago

Ended up getting everything working and it's now about ~25% faster than atari-py. Still have to write some additional tests but we should be able to release this in the next couple of days.

danijar commented 1 year ago

Fantastic, thanks!

hyzhak commented 4 months ago

it seems PR stuck in review :( any chance we could see it soon?

Farama-Foundation / Arcade-Learning-Environment

Do the V5 environments perform frame pooling? #467