Open pseudo-rnd-thoughts opened 11 months ago
A couple additional things:
ale_py.roms ROM imports so users get proper type hints.
multi-agent environments
Currently https://pypi.org/project/multi-agent-ale-py/ points to https://github.com/PettingZoo-Team/Multi-Agent-ALE (a fork of this repo; hasn't been updated for 9 months), which is then depended upon by pettingzoo.
I think it would be good to clarify the structure of these repo interactions, and ideally avoid depending on (half-?)dead forks. Given that this all seems to fall under the Farama umbrella, I hope that is both possible and in scope for such a roadmap. :)
The plan is to integrate the multi-agent-ale environments from that fork (which hasn't been updated in a while, as we are short on C++ developers and people familiar with it) into the main ALE library here. Additionally, the environments will be moved out of PettingZoo and Gymnasium and directly into this library, which I think will make things a lot cleaner.
I'm trying to tackle converting the envs to the Gymnasium spec, including registration. I see the gym env natively has a frameskip option, but I also see #495. Would moving the frameskip operation out of the env and into a Gymnasium wrapper / C-level interface be a saner approach? Same for repeat action probabilities.
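For anyone unfamiliar with what "repeat action probability" would look like if it lived outside the env, here is a minimal sketch in plain Python. It is not ALE's or Gymnasium's actual implementation: the class name `StickyActionSketch`, its parameters, and the choice of action 0 as the initial "previous" action are all assumptions made for illustration.

```python
import random


class StickyActionSketch:
    """Hypothetical wrapper-level repeat-action mechanism (not the ALE API)."""

    def __init__(self, repeat_action_probability, seed=0, noop_action=0):
        self.p = repeat_action_probability
        self.rng = random.Random(seed)
        # Assumption: before any action is taken, the "previous" action is a no-op.
        self.previous_action = noop_action

    def select(self, requested_action):
        """Return the action actually applied for one frame."""
        # With probability p, ignore the request and repeat the previous action.
        if self.rng.random() < self.p:
            return self.previous_action
        self.previous_action = requested_action
        return requested_action
```

With a probability of 0 the requested action always goes through, and with a probability of 1 the initial action is repeated forever, which is a quick way to sanity-check any wrapper built along these lines.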
A strong benefit of the native frameskip is that it can skip rendering the skipped frames, which gives a considerable rollout speedup. If frameskip is moved out of the env, it would need to go into an interface that can skip frames without rendering them in order to keep that speedup. With #495, this interface would also need to render the last skipped frame. IMO this would introduce unnecessary complexity: to support these features, the interface would have to hook into the environment at a level where the work could just as well be done inside the env itself. (This is a high-level view; I have some knowledge of how wrappers work in Gymnasium but no knowledge of how ALE is implemented.)
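To make the rendering argument concrete, here is a toy sketch that only counts operations. `ToyEmulator`, `native_step`, and `wrapper_step` are hypothetical names and do not reflect how ALE is implemented; the sketch assumes that a wrapper can only call a step that returns an observation, so every inner frame gets rendered.

```python
class ToyEmulator:
    """Hypothetical emulator that counts how often each operation runs."""

    def __init__(self):
        self.frames_emulated = 0
        self.frames_rendered = 0

    def emulate(self, action):
        self.frames_emulated += 1
        return 1.0  # dummy per-frame reward

    def render(self):
        self.frames_rendered += 1
        return f"frame-{self.frames_emulated}"


def native_step(emu, action, frameskip):
    """Frameskip inside the env: emulate every frame, render only the last."""
    reward = sum(emu.emulate(action) for _ in range(frameskip))
    return emu.render(), reward


def wrapper_step(emu, action, frameskip):
    """Frameskip outside the env: each inner step also renders its frame."""
    reward = 0.0
    obs = None
    for _ in range(frameskip):
        reward += emu.emulate(action)
        obs = emu.render()  # rendered even though all but the last are discarded
    return obs, reward


native, wrapped = ToyEmulator(), ToyEmulator()
native_result = native_step(native, action=0, frameskip=4)
wrapped_result = wrapper_step(wrapped, action=0, frameskip=4)
print(native.frames_rendered, wrapped.frames_rendered)
```

Both variants return the same final observation and summed reward, but under these assumptions the wrapper version renders once per emulated frame while the native version renders once per step.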
With ale-py being transferred to the Farama Foundation, we wish to outline our plan going forward. If you are interested in contributing, please join our Discord and post a message in the ale-py channel.