jonathan-laurent / AlphaZero.jl

A generic, simple and fast implementation of Deepmind's AlphaZero algorithm.
https://jonathan-laurent.github.io/AlphaZero.jl/stable/
MIT License
1.24k stars 140 forks

Generic POMDP Support #212

Closed by nateybear 8 months ago

nateybear commented 8 months ago

Hello! Apologies if this has been asked elsewhere.

I remember seeing that a goal for AlphaZero.jl is to support any class of POMDP. Is there any update as to the status of that milestone? As an end user this would be a huge deal for me, though I’m not familiar enough with the ecosystem and research to implement it myself at this point.

Features that would be useful to me include multi-agent games, incomplete information, and continuous (or mixed discrete-continuous) action spaces.

Any insight on plans for these things or implementations in other package ecosystems would be much appreciated!

jonathan-laurent commented 8 months ago

Supporting general POMDPs is not a short-term goal for AlphaZero.jl. Dealing with multiplayer games and imperfect information requires pretty different algorithms, and AlphaZero would be a bad fit in many, if not most, cases. You may find some discussion about this in other issues.