werner-duvaud / muzero-general

MuZero
https://github.com/werner-duvaud/muzero-general/wiki/MuZero-Documentation
MIT License

Question: Does muzero-general support 2 player games with simultaneous action selection? #207

Open moscoso opened 1 year ago

moscoso commented 1 year ago

I am trying to use muzero-general for Race for the Galaxy. In that game, each player makes moves simultaneously (at the same time).

As an ML newbie, I'd like to ask: how does one implement the policy network and MCTS so that they account for how the other player's choice affects the action step?

JohnPPP commented 1 year ago

Hi,

Very interesting!

I love Race for the Galaxy!

Not sure if MuZero can handle the uncertainty of the cards drawn, since the same set of actions can have a different impact on the board. I would try a simple card game first, just to be sure. Can MuZero play blackjack?

Back to your question: hide the info from the env_observable_stack until both players have made their decisions. It's not simultaneous, but it's hidden. The end result should be the same.
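To make the "hidden, not simultaneous" idea concrete, here is a minimal sketch of an observation builder that never leaks a committed action to the other player. The state layout (a dict with `public` and `pending` keys) and the function name `observation_for` are assumptions made up for illustration; they are not part of muzero-general.

```python
def observation_for(state, viewer):
    """Return the features that `viewer` is allowed to see.

    `state` is a hypothetical dict:
        {"public": [floats...], "pending": {player_index: action}}
    The opponent's committed action is reduced to a 0/1 "has committed" flag.
    The viewer's own commitment is shown in full (-1.0 means "none yet").
    """
    features = list(state["public"])
    for player in (0, 1):
        action = state["pending"].get(player)
        if player == viewer:
            # A player always knows their own committed action.
            features.append(-1.0 if action is None else float(action))
        else:
            # Only reveal WHETHER the opponent has committed, never what.
            features.append(0.0 if action is None else 1.0)
    return features


# Two states that differ only in what player 0 secretly chose.
state_a = {"public": [3.0, 1.0], "pending": {0: 2}}
state_b = {"public": [3.0, 1.0], "pending": {0: 5}}

# Player 1 cannot tell them apart; player 0 can.
print(observation_for(state_a, 1) == observation_for(state_b, 1))
print(observation_for(state_a, 0) == observation_for(state_b, 0))
```

Because the second player's observation is identical whatever the first player picked, the second choice cannot depend on the first, which is the property simultaneous selection needs.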

Best of luck, João


moscoso commented 1 year ago

Hi, JohnPPP, glad to hear you love that game too.

I know that the cards in hand and in the deck are hidden from each player, which adds a stochastic component to the problem. However, I wouldn't mind revealing them for the sake of a first iteration of a MuZero implementation. I also have some grasp of where to begin tackling the chance of draws, thanks to this paper: https://openreview.net/forum?id=X6D9bAHhBQ1

What I am more focused on, and interested in, is how to implement and/or express the action / policy network when the AI makes its choice alongside the opponent.

I have no idea where to begin my research or prototype there. Any ideas?

JohnPPP commented 1 year ago

Hi,

What if the first action is not shown at all? When the second player acts, the consequences of both actions are shown together. This way, everything else stays the same. It looks like the easiest first test.
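A toy sketch of that sequencing, using rock-paper-scissors as a stand-in for a simultaneous-move game. The method names (`reset`, `step`, `to_play`, `legal_actions`) are chosen to resemble a typical turn-based game wrapper, but the class is standalone and does not import or subclass anything from muzero-general. The reward convention (reward goes to the player who just acted) is also an assumption.

```python
class SequentialRPS:
    """Rock-paper-scissors played as two hidden, sequential half-steps."""

    # Action codes: 0 = rock, 1 = paper, 2 = scissors.
    # BEATS[a] is the action that a defeats.
    BEATS = {0: 2, 1: 0, 2: 1}

    def __init__(self):
        self.reset()

    def reset(self):
        self.pending = None  # player 0's committed, still hidden action
        self.player = 0
        return self.observation()

    def to_play(self):
        return self.player

    def legal_actions(self):
        return [0, 1, 2]

    def observation(self):
        # Reveals only WHETHER a move is committed, never which one.
        return [0.0 if self.pending is None else 1.0]

    def step(self, action):
        if self.pending is None:
            # First half-step: store the action, show nothing, no reward.
            self.pending = action
            self.player = 1
            return self.observation(), 0, False
        # Second half-step: resolve both actions at once.
        first, second = self.pending, action
        self.pending = None
        self.player = 0
        if first == second:
            reward = 0
        elif self.BEATS[second] == first:
            reward = 1   # the player who just acted (player 1) wins
        else:
            reward = -1  # the player who just acted loses
        return self.observation(), reward, True


game = SequentialRPS()
print(game.step(0))  # player 0 commits rock; nothing is revealed yet
print(game.step(1))  # player 1 plays paper; both moves resolve together
```

From the search's point of view this is an ordinary alternating two-player game; the only change is that the intermediate observation carries no information about the first move.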

Btw, I could never train anything with this implementation. Perhaps I am doing something wrong, but it just plays badly at everything, even tic-tac-toe.

Could you get any results?

All the best, Joao
