SecurityGames / oef

Repository for the submission of NeurIPS Datasets and Benchmarks Track 2022.

player_id: -4 ??? leduc with 3 players #2

Closed Waiting-TT closed 2 years ago

Waiting-TT commented 2 years ago

Can mb_psro_rl_oracle handle poker games with more than two players (for example 3, 4, or 5)?

SecurityGames commented 2 years ago

Yes. Because PSRO with the Alpha-Rank algorithm as the meta-solver can handle games with more than two players, model-based PSRO can also solve multi-player games.

TzuRen commented 2 years ago

Running run_mb_psro on Leduc with 3 players raises an error in which player_id is -4.

SecurityGames commented 2 years ago

player_id = -4 is used to mark the end (terminal) state. Please refer to the network/env_model.py file. [image]
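To make the sentinel value easier to spot in logs, the reserved ids can be given names. The sketch below is not taken from network/env_model.py; it assumes the ids follow OpenSpiel's `pyspiel.PlayerId` convention (chance = -1, simultaneous = -2, invalid = -3, terminal = -4), which is consistent with -4 marking the end state. `SpecialPlayerId`, `is_acting_player`, and `describe_player_id` are hypothetical names introduced for illustration.

```python
from enum import IntEnum


class SpecialPlayerId(IntEnum):
    """Local mirror of OpenSpiel-style sentinel player ids (assumed values)."""

    CHANCE = -1
    SIMULTANEOUS = -2
    INVALID = -3
    TERMINAL = -4  # the value reported in this thread


def is_acting_player(player_id: int, num_players: int) -> bool:
    """Return True only for ids that can index a per-player list of agents."""
    return 0 <= player_id < num_players


def describe_player_id(player_id: int, num_players: int) -> str:
    """Return a human-readable label for a player id, e.g. for log messages."""
    if is_acting_player(player_id, num_players):
        return f"player {player_id}"
    try:
        return SpecialPlayerId(player_id).name
    except ValueError:
        return f"unknown id {player_id}"


if __name__ == "__main__":
    # In a 3-player game, -4 is a sentinel rather than a seat at the table.
    print(describe_player_id(-4, num_players=3))
    print(describe_player_id(1, num_players=3))
```

A check like `is_acting_player` before indexing `agents` would turn a bare IndexError into a more explicit signal that the episode has already ended.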

Waiting-TT commented 2 years ago

I0726 21:31:12.822597 139731719901376 rl_environment.py:190] Using game instance: leduc_poker
Game : leduc_poker
Seed: 1
Using 1000 sims per entry.
Rectifier :
Perturbating oracle outputs : False
Sampling from marginals : True
Using <function alpharank_strategy at 0x7f13fafb3510> as strategy method.
Using <function filter_function_factory.<locals>.filter_policies at 0x7f143bbf6840> as training strategy selector.
Iteration : 0
Time so far: 4.291534423828125e-05
Traceback (most recent call last):
  File "/home/ai/oef-main/oef_psro/run_mb_psro_leduc3.py", line 230, in <module>
    app.run(main)
  File "/home/ai/apps/miniconda3/envs/marlgraph36/lib/python3.6/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/home/ai/apps/miniconda3/envs/marlgraph36/lib/python3.6/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/home/ai/oef-main/oef_psro/run_mb_psro_leduc3.py", line 210, in main
    g_psro_solver = gpsro_looper(env, oracle, agents)
  File "/home/ai/oef-main/oef_psro/run_mb_psro_leduc3.py", line 159, in gpsro_looper
    g_psro_solver.iteration()
  File "/home/ai/apps/miniconda3/envs/marlgraph36/lib/python3.6/site-packages/open_spiel/python/algorithms/psro_v2/abstract_meta_trainer.py", line 192, in iteration
    self.update_agents()  # Generate new, Best Response agents via oracle.
  File "/home/ai/oef-main/oef_psro/mb_psro.py", line 163, in update_agents
    using_joint_strategies=self._rectify_training or not self.sample_from_marginals)
  File "/home/ai/oef-main/oef_psro/mb_psro_rl_oracle.py", line 136, in __call__
    self._rollout(agents)
  File "/home/ai/oef-main/oef_psro/mb_psro_rl_oracle.py", line 107, in _rollout
    self.sample_episode(None, agents, is_evaluation=False)
  File "/home/ai/oef-main/oef_psro/mb_psro_rl_oracle.py", line 57, in sample_episode
    agent_output = agents[player_id].step(time_step, is_evaluation=is_evaluation)
IndexError: list index out of range

Process finished with exit code 1

lcskxj commented 2 years ago

Do you know the player_id and the size of agents when the error occurs?
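One way to collect that information is to validate the index right before the lookup and include both values in the error message. This is a minimal sketch, not code from the repository: `safe_agent_lookup` is a hypothetical helper, and the string entries stand in for real agent objects. It also illustrates a Python detail relevant here: a list of three agents accepts indices -3 through 2, so -4 raises IndexError, whereas a negative id within range would silently pick an agent from the end of the list.

```python
def safe_agent_lookup(agents, player_id):
    """Return agents[player_id], rejecting negative or out-of-range ids.

    Hypothetical helper: it reports the values asked about in the thread
    (player_id and the size of agents) instead of a bare IndexError.
    """
    if not 0 <= player_id < len(agents):
        raise IndexError(
            f"player_id={player_id} cannot index agents of size {len(agents)}"
        )
    return agents[player_id]


if __name__ == "__main__":
    agents = ["agent_0", "agent_1", "agent_2"]  # placeholders for 3 agents

    # Plain indexing with -3 wraps around to the first agent without error.
    print(agents[-3])

    # The helper reports both values when the id is not a valid seat.
    try:
        safe_agent_lookup(agents, -4)
    except IndexError as err:
        print(err)
```

Temporarily swapping a guarded lookup like this into the failing line would show whether the id is a sentinel such as -4 or a genuine off-by-one in the number of agents.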

Waiting-TT commented 2 years ago

With PSRO on Leduc with 3 players, player_id becomes -4. It seems that run_mb_psro.py doesn't support Leduc with 3 players.

SecurityGames commented 2 years ago

It may be caused by the additional condition in the while loop (see the red frame). Just delete that condition and it should work. [image]
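The condition that was removed is only visible in the screenshot, so the following is not the repository's fix. It is a rough sketch of the general pattern, under the assumption that the rollout loop should stop as soon as the environment reports the terminal state and only then index `agents`. `ToyTimeStep`, `ToyEnv`, and this `sample_episode` are hypothetical stand-ins, and agents are modelled as plain callables rather than objects with a `step` method.

```python
TERMINAL_PLAYER_ID = -4  # end-state marker mentioned in this thread


class ToyTimeStep:
    """Hypothetical time step that carries only the id of the player to act."""

    def __init__(self, player_id):
        self.player_id = player_id

    def last(self):
        return self.player_id == TERMINAL_PLAYER_ID


class ToyEnv:
    """Hypothetical environment: players act in turn for a fixed number of moves."""

    def __init__(self, num_players, num_moves):
        self._num_players = num_players
        self._num_moves = num_moves
        self._moves_made = 0

    def _current_time_step(self):
        if self._moves_made >= self._num_moves:
            return ToyTimeStep(TERMINAL_PLAYER_ID)
        return ToyTimeStep(self._moves_made % self._num_players)

    def reset(self):
        self._moves_made = 0
        return self._current_time_step()

    def step(self, action):
        self._moves_made += 1
        return self._current_time_step()


def sample_episode(env, agents):
    """Roll out one episode, checking for the terminal state before indexing."""
    trace = []
    time_step = env.reset()
    while not time_step.last():
        player_id = time_step.player_id
        action = agents[player_id](time_step)  # safe: player_id is a real seat
        trace.append((player_id, action))
        time_step = env.step(action)
    return trace


if __name__ == "__main__":
    three_agents = [lambda ts, i=i: 10 + i for i in range(3)]
    print(sample_episode(ToyEnv(num_players=3, num_moves=5), three_agents))
```

Because the terminal check is the only loop condition, the sentinel id never reaches the `agents[player_id]` lookup, regardless of how many players the game has.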

SecurityGames commented 2 years ago

The change has been committed. Sorry for the inconvenience.