The TTCP CAGE Challenges are a series of public challenges created to foster the development of autonomous cyber-defensive agents. CAGE Challenge 4 (CC4) returns to a defence-industry enterprise environment and introduces a Multi-Agent Reinforcement Learning (MARL) scenario.
In https://github.com/cage-challenge/cage-challenge-4/blob/main/documentation/docs/pages/tutorials/01_Getting_Started/3_Training_Agents.md, the example below passes `policy_mapping_fn=policy_mapper_func`, but the preceding code in that tutorial defines the function as `policy_mapper`. Should the example reference `policy_mapper` instead?
```python
algo_config = (
    PPOConfig()
    .environment(env="CC4")
    .multi_agent(
        policies={
            ray_agent: PolicySpec(
                policy_class=None,
                observation_space=env.observation_space(cyborg_agent),
                action_space=env.action_space(cyborg_agent),
                config={"gamma": 0.85},
            )
            for cyborg_agent, ray_agent in POLICY_MAP.items()
        },
        policy_mapping_fn=policy_mapper_func,
    )
)
```
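For context, here is a minimal sketch of the kind of mapping function the tutorial defines. The `POLICY_MAP` contents below are illustrative placeholders, not the tutorial's actual values; the point is only that whatever name the function is given (`policy_mapper` or `policy_mapper_func`) must match the name passed to `policy_mapping_fn`:

```python
# Illustrative sketch only: POLICY_MAP contents are invented placeholders.
# RLlib calls the mapping function with an agent id (plus episode/worker
# context) and expects the policy id to use for that agent.
POLICY_MAP = {f"blue_agent_{i}": f"Agent{i}" for i in range(5)}

def policy_mapper(agent_id, episode=None, worker=None, **kwargs):
    """Map a CybORG agent id to its RLlib policy id via POLICY_MAP."""
    return POLICY_MAP[agent_id]
```

If the tutorial's function is indeed named `policy_mapper`, then passing `policy_mapping_fn=policy_mapper_func` would raise a `NameError` when the config is built.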