As developers, we want the policy to be trained against opponents that keep getting better. This way it learns to handle more complex opponent strategies instead of exploiting a simple, never-changing opponent.
Why do we need to do this?
This story is needed so that the policy learns to handle more complex opponent strategies rather than exploiting a simple, never-changing opponent.
Tasks for this user story
[x] Get the environment to work with SuperSuit, PettingZoo, and Stable Baselines3
[x] Create a unit test that creates a multi-agent environment
[x] Make the environment use PettingZoo
[x] Let Stable Baselines3 train on this multi-agent environment
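The unit-test task above could look roughly like the sketch below. It is not the project's actual environment or test: `TwoPlayerMatchingEnv` and `check_multi_agent_env` are hypothetical names, and the class only mimics the shape of PettingZoo's Parallel API (as we understand recent versions: `reset` returns observations and infos, `step` returns observations, rewards, terminations, truncations, and infos, all keyed by agent name). It deliberately avoids importing PettingZoo so that it runs on its own.

```python
class TwoPlayerMatchingEnv:
    """Hypothetical toy environment shaped like a PettingZoo parallel env.

    Zero-sum "matching pennies": player_0 wins a step when both actions
    match, player_1 wins when they differ.
    """

    def __init__(self, max_steps=5):
        self.possible_agents = ["player_0", "player_1"]
        self.max_steps = max_steps
        self.agents = []
        self._steps = 0

    def reset(self, seed=None):
        # seed is accepted to mirror the assumed API; this toy env is deterministic.
        self.agents = list(self.possible_agents)
        self._steps = 0
        observations = {agent: 0 for agent in self.agents}
        infos = {agent: {} for agent in self.agents}
        return observations, infos

    def step(self, actions):
        a0, a1 = actions["player_0"], actions["player_1"]
        r0 = 1 if a0 == a1 else -1
        rewards = {"player_0": r0, "player_1": -r0}
        # Each agent observes the opponent's last action.
        observations = {"player_0": a1, "player_1": a0}
        self._steps += 1
        done = self._steps >= self.max_steps
        terminations = {agent: False for agent in self.agents}
        truncations = {agent: done for agent in self.agents}
        infos = {agent: {} for agent in self.agents}
        if done:
            # An empty agent list signals the end of the episode.
            self.agents = []
        return observations, rewards, terminations, truncations, infos


def check_multi_agent_env():
    """Create the env, play one episode, and return the summed rewards."""
    env = TwoPlayerMatchingEnv(max_steps=3)
    observations, infos = env.reset(seed=0)
    assert set(observations) == set(env.possible_agents)
    totals = {agent: 0 for agent in env.possible_agents}
    while env.agents:
        # Both agents always play action 1, so player_0 wins every step.
        actions = {agent: 1 for agent in env.agents}
        observations, rewards, terminations, truncations, infos = env.step(actions)
        assert set(rewards) == set(env.possible_agents)
        for agent, reward in rewards.items():
            totals[agent] += reward
    return totals
```

In the real project the same checks would run against the PettingZoo-based environment, and the training task would then wrap that environment (for example with SuperSuit's vector-environment wrappers) before handing it to Stable Baselines3.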
Acceptance criteria
This user story is done when policies are trained in a multi-agent environment.