HumanCompatibleAI / adversarial-policies

Find best-response to a fixed policy in multi-agent RL
MIT License
275 stars 47 forks source link