HumanCompatibleAI / adversarial-policies

Find best-response to a fixed policy in multi-agent RL
MIT License
275 stars, 47 forks

Question about the victim #46

Closed Jarvis-K closed 4 years ago

Jarvis-K commented 4 years ago

As mentioned in the paper, the victim is fixed during training, but I cannot find where the victim's checkpoint is stored. Could you please point me to it?

Jarvis-K commented 4 years ago

I have since found it in the gym_compete package, so I will close this issue.
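
For anyone else searching for the same files, here is a minimal sketch of one way to list the data files shipped inside an installed package. It assumes the package is importable under the name `gym_compete` and that the checkpoints use a `.pkl` suffix; both are assumptions, not details confirmed in this thread. `list_package_files` is a hypothetical helper, not part of the repository.

```python
import importlib.util
from pathlib import Path


def list_package_files(package_name, suffixes=(".pkl",)):
    """Return files under an installed package's directory ending in one of `suffixes`."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        # Package is not installed, or it is a single-file module with no directory.
        return []
    found = []
    for location in spec.submodule_search_locations:
        # Walk the package directory recursively and keep matching files.
        for path in Path(location).rglob("*"):
            if path.is_file() and path.suffix in suffixes:
                found.append(path)
    return sorted(found)


if __name__ == "__main__":
    # "gym_compete" and the ".pkl" suffix are assumptions; adjust to your install.
    for path in list_package_files("gym_compete"):
        print(path)
```

If nothing is printed, the package may not be installed in the active environment, or the files may use a different suffix; passing a wider `suffixes` tuple is one way to check.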