HumanCompatibleAI / population-irl

(Experimental) Inverse reinforcement learning from trajectories generated by multiple agents with different (but correlated) rewards
MIT License

Paper link #19

Closed: gunshi closed this issue 5 years ago

gunshi commented 5 years ago

Hey Adam, is there a paper associated with this repo? That would help me understand the algorithm/problem better. Thanks! Gunshi

AdamGleave commented 5 years ago

Yes: https://arxiv.org/abs/1805.08882