issues
search
HumanCompatibleAI
/
population-irl
(Experimental) Inverse reinforcement learning from trajectories generated by multiple agents with different (but correlated) rewards
MIT License
26
stars
2
forks
source link
Implement MaxEnt IRL
#2
Closed
AdamGleave
closed
6 years ago