HumanCompatibleAI/population-irl
(Experimental) Inverse reinforcement learning from trajectories generated by multiple agents with different (but correlated) rewards
MIT License
26 stars
2 forks
Implemented Reacher PIRL version #14
Closed
Discordius closed 6 years ago