HumanCompatibleAI / population-irl

(Experimental) Inverse reinforcement learning from trajectories generated by multiple agents with different (but correlated) rewards
MIT License
26 stars, 2 forks

Improved caching support #17

Closed: AdamGleave closed this issue 6 years ago

AdamGleave commented 6 years ago

Mostly addressed by commit 11ac51