HumanCompatibleAI / adversarial-policies

Find best-response to a fixed policy in multi-agent RL
MIT License

Lookback #10

Closed by kantneel 5 years ago

kantneel commented 5 years ago

This branch adds features for running experiments in which we spawn "lookback" environments and compare rollouts from those environments against rollouts from the environments where we train our own policy.

Most of the additions are in modelfree.common.lookback; there are also changes to the Mujoco wrappers and data structures in aprl.common.mujoco.

kantneel commented 5 years ago

This branch relies on a branch of CHAI's baselines repo that has not been merged yet. The only change in that branch is the addition of __getattr__ to VecEnvWrapper. The full diff between adv-policies and that branch is in this commit: https://github.com/HumanCompatibleAI/baselines/commit/940546d3aad57d89e632e4578d88320920b462e4
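For readers unfamiliar with the pattern, here is a minimal sketch of what adding __getattr__ to a wrapper class typically looks like. It is not the code from the linked commit: VecEnvWrapperSketch, ToyVecEnv, and get_sim_state are hypothetical names chosen for illustration, and the sketch assumes the wrapper stores the wrapped environment in a venv attribute.

```python
class ToyVecEnv:
    """Hypothetical stand-in for a vectorized environment."""

    def __init__(self, num_envs):
        self.num_envs = num_envs

    def get_sim_state(self):
        # Hypothetical environment-specific method that a generic wrapper
        # would not know about in advance.
        return "sim-state"


class VecEnvWrapperSketch:
    """Hypothetical stand-in for a wrapper class (not the baselines class)."""

    def __init__(self, venv):
        self.venv = venv

    def __getattr__(self, name):
        # Python calls __getattr__ only after normal attribute lookup fails,
        # so attributes defined on the wrapper itself take precedence.
        if name == "venv":
            # Guard against infinite recursion if venv is not set yet,
            # for example while the object is being unpickled.
            raise AttributeError(name)
        # Otherwise forward the lookup to the wrapped environment.
        return getattr(self.venv, name)


# Delegation chains through nested wrappers down to the base environment.
wrapped = VecEnvWrapperSketch(VecEnvWrapperSketch(ToyVecEnv(num_envs=4)))
print(wrapped.num_envs)
print(wrapped.get_sim_state())
```

With this kind of delegation, code that holds only the outermost wrapper can still reach attributes and methods defined on an inner environment. Names that exist nowhere in the chain still raise AttributeError.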

codecov[bot] commented 5 years ago

Codecov Report

Merging #10 into master will increase coverage by 1.45%. The diff coverage is 82.89%.


@@            Coverage Diff             @@
##           master      #10      +/-   ##
==========================================
+ Coverage   53.48%   54.94%   +1.45%     
==========================================
  Files          48       50       +2     
  Lines        4042     4288     +246     
==========================================
+ Hits         2162     2356     +194     
- Misses       1880     1932      +52
| Flag | Coverage Δ | |
|---|---|---|
| #aprl | 12.38% <3.83%> (-0.78%) | :arrow_down: |
| #modelfree | 46.99% <81.41%> (+2.21%) | :arrow_up: |

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/modelfree/common/transparent.py | 75% <ø> (-7.36%) | :arrow_down: |
| src/aprl/common/multi_monitor.py | 94.73% <100%> (+0.29%) | :arrow_up: |
| src/modelfree/score_agent.py | 83.43% <100%> (ø) | :arrow_up: |
| src/aprl/agents/monte_carlo.py | 93.57% <100%> (+1.47%) | :arrow_up: |
| src/modelfree/common/utils.py | 88.44% <100%> (+0.99%) | :arrow_up: |
| src/aprl/envs/multi_agent.py | 76.33% <100%> (-2.21%) | :arrow_down: |
| src/modelfree/common/policy_loader.py | 84.04% <100%> (ø) | :arrow_up: |
| src/aprl/agents/__init__.py | 100% <100%> (ø) | :arrow_up: |
| src/aprl/common/mujoco.py | 80% <64.28%> (-20%) | :arrow_down: |
| src/modelfree/training/lookback.py | 77.94% <77.94%> (ø) | |
| ... and 5 more | | |


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update c93e3c5...d65458d.