HumanCompatibleAI / evaluating-rewards

Library to compare and evaluate reward functions
https://arxiv.org/abs/2006.13900
Apache License 2.0

Deep implementation of new metric #21

Closed. AdamGleave closed this 4 years ago

AdamGleave commented 4 years ago

The main conceptual change is the addition of canonical_sample.py, a sample-based implementation of the new metric, which canonicalizes the reward, for continuous control environments. The tabular version is in https://github.com/HumanCompatibleAI/evaluating-rewards/pull/19
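For readers unfamiliar with the idea, here is a minimal NumPy sketch of a sample-based canonicalization followed by a Pearson distance. It assumes the metric is the EPIC distance from the linked arXiv paper, where the canonical reward is R(s,a,s') + E[gamma R(s',A,S') - R(s,A,S') - gamma R(S,A,S')]. It is not the code in canonical_sample.py: every name below (`canonicalize`, `pearson_distance`, `epic_distance`, the toy rewards, the Gaussian samples) is hypothetical and chosen only for illustration.

```python
import numpy as np


def canonicalize(reward_fn, obs, act, next_obs, obs_samples, act_samples, gamma):
    """Sample-based estimate of the canonicalized reward on a batch of transitions.

    reward_fn maps batched (obs, act, next_obs) arrays to a 1-D reward array.
    obs_samples and act_samples (m rows each) stand in for the state and action
    distributions over which the expectations are taken (an assumption here).
    """
    m = len(obs_samples)

    def mean_reward_from(states):
        # For each row x of `states`, average R(x, A_j, S'_j) over the m samples.
        k = len(states)
        s_rep = np.repeat(states, m, axis=0)   # x_0 m times, then x_1, ...
        a_rep = np.tile(act_samples, (k, 1))   # A_0..A_{m-1}, repeated k times
        ns_rep = np.tile(obs_samples, (k, 1))  # S'_0..S'_{m-1}, repeated k times
        return reward_fn(s_rep, a_rep, ns_rep).reshape(k, m).mean(axis=1)

    const = mean_reward_from(obs_samples).mean()  # estimate of E[R(S, A, S')]
    return (
        reward_fn(obs, act, next_obs)
        + gamma * mean_reward_from(next_obs)
        - mean_reward_from(obs)
        - gamma * const
    )


def pearson_distance(x, y):
    """sqrt((1 - rho) / 2), which lies in [0, 1]."""
    rho = np.corrcoef(x, y)[0, 1]
    return float(np.sqrt(max(0.0, 1.0 - rho) / 2.0))


def epic_distance(reward_a, reward_b, batch, obs_samples, act_samples, gamma):
    """Pearson distance between the two canonicalized rewards on `batch`."""
    obs, act, next_obs = batch
    ca = canonicalize(reward_a, obs, act, next_obs, obs_samples, act_samples, gamma)
    cb = canonicalize(reward_b, obs, act, next_obs, obs_samples, act_samples, gamma)
    return pearson_distance(ca, cb)


# Toy data: 3-D observations, 2-D actions, all drawn from standard Gaussians.
rng = np.random.default_rng(0)
gamma = 0.99
obs, next_obs = rng.normal(size=(2, 256, 3))
act = rng.normal(size=(256, 2))
obs_samples = rng.normal(size=(64, 3))
act_samples = rng.normal(size=(64, 2))


def base(s, a, ns):
    return -np.sum((ns - s) ** 2, axis=1) - 0.1 * np.sum(a ** 2, axis=1)


def shaped(s, a, ns):
    # Positive rescaling plus potential shaping with potential phi(s) = sum(s).
    return 3.0 * base(s, a, ns) + gamma * ns.sum(axis=1) - s.sum(axis=1)


def negated(s, a, ns):
    return -base(s, a, ns)


batch = (obs, act, next_obs)
d_same = epic_distance(base, shaped, batch, obs_samples, act_samples, gamma)
d_opposite = epic_distance(base, negated, batch, obs_samples, act_samples, gamma)
print(d_same, d_opposite)
```

In this sketch the shaped and rescaled reward should come out at a distance of approximately zero from the base reward, because the shaping terms cancel in the canonicalization and Pearson correlation ignores positive scaling. The negated reward should land at the maximum distance of about one.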

There are also a number of ancillary changes:

codecov[bot] commented 4 years ago

Codecov Report

Merging #21 into master will increase coverage by 2.20%. The diff coverage is 88.60%.


@@            Coverage Diff             @@
##           master      #21      +/-   ##
==========================================
+ Coverage   84.28%   86.48%   +2.20%     
==========================================
  Files          46       54       +8     
  Lines        3098     3486     +388     
==========================================
+ Hits         2611     3015     +404     
+ Misses        487      471      -16     
Impacted Files Coverage Δ
...nalysis/reward_figures/gridworld_reward_heatmap.py 96.22% <ø> (ø)
..._rewards/analysis/reward_figures/plot_pm_reward.py 87.80% <ø> (ø)
src/evaluating_rewards/experiments/synthetic.py 83.62% <22.22%> (-5.17%) :arrow_down:
..._rewards/analysis/dissimilarity_heatmaps/config.py 57.14% <57.14%> (ø)
...ds/analysis/dissimilarity_heatmaps/reward_masks.py 62.00% <62.00%> (ø)
...s/dissimilarity_heatmaps/plot_gridworld_heatmap.py 92.30% <88.46%> (ø)
...ewards/analysis/dissimilarity_heatmaps/heatmaps.py 88.88% <88.88%> (ø)
src/evaluating_rewards/tabular.py 72.34% <93.54%> (+12.78%) :arrow_up:
...analysis/dissimilarity_heatmaps/transformations.py 95.65% <95.65%> (ø)
...alysis/dissimilarity_heatmaps/plot_epic_heatmap.py 96.34% <96.34%> (ø)
... and 25 more


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update cb185d5...9768c68.