Closed AdamGleave closed 4 years ago
Merging #21 into master will increase coverage by 2.20%. The diff coverage is 88.60%.
@@ Coverage Diff @@
## master #21 +/- ##
==========================================
+ Coverage 84.28% 86.48% +2.20%
==========================================
Files 46 54 +8
Lines 3098 3486 +388
==========================================
+ Hits 2611 3015 +404
+ Misses 487 471 -16
Impacted Files | Coverage Δ | |
---|---|---|
...nalysis/reward_figures/gridworld_reward_heatmap.py | 96.22% <ø> (ø) | |
..._rewards/analysis/reward_figures/plot_pm_reward.py | 87.80% <ø> (ø) | |
src/evaluating_rewards/experiments/synthetic.py | 83.62% <22.22%> (-5.17%) | :arrow_down: |
..._rewards/analysis/dissimilarity_heatmaps/config.py | 57.14% <57.14%> (ø) | |
...ds/analysis/dissimilarity_heatmaps/reward_masks.py | 62.00% <62.00%> (ø) | |
...s/dissimilarity_heatmaps/plot_gridworld_heatmap.py | 92.30% <88.46%> (ø) | |
...ewards/analysis/dissimilarity_heatmaps/heatmaps.py | 88.88% <88.88%> (ø) | |
src/evaluating_rewards/tabular.py | 72.34% <93.54%> (+12.78%) | :arrow_up: |
...analysis/dissimilarity_heatmaps/transformations.py | 95.65% <95.65%> (ø) | |
...alysis/dissimilarity_heatmaps/plot_epic_heatmap.py | 96.34% <96.34%> (ø) | |
... and 25 more | | |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cb185d5...9768c68.
Main conceptual change is the addition of `canonical_sample.py`, an implementation of the new metric based on canonicalizing the reward for continuous control environments. The tabular version is in https://github.com/HumanCompatibleAI/evaluating-rewards/pull/19.

There are also a number of ancillary changes:
- Update `tabular` now that we're handling much larger arrays.
- Rename `plot_gridworld_divergence` to `plot_epic_heatmap` and generally refactor the visualization code to be much more modular.
- Add `plot_canon_heatmap`, the analog of `plot_epic_heatmap` for the new distance.
- Add tests for `canonical_sample`, and some additional tests in `test_scripts` for E2E coverage of the new scripts and greater coverage of the old heatmap scripts.
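
For readers unfamiliar with the idea, here is a minimal sketch of a sample-based canonicalized reward distance. It is not the code in `canonical_sample.py`. It assumes the canonical form R(s,a,s') + E[γR(s',A,S') − R(s,A,S') − γR(S,A,S')] followed by a Pearson distance, and every name in it (`canonicalize_sample`, `pearson_distance`, `canonical_distance`, the toy rewards) is hypothetical and chosen for illustration only.

```python
import math
import random
from itertools import product


def canonicalize_sample(reward, transitions, states, actions, gamma):
    """Estimate a canonically shaped reward on each (s, a, s') transition.

    `reward(s, a, s_next) -> float`. `states` and `actions` are samples that
    stand in for the state and action distributions (an assumption of this
    sketch, not the PR's API).
    """
    pairs = list(product(actions, states))  # sampled (A, S') pairs

    def mean_from(s):
        # Monte Carlo estimate of E[R(s, A, S')].
        return sum(reward(s, a, s2) for a, s2 in pairs) / len(pairs)

    # Estimate of E[R(S, A, S')], shared by every transition.
    overall = sum(mean_from(s) for s in states) / len(states)
    return [
        reward(s, a, s2) + gamma * mean_from(s2) - mean_from(s) - gamma * overall
        for s, a, s2 in transitions
    ]


def pearson_distance(x, y):
    """sqrt((1 - rho) / 2), where rho is the Pearson correlation of x and y."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    rho = cov / math.sqrt(var_x * var_y)
    return math.sqrt(max(0.0, (1.0 - rho) / 2.0))  # clamp guards against rounding


def canonical_distance(r1, r2, transitions, states, actions, gamma):
    c1 = canonicalize_sample(r1, transitions, states, actions, gamma)
    c2 = canonicalize_sample(r2, transitions, states, actions, gamma)
    return pearson_distance(c1, c2)


# Toy 1-D setup with hypothetical rewards.
GAMMA = 0.9
rng = random.Random(0)
states = [rng.uniform(-1, 1) for _ in range(16)]
actions = [rng.uniform(-1, 1) for _ in range(8)]
transitions = [
    (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(64)
]


def base(s, a, s2):
    return -abs(s2) - 0.1 * a * a


def shaped(s, a, s2):
    # `base` with potential shaping (potential 3 * s) and a positive rescaling.
    return 2.0 * (base(s, a, s2) + GAMMA * 3.0 * s2 - 3.0 * s)


def negated(s, a, s2):
    return -base(s, a, s2)


d_equivalent = canonical_distance(base, shaped, transitions, states, actions, GAMMA)
d_opposite = canonical_distance(base, negated, transitions, states, actions, GAMMA)
```

In this sketch `d_equivalent` comes out at zero up to rounding, because the same state sample is used for both S and S', so the shaping terms cancel exactly and the Pearson step ignores the positive rescaling. `d_opposite` comes out at one, the largest value the distance can take. The nested sums are quadratic in the number of state samples, so a real implementation on large arrays would want vectorized batches.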