Closed AdamGleave closed 3 years ago
Merging #43 into master will decrease coverage by 1.34%. The diff coverage is 89.70%.
@@ Coverage Diff @@
## master #43 +/- ##
==========================================
- Coverage 86.59% 85.25% -1.35%
==========================================
Files 63 65 +2
Lines 4304 4231 -73
==========================================
- Hits 3727 3607 -120
- Misses 577 624 +47
Impacted Files | Coverage Δ | |
---|---|---|
src/evaluating_rewards/__init__.py | 100.00% <ø> (ø) | |
.../evaluating_rewards/analysis/distances/__init__.py | 100.00% <ø> (ø) | |
...luating_rewards/analysis/distances/reward_masks.py | 62.00% <ø> (ø) | |
...ting_rewards/analysis/distances/transformations.py | 93.47% <ø> (ø) | |
...ting_rewards/analysis/reward_figures/point_mass.py | 79.10% <ø> (ø) | |
src/evaluating_rewards/distances/npec.py | 98.36% <ø> (ø) | |
src/evaluating_rewards/policies/monte_carlo.py | 0.00% <ø> (ø) | |
src/evaluating_rewards/scripts/script_utils.py | 53.33% <0.00%> (ø) | |
...wards/analysis/distances/plot_gridworld_heatmap.py | 89.56% <25.00%> (ø) | |
...luating_rewards/analysis/distances/plot_heatmap.py | 76.69% <76.69%> (ø) | |
... and 23 more | | |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ac6b9b4...a3a894c.
Currently, `plot_{epic,erc}_heatmap` compute EPIC and ERC distance (respectively) and then plot the results. They save the raw results, and can be run in a mode that loads previously recorded results rather than recomputing them, but this is not the default mode, and there is no script to compute EPIC and ERC distance without producing plots as a side effect.
This design is clunky, and makes it awkward to implement things like `table_combined` that tabulate results from multiple distance methods rather than plotting heatmaps.
This PR introduces a new script, `plot_heatmap`, which plots a heatmap from previously saved results. The remaining functionality is moved to `distances.epic` and `distances.erc`, which compute the distance between all pairs from a set of rewards and save aggregated results.
One design note: we save aggregated results (e.g. mean, lower CI, upper CI) rather than raw results, since the method of aggregation varies between distances. For EPIC (and NPEC), we aggregate point estimates from different seeds. For ERC, we can bootstrap directly on the returns.
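To illustrate the design note, here is a minimal sketch of two aggregation routes that both end in the same `mean` / `lower` / `upper` record, so a downstream plotting or tabulating script would not need to know which distance produced it. The function names, the dictionary keys, the use of the min/max range as the interval over seeds, and the percentile bootstrap are all assumptions made for illustration; they are not taken from the code in this PR.

```python
import random
import statistics
from typing import Callable, Dict, Sequence


def aggregate_seeds(estimates: Sequence[float]) -> Dict[str, float]:
    """Summarize per-seed point estimates (hypothetical EPIC/NPEC-style route).

    Assumption: the interval is simply the range over seeds.
    """
    return {
        "mean": statistics.mean(estimates),
        "lower": min(estimates),
        "upper": max(estimates),
    }


def aggregate_bootstrap(
    samples: Sequence[float],
    statistic: Callable[[Sequence[float]], float] = statistics.mean,
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Dict[str, float]:
    """Percentile bootstrap over raw samples (hypothetical ERC-style route)."""
    rng = random.Random(seed)
    n = len(samples)
    # Resample with replacement and recompute the statistic each time.
    stats = sorted(statistic(rng.choices(samples, k=n)) for _ in range(n_boot))
    lower_idx = int((alpha / 2) * n_boot)
    upper_idx = max(lower_idx, int((1 - alpha / 2) * n_boot) - 1)
    return {
        "mean": statistic(samples),
        "lower": stats[lower_idx],
        "upper": stats[upper_idx],
    }


if __name__ == "__main__":
    print(aggregate_seeds([0.2, 0.4, 0.6]))
    print(aggregate_bootstrap([0.1, 0.2, 0.3, 0.4, 0.5]))
```

Because both functions return the same keys, a consumer such as a heatmap or table script could read either result without branching on the distance method.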
There is also a variety of minor tidying.