AdamGleave closed 4 years ago
Merging #4 into master will increase coverage by 3.16%. The diff coverage is 34.48%.
@@ Coverage Diff @@
## master #4 +/- ##
==========================================
+ Coverage 68.2% 71.36% +3.16%
==========================================
Files 39 39
Lines 2365 2375 +10
==========================================
+ Hits 1613 1695 +82
+ Misses 752 680 -72
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/evaluating_rewards/policies.py | 77.19% <ø> (-0.4%) | :arrow_down: |
| src/evaluating_rewards/scripts/eval_policy.py | 0% <ø> (ø) | :arrow_up: |
| tests/common.py | 100% <100%> (ø) | :arrow_up: |
| src/evaluating_rewards/__init__.py | 100% <100%> (ø) | :arrow_up: |
| src/evaluating_rewards/experiments/visualize.py | 22.06% <9.52%> (+4.21%) | :arrow_up: |
| src/evaluating_rewards/envs/point_mass.py | 83.23% <0%> (+1.19%) | :arrow_up: |
| .../evaluating_rewards/scripts/visualize_pm_reward.py | 84.41% <0%> (+53.24%) | :arrow_up: |
| ...luating_rewards/experiments/point_mass_analysis.py | 88.63% <0%> (+59.09%) | :arrow_up: |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cefc21e...24e6df8.
This improved stability, but there is still significant room for improvement. Merging, but I will continue this theme in another PR.
Update to use the new PolicyMixture distribution for preferences, regress, and model comparison. This should result in more robust reward estimates and reward model similarity metrics.
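
For readers unfamiliar with the idea, here is a rough sketch of what sampling from a policy mixture could look like. It assumes the mixture picks one component policy per episode according to fixed weights. The names `sample_mixture_policy` and `rollout_actions` are hypothetical, and this is not the repository's actual PolicyMixture code.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical stand-in for a policy: maps an observation to an action.
Policy = Callable[[float], float]


def sample_mixture_policy(
    policies: Sequence[Policy],
    weights: Sequence[float],
    rng: random.Random,
) -> Policy:
    """Pick one component policy, with probability proportional to its weight."""
    return rng.choices(policies, weights=weights, k=1)[0]


def rollout_actions(
    policies: Sequence[Policy],
    weights: Sequence[float],
    episodes: Sequence[Sequence[float]],
    seed: int = 0,
) -> List[List[float]]:
    """Assumption: one policy is drawn per episode and used for every step in it."""
    rng = random.Random(seed)
    actions = []
    for observations in episodes:
        policy = sample_mixture_policy(policies, weights, rng)
        actions.append([policy(obs) for obs in observations])
    return actions


# Two toy policies: one always outputs 0, the other echoes the observation.
zero_policy: Policy = lambda obs: 0.0
echo_policy: Policy = lambda obs: obs

episodes = [[1.0, 2.0], [3.0, 4.0], [5.0]]
mixed = rollout_actions([zero_policy, echo_policy], [0.5, 0.5], episodes)
```

Under this assumption, the collected data covers the behaviour of several policies rather than just one, which is one plausible reason the resulting reward estimates would be more robust.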