HumanCompatibleAI / evaluating-rewards

Library to compare and evaluate reward functions
https://arxiv.org/abs/2006.13900
Apache License 2.0

Gridworld distance: restrict to physically realistic transitions #12

AdamGleave closed this 4 years ago

AdamGleave commented 4 years ago

Compute the gridworld distance over $\mathcal{D}_u$, the uniform transition dataset obtained by sampling $(s,a)$ uniformly at random and computing $s'$ deterministically. This is consistent with the methodology used for PointMaze.

Also tweaks the figure labels and sizing.
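
A minimal sketch of how a dataset like $\mathcal{D}_u$ could be built, for illustration only. It is not the code in this PR: the five-action set, the clip-at-the-boundary dynamics, and the names `ACTIONS`, `next_state` and `sample_uniform_transitions` are all assumptions. The point it shows is that only $(s,a)$ is random, while $s'$ always comes from the dynamics, so every sampled transition is physically realisable.

```python
import numpy as np

# Hypothetical action set (an assumption): stay, up, down, left, right.
ACTIONS = np.array([[0, 0], [-1, 0], [1, 0], [0, -1], [0, 1]])


def next_state(state, action, shape):
    """Deterministic dynamics: apply the move, clipping at the grid boundary.

    Clipping (the agent stays put when it walks into an edge) is an assumed
    rule for this sketch, not necessarily the repository's gridworld.
    """
    moved = np.asarray(state) + ACTIONS[action]
    clipped = np.clip(moved, 0, np.array(shape) - 1)
    return tuple(int(x) for x in clipped)


def sample_uniform_transitions(shape, n_samples, seed=0):
    """Sample (s, a) uniformly at random, then compute s' deterministically."""
    rng = np.random.default_rng(seed)
    rows = rng.integers(0, shape[0], size=n_samples)
    cols = rng.integers(0, shape[1], size=n_samples)
    acts = rng.integers(0, len(ACTIONS), size=n_samples)
    return [
        ((int(r), int(c)), int(a), next_state((r, c), a, shape))
        for r, c, a in zip(rows, cols, acts)
    ]


if __name__ == "__main__":
    for s, a, s_next in sample_uniform_transitions((3, 3), n_samples=5):
        print(s, a, s_next)
```

Because $s'$ is a function of $(s,a)$, the dataset never contains a transition that jumps between non-adjacent cells, which is what sampling $s'$ independently would allow.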

codecov[bot] commented 4 years ago

Codecov Report

Merging #12 into master will increase coverage by 0.08%. The diff coverage is 100%.


@@            Coverage Diff             @@
##           master      #12      +/-   ##
==========================================
+ Coverage   84.17%   84.25%   +0.08%     
==========================================
  Files          45       45              
  Lines        2799     2814      +15     
==========================================
+ Hits         2356     2371      +15     
  Misses        443      443
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/evaluating_rewards/analysis/stylesheets.py | 71.42% <ø> (ø) | :arrow_up: |
| ...ting_rewards/analysis/plot_gridworld_divergence.py | 95.12% <100%> (+0.6%) | :arrow_up: |
| ...c/evaluating_rewards/analysis/gridworld_heatmap.py | 96.22% <100%> (+0.09%) | :arrow_up: |
| src/evaluating_rewards/analysis/visualize.py | 83.85% <100%> (+0.16%) | :arrow_up: |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 4c3db3a...661ee47.