HumanCompatibleAI evaluating-rewards issues

HumanCompatibleAI / evaluating-rewards

Library to compare and evaluate reward functions

https://arxiv.org/abs/2006.13900

Apache License 2.0

61 stars, 7 forks

issues

fixing stable baselines dependency

#56 sibiraja closed 1 year ago
0
Conflicting versions of stable-baselines and imitation

#55 dashpritam opened 1 year ago
3
Switch to free mjkey Dockerfile

#54 AdamGleave opened 2 years ago
1
Custom gym environment support

#53 lucalazzaroni opened 3 years ago
6
Prettify checkpoint figures

#52 AdamGleave closed 3 years ago
1
Improve checkpoint figure and table generation

#51 AdamGleave closed 3 years ago
1
Integrate rollout return with combined_distances and add checkpoint comparison figure plotting

#50 AdamGleave closed 3 years ago
1
Improve PointMass experiment pipeline

#49 AdamGleave closed 3 years ago
1
Pathological visitation distributions

#48 AdamGleave closed 3 years ago
1
Clean up runners

#47 AdamGleave closed 4 years ago
1
Add Ray train experts script

#46 AdamGleave closed 4 years ago
2
Add NPEC support back to table_combined

#45 AdamGleave closed 4 years ago
1
Add `scripts.distances.npec` to perform NPEC comparisons in parallel using Ray

#44 AdamGleave closed 4 years ago
1
Separate distance computation and plotting scripts

#43 AdamGleave closed 4 years ago
1
Remove pickle backward compatibility workarounds

#42 AdamGleave opened 4 years ago
0
Add new visitation distributions for evaluating reward models

#41 AdamGleave closed 4 years ago
0
Authenticate to Docker on CircleCI

#40 AdamGleave closed 4 years ago
2
Could not find a version that satisfies the requirement tensorflow<1.16,>=1.15

#39 nbro opened 4 years ago
9
Upgrade dependencies

#38 AdamGleave closed 4 years ago
1
Update notebook, use separate version file

#37 AdamGleave closed 4 years ago
1
Update README and add EPIC demo notebook

#36 AdamGleave closed 4 years ago
1
Script to generate table of distances with confidence intervals

#35 AdamGleave closed 4 years ago
1
Improve distance heatmap plotting and calculation

#34 AdamGleave closed 4 years ago
1
Compact dissimilarity heatmaps

#33 AdamGleave closed 4 years ago
1
Add new macros and fix normalized NPEC

#32 AdamGleave closed 4 years ago
1
Rewards and heatmaps for gridworlds

#31 AdamGleave closed 4 years ago
1
Compute and report confidence intervals in heatmaps

#30 AdamGleave closed 4 years ago
1
Handle episode termination for potential shaping

#29 AdamGleave closed 4 years ago
1
Convert environments to fixed horizon

#28 AdamGleave closed 4 years ago
2
Add script to compute distance from episode return correlation

#27 AdamGleave closed 4 years ago
1
Upgrade pylint to 2.5

#26 AdamGleave closed 4 years ago
1
Update version of imitation replacing rewards.Batch with data.Transitions

#25 AdamGleave closed 4 years ago
1
Add CANON experiment configuration for PointMaze transfer

#24 AdamGleave closed 4 years ago
1
Make state/action distribution configurable in `plot_canon_heatmap`

#23 AdamGleave closed 4 years ago
1
Explicitly specify discount rate for reward models

#22 AdamGleave closed 4 years ago
1
Deep implementation of new metric

#21 AdamGleave closed 4 years ago
1
Hotfix for flaky tabular tests

#20 AdamGleave closed 4 years ago
2
Add tabular versions of new distance measures

#19 AdamGleave closed 4 years ago
1
Script to create double-blind version of source code

#18 AdamGleave closed 4 years ago
1
Support normalizing divergence heatmaps

#17 AdamGleave closed 4 years ago
1
Use benchmark_environments test code

#16 AdamGleave closed 4 years ago
1
Misc divergence heatmap improvements

#15 AdamGleave closed 4 years ago
1
Interpretability in realistic environments

#14 AdamGleave opened 4 years ago
1
Model comparison: NNLS initialization and alternating minimization

#13 AdamGleave closed 4 years ago
2
Gridworld distance: restrict to physically realistic transitions

#12 AdamGleave closed 4 years ago
1
Divergence of gridworld rewards and reward heatmap improvements

#11 AdamGleave closed 4 years ago
1
Heatmaps of reward for illustrative gridworlds

#10 AdamGleave closed 4 years ago
1
Improve PointMass Reward Heatmap Script

#9 AdamGleave closed 4 years ago
1
Script to plot divergence heatmap

#8 AdamGleave closed 4 years ago
1
Fix CodeCov and pylint

#7 AdamGleave closed 4 years ago
1