abhisheknaik96 / differential-value-iteration

Experiments in creating the ultimate average-reward planning algorithm

Apache License 2.0

0 stars, 2 forks

issues

Removing old notebook and improving README

#54 btanner closed 2 years ago
0
Merging Brian's latest version with main repo

#53 btanner closed 2 years ago
0
Project CheckPointing Merge

#52 btanner closed 2 years ago
0
micro.py should say if each MDP/MRP is UniChain or MultiChain

#51 btanner opened 2 years ago
0
MDVI counterexample and exploration

#50 btanner closed 2 years ago
1
Minor updates in tolerance checking and specific convergence case

#49 btanner closed 2 years ago
0
Checking evaluation convergence on our MDPs.

#48 btanner closed 2 years ago
0
Measures policies are defined intervals

#47 btanner closed 2 years ago
0
This appears to be an empty PR, but I'd prefer if my repo didn't say it was 6 commits ahead.

#46 btanner closed 2 years ago
0
Random policy, empirically evaluate all start states

#45 btanner closed 2 years ago
0
Some cleanup of MDVI

#44 btanner closed 2 years ago
0
Adds MM1 Queue

#43 btanner closed 2 years ago
1
Several fixes and cleanups

#42 btanner closed 2 years ago
1
Golden predictions

#41 btanner closed 3 years ago
1
Adds dependency for QuantEcon and methods to extract Markov chains

#40 btanner closed 3 years ago
0
Quick tidy of test MDPs in policy test.

#39 btanner closed 3 years ago
0
Adds quick port of MDVI Control 2.

#38 btanner closed 3 years ago
3
Added simple benchmark for control problems.

#37 btanner closed 3 years ago
0
Compare policies

#36 btanner closed 3 years ago
0
added a flag to save final estimates

#35 abhisheknaik96 closed 3 years ago
2
Golden test values?

#34 btanner opened 3 years ago
7
64bit experiments

#33 btanner closed 3 years ago
1
Updated mdvi sync algorithm to correct bug I introduced earlier.

#32 btanner closed 3 years ago
0
Made choice of MRP to test possible in experiment and started refactoring flags

#31 btanner closed 3 years ago
0
Adds ability to use 64bit transition and reward matrices

#30 btanner closed 3 years ago
0
Test for MDVI should probably be a multichain problem

#29 btanner closed 3 years ago
1
Adding pointers to current section of draft paper.

#28 btanner closed 3 years ago
1
Updated MRP1 to match the book and updated comments.

#27 btanner closed 3 years ago
1
Small fixes to reduce lint errors.

#26 btanner closed 3 years ago
0
Review Prediction Algorithm Implementations vs Latex Descriptions

#25 btanner closed 3 years ago
5
Convert MRP1 (and others?) to match the RL Book exactly.

#24 btanner closed 3 years ago
3
Convert other algs (follows PR/20)

#23 btanner closed 3 years ago
0
Update README.md

#22 btanner closed 3 years ago
1
Creates copies of all evaluation algorithms to the new strategy

#21 btanner closed 3 years ago
0
Sample experiment for evaluation case that currently only uses RVI.

#20 btanner closed 3 years ago
0
Adds a separate implementation of Evaluation Algorithm with RVI.

#19 btanner closed 3 years ago
2
Quick test of a change in a branch.

#18 btanner closed 3 years ago
0
Removes unused RNG keys from garet.

#17 btanner closed 3 years ago
0
Async state order?

#16 btanner opened 3 years ago
2
Detect uni/multi chain

#15 btanner closed 3 years ago
2
Fixes typos and removes a commented debug print.

#14 btanner closed 3 years ago
0
Adds support for command-line flags and garet environment.

#13 btanner closed 3 years ago
1
Determine if MDP rewards are determined by (s, a) or (s, a, s')

#12 btanner closed 3 years ago
3
The condition of convergence needs to be revised for async algorithms.

#11 yiwan-rl closed 3 years ago
4
Splits environment.py into micro.py and structure.py

#10 btanner closed 3 years ago
0
Change MRP (MarkovRewardProcess) and MDP (MarkovDecisionProcess) to use dataclasses

#9 btanner closed 3 years ago
1
Questions about semantics of MDP

#8 btanner closed 3 years ago
2
Removes plots, adds them to gitignore.

#7 btanner closed 3 years ago
0
Adds JAX requirements and a test to see if JAX is working.

#6 btanner closed 3 years ago
0
Adds dependency for matplotlib.

#5 btanner closed 3 years ago
0