abhisheknaik96 / differential-value-iteration

Experiments in creating the ultimate average-reward planning algorithm
Apache License 2.0
0 stars 2 forks

Adds dependency on QuantEcon and methods to extract Markov chains #40

Closed btanner closed 3 years ago

btanner commented 3 years ago

Markov chains can be extracted directly from an MRP, or from an MDP given a policy.
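For readers unfamiliar with the MDP case, here is a minimal sketch of how a policy induces a Markov chain. It is not the code from this PR: the function name `policy_transition_matrix`, the `transitions[a][s][s2]` and `policy[s][a]` layouts, and the toy numbers are all assumptions made for illustration. The induced chain has transition probabilities P_pi(s, s2) = sum over a of pi(a | s) * P(s2 | s, a).

```python
def policy_transition_matrix(transitions, policy):
    """Return the state-to-state matrix induced by a policy on an MDP.

    Assumed (hypothetical) layouts:
      transitions[a][s][s2] = probability of moving s -> s2 under action a
      policy[s][a]          = probability of choosing action a in state s
    """
    num_actions = len(transitions)
    num_states = len(policy)
    p_pi = [[0.0] * num_states for _ in range(num_states)]
    for s in range(num_states):
        for a in range(num_actions):
            for s2 in range(num_states):
                # Weight each action's transition row by the policy.
                p_pi[s][s2] += policy[s][a] * transitions[a][s][s2]
    return p_pi


# Toy MDP with 2 states and 2 actions (illustrative numbers only).
transitions = [
    [[0.9, 0.1], [0.2, 0.8]],  # action 0
    [[0.5, 0.5], [0.6, 0.4]],  # action 1
]
# Deterministic in state 0, uniform over actions in state 1.
policy = [[1.0, 0.0], [0.5, 0.5]]

p_pi = policy_transition_matrix(transitions, policy)
for row in p_pi:
    print(row)
```

Each row of the result sums to 1, so it is a valid stochastic matrix. A matrix like this could then be handed to QuantEcon's `MarkovChain` class, which accepts a transition matrix; for an MRP no policy is needed because the transition matrix is already state-to-state.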