sawcordwell / pymdptoolbox

Markov Decision Process (MDP) Toolbox for Python
BSD 3-Clause "New" or "Revised" License

MDP solving with LP #17

Open silgon opened 9 years ago

silgon commented 9 years ago

I didn't even have time to change the author in the test_LP.py file. If you add it, please put @silgon in the file ;). I also have a question about how you created the files, because the headers say "Created on $date". Were they generated automatically? If so, how?

silgon commented 9 years ago

I know it's a pull request, but I also noticed something. You should also change this:

f = self._cvxmat(_np.ones((self.S, 1)))
h = _np.array(self.R).reshape(self.S * self.A, 1)
h = self._cvxmat(h, tc='d')
M = _np.zeros((self.A * self.S, self.S))
for aa in range(self.A):
    pos = (aa + 1) * self.S
    M[(pos - self.S):pos, :] = (
        self.discount * self.P[aa] - _sp.eye(self.S, self.S))
M = self._cvxmat(M)

to this:

f = self._cvxmat(_np.ones((self.S, 1)))
M = _np.zeros((self.A * self.S, self.S))
h = _np.zeros((self.A * self.S,))
for aa in range(self.A):
    pos = (aa + 1) * self.S
    M[(pos - self.S):pos, :] = (
        self.discount * self.P[aa] - _sp.eye(self.S, self.S))
    # if R has dimensions (S, A)
    h[(pos - self.S):pos] = _np.dot(self.P[aa], self.R[:, aa])
    # if R has dimensions (S,)
    # h[(pos - self.S):pos] = _np.dot(self.P[aa], self.R)
M = self._cvxmat(M)
h = self._cvxmat(h, tc='d')

Look at the commented-out code: this should probably be a function that takes the dimensions of R into account. Maybe I'll take some time later to change it and then open another pull request.
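A rough sketch of what such a function might look like, written as a standalone helper so it can be tried outside the class. The name `stacked_expected_rewards` and the argument layout (P with shape (A, S, S), R with shape (S,) or (S, A)) are assumptions made for illustration, not part of the toolbox. It only reproduces the `np.dot(P[aa], R...)` form from the snippet above and picks the branch from `R.ndim`:

```python
import numpy as np


def stacked_expected_rewards(P, R):
    """Build the (A*S,) vector h, one block of S entries per action.

    Hypothetical helper, not part of pymdptoolbox.
    P: transitions, assumed shape (A, S, S).
    R: rewards, assumed shape (S,) or (S, A).
    """
    P = np.asarray(P, dtype=float)
    R = np.asarray(R, dtype=float)
    A, S, _ = P.shape

    # Decide once which reward layout we were given.
    if R.shape == (S,):
        per_action = False
    elif R.shape == (S, A):
        per_action = True
    else:
        raise ValueError(
            "R must have shape (S,) or (S, A), got %r" % (R.shape,))

    h = np.zeros(A * S)
    for aa in range(A):
        pos = (aa + 1) * S
        # Same reward vector for every action, or one column per action.
        r = R[:, aa] if per_action else R
        h[(pos - S):pos] = np.dot(P[aa], r)
    return h
```

Inside the class this could replace the two commented branches with a single call such as `h = stacked_expected_rewards(self.P, self.R)`, assuming `self.P` and `self.R` are stored in the shapes described above. Any other reward shape raises a `ValueError` instead of being reshaped silently.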