jmbejara / comp-econ-sp19

Main Course Repository for Computational Methods in Economics (Econ 21410, Spring 2019)

Difference between value function and policy function #65

Open rhuselid opened 5 years ago

rhuselid commented 5 years ago

I am working on the part of the homework from the textbook and am running into a bit of a conceptual issue. What exactly is the difference between the value function and the policy function?

The way I currently have them constructed, I get a shape mismatch because my value matrix is N x N and my policy matrix is N x (T+1). I am not sure how the value matrix should be reduced in a way that is logically distinct from what the policy matrix already is. Relevant code below:

maximized = []
for i in range(N):
    maximized.append([])
for i in range(T + 1, -1, -1):
    Vt = []
    for cake_t in w:
        # each row is a wt
        row = []
        for cake_t1 in w:
            # each col is a wt+1
            val = cake_t1 - cake_t
            if val >= 0:
                row.append(u(val))
            else:
                row.append(-10000000)
        Vt.append(row)
    intermediary = []
    for wt in Vt:
        best_wt1 = max(wt)
        intermediary.append([best_wt1])
    maximized = np.hstack((maximized, intermediary))

jmbejara commented 5 years ago

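The value function gives the total discounted utility, from the current period onward, of acting optimally given the state. The policy function gives the optimal choice to make given the state. So the value function returns a number (a utility level), while the policy function returns an action. Both functions also depend on the time period, since this first problem has a finite number of time periods.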
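
To make the distinction concrete, here is one way the two objects could be written for a cake-eating problem. The notation is an assumption, not taken from the textbook: $w$ is the cake today, $w'$ the cake left for tomorrow, $u$ the utility function, and $\beta$ the discount factor.

```latex
% Value function: the maximized objective (a utility level)
V_t(w) = \max_{0 \le w' \le w} \; u(w - w') + \beta \, V_{t+1}(w')

% Policy function: the choice that attains that maximum (an action)
g_t(w) = \arg\max_{0 \le w' \le w} \; u(w - w') + \beta \, V_{t+1}(w')
```

Both are evaluated on the same grid of states and time periods. The value matrix stores the max, and the policy matrix stores the argmax.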

There are T time periods, and there is no continuation value after time T. The way the question is constructed, they want you to build a matrix in which column T is the value in the last time period and column T+1 represents the value at time T+1, which should be uniformly zero.
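
A minimal sketch of the backward induction described above, under several assumptions that do not come from the assignment: the function name `solve_cake` is hypothetical, the grid `w` is increasing, utility is the square root, the discount factor is 0.9, and periods 1..T are stored in array columns 0..T-1, with the final column holding the all-zero period T+1. The exact shapes the textbook asks for may differ.

```python
import numpy as np

def solve_cake(w, T, beta=0.9, u=np.sqrt):
    """Backward induction for a finite-horizon cake-eating problem.

    w    : 1-D increasing grid of cake sizes (length N)
    T    : number of periods
    beta : discount factor (assumed value)
    u    : utility of consumption (square root is an assumption)
    """
    w = np.asarray(w, dtype=float)
    N = len(w)

    # Consumption implied by holding w[i] today and leaving w[j] for tomorrow.
    c = w[:, None] - w[None, :]
    feasible = c >= 0
    # Utility of each (today, tomorrow) pair; infeasible pairs get -inf.
    util = np.where(feasible, u(np.where(feasible, c, 0.0)), -np.inf)

    V = np.zeros((N, T + 1))              # last column is period T+1: all zeros
    policy = np.zeros((N, T), dtype=int)  # index of the best next-period cake

    for t in range(T - 1, -1, -1):
        # Rows: cake today. Columns: cake tomorrow.
        total = util + beta * V[:, t + 1][None, :]
        policy[:, t] = np.argmax(total, axis=1)  # the optimal choice
        V[:, t] = np.max(total, axis=1)          # the value of that choice
    return V, policy
```

For example, `V, policy = solve_cake(np.linspace(0, 1, 5), T=3)` returns `V` with shape `(5, 4)`, whose last column is all zeros, and `policy` with shape `(5, 3)`. The N x N matrix is only an intermediate object here: taking the max along each row yields one column of the value matrix, and taking the argmax yields one column of the policy matrix.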