ZhengyaoJiang / PGPortfolio

PGPortfolio: Policy Gradient Portfolio, the source code of "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem" (https://arxiv.org/pdf/1706.10059.pdf).
GNU General Public License v3.0

Computing Omega in code vs equation #99

Closed d3sm0 closed 6 years ago

d3sm0 commented 6 years ago

Hi ! Great work here, congrats, both for the code and the results!

I was looking at Eq. 7 of the article and at the code around line 22 of the NNAgent class. The equation uses the weights at t-1, whereas the code uses the output of the network given the latest data.

I believe I'm missing something: which one is the correct weight?

Thank you!

dexhunter commented 6 years ago

@d3sm0 Hi! I guess you are asking whether there is look-ahead bias in the code. There is not. https://github.com/ZhengyaoJiang/PGPortfolio/blob/a2daa72d5ed62acad0a947f0bc79970c6356514b/pgportfolio/learn/nnagent.py#L25

Suppose the current step is t. Then the output of the network is w_{t} and the future price is y_{t+1}. Equivalently, you can read these as w_{t-1} and y_{t}, as if you were calculating for the previous period.

d3sm0 commented 6 years ago

Hi @DexHunter

No, I don't think there is a look-ahead bias, because you have already computed the weights of the portfolio for the next iteration. Just to check that I understood correctly: if w[t] is the next candidate weight vector of the portfolio, then omega and mu are given by:

omega = y * w[t] / y.dot(w[t])
mu = 1 - sum(abs(omega[t] - w[t])) * c

But in this case the weights of the portfolio of the previous iteration are not directly used right?
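
For reference, the two quantities written above can be sketched in plain Python. This is only an illustration under stated assumptions: the function names and all numbers are made up and do not come from the repository, `y_next` is taken to be the price-relative vector of the period that follows the weights `w_t`, and mu is the simple approximation discussed in this thread, with the cash entry (index 0) skipped as in the `[:, 1:]` slices of nnagent.py.

```python
def evolved_weights(y, w):
    """Weights after prices move by the relative vector y, before rebalancing."""
    total = sum(yi * wi for yi, wi in zip(y, w))
    return [yi * wi / total for yi, wi in zip(y, w)]


def approx_mu(w_next, omega, c):
    """Approximate shrink factor: 1 - c * sum |w_next - omega| over non-cash assets."""
    return 1.0 - c * sum(abs(a - b) for a, b in zip(w_next[1:], omega[1:]))


# Hypothetical numbers: cash plus two assets.
w_t = [0.2, 0.5, 0.3]      # weights held over the period
y_next = [1.0, 1.1, 0.9]   # price relatives over that period (cash stays at 1)
w_next = [0.2, 0.4, 0.4]   # weights proposed for the following period
c = 0.0025                 # commission ratio (illustrative value)

omega = evolved_weights(y_next, w_t)  # drifted weights, still summing to 1
mu = approx_mu(w_next, omega, c)      # cost of moving from omega to w_next
```

In this sketch the weights of the earlier period enter only through `omega`, which is then compared with the next proposed weights.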

dexhunter commented 6 years ago

> But in this case the weights of the portfolio of the previous iteration are not directly used right?

No. The previous portfolio weights are used in PVM. I think you are referring to rebalancing? If so, you can check Figure 1 in the paper for clarification.

d3sm0 commented 6 years ago

Great, thanks for the clarification!

cgebe commented 5 years ago

The calculation of the consumption vector is the following:

        c = self.__commission_ratio
        w_t = self.__future_omega[:batch_size-1]  # rebalanced
        w_t1 = self.__net.output[1:batch_size]
        mu = 1 - tf.reduce_sum(tf.abs(w_t1[:, 1:]-w_t[:, 1:]), axis=1)*c

Suppose we have a batch_size of 1: our consumption vector is then empty. Granted, the calculation already uses the rebalanced values; however, the first prediction of each batch is never included in the consumption. To include it, we would have to concatenate the last value of previous_w to the front of omega, and also drop the starting index from the slice of the output tensor:

        w_t = tf.concat([previous_w[-1], future_omega[:batch_size-1]], axis=0)
        w_t1 = self.__net.output[:batch_size]
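
The effect of the two slicings on the length of mu can be checked with a small plain-Python sketch. It mirrors only the indexing of the snippets above, not the TensorFlow graph; `w_before_batch` is a hypothetical stand-in for the weights held just before the first row of the batch (assumed here to include the cash entry), and all numbers are invented.

```python
def _mu_row(w_new, w_old, c):
    """1 - c * sum |w_new - w_old| over the non-cash entries (index 1 onward)."""
    return 1.0 - c * sum(abs(a - b) for a, b in zip(w_new[1:], w_old[1:]))


def mu_original(output, future_omega, c):
    """Slicing as in the first snippet: a batch of B rows yields B - 1 entries."""
    batch_size = len(output)
    w_t = future_omega[:batch_size - 1]   # rebalanced weights of rows 0..B-2
    w_t1 = output[1:batch_size]           # network outputs of rows 1..B-1
    return [_mu_row(new, old, c) for new, old in zip(w_t1, w_t)]


def mu_with_prefix(output, future_omega, w_before_batch, c):
    """Proposed variant: prepend the weights held before the batch, B entries."""
    batch_size = len(output)
    w_t = [w_before_batch] + future_omega[:batch_size - 1]
    w_t1 = output[:batch_size]
    return [_mu_row(new, old, c) for new, old in zip(w_t1, w_t)]


# Hypothetical batch of two rows: cash plus two assets.
output = [[0.2, 0.5, 0.3], [0.2, 0.4, 0.4]]
future_omega = [[0.196, 0.539, 0.265], [0.2, 0.42, 0.38]]
w_before_batch = [1.0, 0.0, 0.0]  # all cash before the first row
c = 0.0025

empty = mu_original(output[:1], future_omega[:1], c)  # batch_size == 1
short = mu_original(output, future_omega, c)          # one entry for two rows
full = mu_with_prefix(output, future_omega, w_before_batch, c)  # two entries
```

With a single-row batch the original slicing leaves nothing to compare, while the prefixed variant also charges the first row for moving away from `w_before_batch`.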

Here, we prepend 1.0 as our consumption for the first prediction:

        self.__pv_vector = tf.reduce_sum(self.__net.output * self.__future_price, reduction_indices=[1]) * (tf.concat([tf.ones(1), self.__pure_pc()], axis=0))
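
A plain-Python reading of that line, as a rough sketch rather than the repository's implementation: each row's value change is the dot product of the weights and the price relatives, multiplied by a shrink factor, and the first row gets a factor of 1.0 because no consumption is computed for it. The name `mu_tail` is a hypothetical stand-in for the values returned by `__pure_pc()`, and the numbers are invented.

```python
def pv_vector(output, future_price, mu_tail):
    """Per-row value change: (w . y) * mu, with mu fixed to 1.0 for the first row."""
    mu = [1.0] + list(mu_tail)  # B - 1 computed factors, padded at the front to B
    return [
        sum(w * y for w, y in zip(w_row, y_row)) * m
        for w_row, y_row, m in zip(output, future_price, mu)
    ]


# Hypothetical batch of two rows: cash plus two assets.
output = [[0.2, 0.5, 0.3], [0.2, 0.4, 0.4]]
future_price = [[1.0, 1.1, 0.9], [1.0, 1.0, 1.0]]
mu_tail = [0.999]  # one factor, for the second row only

pv = pv_vector(output, future_price, mu_tail)
```

Here the first entry of `pv` carries no transaction cost at all, which is the gap the proposal above is meant to close.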

Comments?