ZhengyaoJiang / PGPortfolio

PGPortfolio: Policy Gradient Portfolio, the source code of "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem" (https://arxiv.org/pdf/1706.10059.pdf).
GNU General Public License v3.0

mu might be miscalculated #129

Closed. GameHoo closed this issue 4 years ago.

GameHoo commented 4 years ago

Thank you very much for your wonderful paper.

I found that there might be an error in how mu is calculated in the code. In the paper: [image not recovered]

in the code (nnagent.py):

self.__future_omega = (self.__future_price * self.__net.output) / \
                      tf.reduce_sum(self.__future_price * self.__net.output, axis=1)[:, None]
#
c = self.__commission_ratio
w_t = self.__future_omega[:self.__net.input_num - 1]  # rebalanced
w_t1 = self.__net.output[1:self.__net.input_num]
mu = 1 - tf.reduce_sum(tf.abs(w_t1[:, 1:] - w_t[:, 1:]), axis=1) * c
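
Read as a formula, the snippet above appears to compute the following. The notation is an assumption made for illustration and is not taken from the paper: $y_k$ is row $k$ of self.__future_price, $w_k$ is row $k$ of self.__net.output, $c$ is the commission ratio, and $m$ is the number of columns left after the [:, 1:] slice drops column 0 (presumably the cash column).

$$
\omega'_k = \frac{y_k \odot w_k}{y_k \cdot w_k}, \qquad
\mu_k = 1 - c \sum_{i=1}^{m} \left| w_{k+1,i} - \omega'_{k,i} \right|
$$

Here $\omega'_k$ corresponds to row $k$ of self.__future_omega (w_t in the code), and $w_{k+1}$ to the network output shifted by one row (w_t1 in the code).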

I don't think we should use self.__net.output to calculate future_omega; we should use last_weights instead, like this:

self.__future_omega = (self.__future_price * last_weights) / \
                      tf.reduce_sum(self.__future_price * last_weights, axis=1)[:, None]
#
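
To make the difference between the two variants concrete, here is a minimal pure-Python sketch for a single batch row. It is not the repository's code: the helper names drifted_weights and shrink_factor and all numbers are hypothetical, and it assumes that last_weights holds the previously stored portfolio vector for the same row and that index 0 is cash. It only shows that the two choices yield different values of mu whenever last_weights differs from the network output; it does not settle which one matches the paper.

```python
def drifted_weights(price_relatives, weights):
    """Weights after prices move: (y * w) / sum(y * w)."""
    values = [y * w for y, w in zip(price_relatives, weights)]
    total = sum(values)
    return [v / total for v in values]


def shrink_factor(drifted, target, commission):
    """mu = 1 - c * sum(|target_i - drifted_i|), skipping index 0 (assumed cash)."""
    turnover = sum(abs(t - d) for t, d in zip(target[1:], drifted[1:]))
    return 1.0 - commission * turnover


# Hypothetical numbers: cash plus two assets.
c = 0.0025
future_price = [1.0, 1.10, 0.95]  # stands in for one row of self.__future_price
net_output = [0.2, 0.5, 0.3]      # stands in for self.__net.output at row k
last_weights = [0.4, 0.3, 0.3]    # stands in for last_weights at row k (assumption)
next_output = [0.3, 0.4, 0.3]     # stands in for self.__net.output at row k + 1

# Variant in nnagent.py: drift the network's own output.
mu_current = shrink_factor(drifted_weights(future_price, net_output), next_output, c)
# Variant proposed in this issue: drift last_weights instead.
mu_proposed = shrink_factor(drifted_weights(future_price, last_weights), next_output, c)

print(mu_current, mu_proposed)
```

If last_weights and the network output coincide for a row, both variants give the same mu, so any discrepancy comes entirely from how far the two vectors differ.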