wassname / rl-portfolio-management

Attempting to replicate "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem" (https://arxiv.org/abs/1706.10059), with an OpenAI Gym environment

potential bug in _step #10

Closed: ziofil closed this issue 6 years ago

ziofil commented 6 years ago

Why (1-mu1)? It should be like eq. (2) but with the mu factor, no?

https://github.com/wassname/rl-portfolio-management/blob/60b9ef789140a15c476e7e8d246fbf31c6d416b3/rl_portfolio_management/environments/portfolio.py#L154

wassname commented 6 years ago

Equation 2 ignores transaction costs, but we don't want to do that. This line is equations 11 and 12 merged. (1-mu) is (1-transaction_cost), so it's the fraction of portfolio value that remains after transaction costs are subtracted. Does that make sense?

ziofil commented 6 years ago

It's possible that we are using the same name for different things, hence the confusion. I'm saying that I think the equation should be p1 = p0 * mu1 * np.dot(y1, w0), which is like eq. (2) but with mu1 (essentially, the argument of the logarithm in eq. (10)).

mu is the transaction remainder factor and has a value of about 0.97 ~ 0.99, depending on how much you buy/sell. When you write (1-mu1), you are left with 0.03 ~ 0.01. Perhaps that could explain the very low returns that you have been observing.
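
To make the concern concrete, here is a minimal sketch of the two candidate updates. The helper name `step_value` and all numbers are hypothetical, chosen only for illustration; it assumes `mu1` really is the remainder factor (about 0.99).

```python
import numpy as np

def step_value(p0, factor, y1, w0):
    """One-step portfolio value update: p1 = p0 * factor * (y1 . w0)."""
    return p0 * factor * np.dot(y1, w0)

# Hypothetical inputs, not taken from the repository.
p0 = 1.0                             # portfolio value before the step
y1 = np.array([1.0, 1.02, 0.99])     # price relatives (first entry is cash)
w0 = np.array([0.2, 0.5, 0.3])       # portfolio weights, summing to 1
mu1 = 0.99                           # assumed transaction remainder factor

# Multiplying by mu1 keeps about 99% of the value after costs.
p1_with_mu = step_value(p0, mu1, y1, w0)

# Multiplying by (1 - mu1) keeps only about 1% of the value.
p1_with_one_minus_mu = step_value(p0, 1 - mu1, y1, w0)

print(p1_with_mu, p1_with_one_minus_mu)
```

If `mu1` were a remainder factor, the second form would wipe out nearly all of the portfolio in a single step, which is why the naming matters here.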

wassname commented 6 years ago

Looks like I called $c_1$ from the paper (eq. 12) mu1, hence the confusion. It does have a value of ~0.0003, though, so things work out (just checked). I've just renamed the variable to c1 to make this clearer.
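
As a rough sketch of why the two forms agree once the names are sorted out: if `c1` is the small fraction lost to costs and the remainder factor is `mu1 = 1 - c1`, then `p0 * (1 - c1) * np.dot(y1, w0)` and `p0 * mu1 * np.dot(y1, w0)` are the same number. The commission rate, the weights, and the cost approximation (rate times turnover) below are assumptions for illustration, not values from the repository.

```python
import numpy as np

# Hypothetical inputs, not taken from the repository.
cost = 0.0025                        # assumed commission rate
p0 = 1.0                             # portfolio value before the step
y1 = np.array([1.0, 1.02, 0.99])     # price relatives (first entry is cash)
w0 = np.array([0.2, 0.5, 0.3])       # target portfolio weights

# Weights after prices move, before rebalancing back to w0.
dw1 = (y1 * w0) / np.dot(y1, w0)

# c1: fraction of portfolio value lost to transaction costs,
# approximated here as commission rate times total turnover (an assumption).
c1 = cost * np.abs(dw1 - w0).sum()

# mu1: transaction remainder factor, the fraction of value that is kept.
mu1 = 1 - c1

p1_from_c = p0 * (1 - c1) * np.dot(y1, w0)
p1_from_mu = p0 * mu1 * np.dot(y1, w0)

print(c1, mu1, p1_from_c, p1_from_mu)
```

With these numbers `c1` is far below 1% and `mu1` is close to 1, so `(1 - c1)` is a remainder factor rather than a near-total loss.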

Thanks for taking a look.

ziofil commented 6 years ago

Ah ok! All good 😄