ZhengyaoJiang / PGPortfolio

PGPortfolio: Policy Gradient Portfolio, the source code of "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem" (https://arxiv.org/pdf/1706.10059.pdf).
GNU General Public License v3.0

y(t) = close(t)/close(t-1) or y(t) = close(t)/open(t)? #1

Closed by wassname 6 years ago

wassname commented 6 years ago

Thanks for putting the code up. Can I ask for a minor clarification?

In the paper it said you have y(t) = close(t)/open(t) but in the code there is y(t) = close(t)/close(t-1)

The paper also divides the batch X by the open(t) (X=M/open(t)) but in the code it doesn't look like it's divided/scaled (X=M)?

Here's the code I'm talking about. I think the shape of M is (batch, features, coins, times) where features are ["close", "high", "low", "open"].

Have these things changed since the paper or am I misunderstanding something?

Thanks!

ZhengyaoJiang commented 6 years ago

Hi, nice to see you again.

> In the paper it said you have y(t) = close(t)/open(t) but in the code there is y(t) = close(t)/close(t-1)

I think what we wanted to express in the article is y(t) = close(t)/close(t-1).

> The paper also divides the batch X by the open(t) (X=M/open(t)) but in the code it doesn't look like it's divided/scaled (X=M)?

It's done in the TensorFlow code (network.py) to save memory.
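
A minimal sketch of what scaling inside the model, rather than in the stored batch, could look like. It uses NumPy in place of TensorFlow and is not the code from network.py. It assumes the (batch, features, coins, times) layout described earlier in the thread with "close" as feature 0, and it assumes the reference price is the most recent close in each window. `normalize_window` is a hypothetical name.

```python
import numpy as np

def normalize_window(M, close_index=0):
    """Scale each coin's price window by its most recent closing price.

    M: array of shape (batch, features, coins, times) holding raw prices.
    Returns an array of the same shape.
    """
    # Latest close per sample and coin, shape (batch, 1, coins, 1),
    # so it broadcasts over the feature and time axes.
    latest_close = M[:, close_index:close_index + 1, :, -1:]
    return M / latest_close

# Made-up raw prices: 2 samples, 4 features, 3 coins, 5 time steps.
rng = np.random.default_rng(0)
M = rng.uniform(1.0, 100.0, size=(2, 4, 3, 5))
X = normalize_window(M)
```

Keeping M raw and dividing only when a batch reaches the network means no separately scaled copy of every overlapping window has to be held, which is presumably the memory saving meant here.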

wassname commented 6 years ago

Thanks! Cheers for clarifying, that makes sense.

Both I and @goolulusaurs (who has been working with me) misinterpreted that part of the paper so it might be worth considering a rephrase when you do version 2 of the arxiv paper.

ZhengyaoJiang commented 6 years ago

> Both I and @goolulusaurs (who has been working with me) misinterpreted that part of the paper so it might be worth considering a rephrase when you do version 2 of the arxiv paper.

Yes, I agree.

kumkee commented 6 years ago

> In the paper it said you have y(t) = close(t)/open(t) but in the code there is y(t) = close(t)/close(t-1)

Quoting from our paper:

> For continuous markets, elements of v_t are the opening prices for Period t + 1 as well as the closing prices for Period t.

That means we assumed open(t) = close(t-1) there.
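
To see why the two formulas agree under that assumption, here is a small NumPy sketch with made-up prices. The arrays `close` and `open_` are illustrative only; `open_` is constructed so that open(t) = close(t-1).

```python
import numpy as np

# Made-up closing prices for one asset over five periods.
close = np.array([100.0, 102.0, 101.0, 105.0, 104.0])

# Continuous-market assumption: each period opens at the previous close.
# The first period has no predecessor, so it is left as NaN and skipped.
open_ = np.empty_like(close)
open_[0] = np.nan
open_[1:] = close[:-1]

y_close_close = close[1:] / close[:-1]  # y(t) = close(t) / close(t-1)
y_close_open = close[1:] / open_[1:]    # y(t) = close(t) / open(t)

# True when both definitions give the same price relatives.
same = np.allclose(y_close_close, y_close_open)
```

On data where a period's open differs from the previous period's close, the two ratios would no longer match, so the equivalence holds only under this assumption.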

wassname commented 6 years ago

Ah didn't see that, thanks.