ZhengyaoJiang / PGPortfolio

PGPortfolio: Policy Gradient Portfolio, the source code of "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem" (https://arxiv.org/pdf/1706.10059.pdf).
GNU General Public License v3.0
1.74k stars, 750 forks

cash (btc) is always zero #17

Closed ziofil closed 6 years ago

ziofil commented 6 years ago

In the new portfolio vector (raw omega), the cash (BTC) entry is always zero. But this cannot always be the best choice: what if all the other currencies are losing value in a period? Surely in that case the best choice is to move all the assets back to cash.

In the code, the cash bias is initialized to zero just before the final softmax, so it cannot be anything other than zero. Was this intended?

AhmMontasser commented 6 years ago

Among the possible coin choices there is a reversed_USDT coin that maps to the BTC_USDT pair, which I believe covers exactly the case you describe.
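A minimal sketch of what a reversed pair means, assuming it is simply the reciprocal of the original quote (the function name and the prices below are hypothetical, not taken from the PGPortfolio code): when BTC falls against USDT, USDT measured in BTC rises, so the agent has an asset to move into.

```python
def reverse_prices(prices):
    """Invert a quote series: price of B in A becomes price of A in B."""
    return [1.0 / p for p in prices]

# Hypothetical BTC prices quoted in USDT (BTC is falling).
btc_in_usdt = [10000.0, 8000.0, 5000.0]

# The same market seen from the other side: USDT quoted in BTC.
usdt_in_btc = reverse_prices(btc_in_usdt)

# Period-over-period price relatives of USDT measured in BTC.
# Values above 1 mean USDT gained against BTC in that period.
relatives = [later / earlier for earlier, later in zip(usdt_in_btc, usdt_in_btc[1:])]
print(relatives)
```

With these made-up prices both relatives are above 1, which is the situation where holding the reversed coin beats holding BTC.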

dexhunter commented 6 years ago

> In the new portfolio vector (raw omega) the cash (btc) is always zero.

@ziofil Hi! The cash entry is only zero before the softmax. Since the last activation is softmax, the final weights are normalized across all assets, so the BTC weight depends on the votes the network gives to the other assets. If the network thinks BTC is valuable, it will just lower the other assets' votes.

dexhunter commented 6 years ago

An example looks like this:

a = [0, 1, 2]
softmax(a) -> array([0.09003057, 0.24472847, 0.66524096])

b = [0, 0.1, 0.2]
softmax(b) -> array([0.30060961, 0.33222499, 0.3671654])
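The two examples above can be reproduced with a short script. This is a sketch only, assuming the cash (BTC) vote is a constant 0 prepended to the network's votes before the softmax; `portfolio_weights` and `cash_bias` are illustrative names, not the identifiers used in the PGPortfolio code.

```python
import math

def softmax(votes):
    """Numerically stable softmax over a list of floats."""
    m = max(votes)
    exps = [math.exp(v - m) for v in votes]
    total = sum(exps)
    return [e / total for e in exps]

def portfolio_weights(asset_votes, cash_bias=0.0):
    """Prepend the fixed cash (BTC) vote, then normalize with softmax."""
    return softmax([cash_bias] + list(asset_votes))

# Network favors the other assets: the cash weight is small.
rising = portfolio_weights([1.0, 2.0])

# Network votes strongly against the other assets: almost everything
# goes to cash, even though the cash vote itself is still 0.
falling = portfolio_weights([-5.0, -5.0])

print(rising[0], falling[0])
```

The cash vote never changes, but the cash weight does, because softmax only looks at the differences between votes.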


ziofil commented 6 years ago

@DexHunter Oohhh.. I had missed that 😅 Thank you.

dexhunter commented 6 years ago

@ziofil No problem, I ran into this before. I think we will update the code to use a Variable later, since I found that using a Variable seems to give better results. But I will check with @ZhengyaoJiang first.