ZhengyaoJiang / PGPortfolio

PGPortfolio: Policy Gradient Portfolio, the source code of "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem" (https://arxiv.org/pdf/1706.10059.pdf).
GNU General Public License v3.0

Implementing short-sell option by altering data-feed #95

Open ALevitskyy opened 6 years ago

ALevitskyy commented 6 years ago

I am not sure whether this is the right place to ask, but the paper seems to use a long-only strategy, which has its drawbacks in a bear market. I want to add short selling. Since I am not able to change the inner workings of the model to support this and do not want to debug side effects of my own changes, I want to achieve it by adding extra instruments to the data feed that move in the opposite direction to the originals. For example, for a time series X_{t} I would add something like Y_{t} = 1/X_{t}. Could this approach have negative side effects, or is it a valid solution?

dexhunter commented 6 years ago

I think shorting is more about actions. Can you explain a bit more why you think changing the input data would help the agent's decisions?

ALevitskyy commented 6 years ago

Shorting is like buying an asset that moves in the opposite direction to the original, at least as a first-order approximation. So what I am doing now is altering Data.db by creating an extra asset, rETH, which moves in the opposite direction to ETH, using the formula rETH = 1/ETH. The agent can then choose this asset if it wants to short rather than go long. The problem for now is that the trained agent assigns part of the portfolio to the long and the short asset at the same time, which is a bit counter-intuitive. Hence I am asking whether introducing another "reverse" asset as an option is a legitimate approach.
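One detail of the rETH = 1/ETH construction is easy to miss when the data feed holds candles rather than a single price: inversion reverses the ordering, so the original low becomes the new high and vice versa. A minimal sketch, assuming a candle is a plain dict with hypothetical field names `open`, `high`, `low`, `close` (the actual Data.db schema is not shown in this thread):

```python
def invert_candle(candle):
    """Build the candle of a synthetic 'reverse' asset priced at 1/X.

    `candle` is a dict with hypothetical keys open/high/low/close;
    these names are an assumption, not the Data.db schema.
    """
    return {
        "open": 1.0 / candle["open"],
        "close": 1.0 / candle["close"],
        # 1/x is decreasing for x > 0, so high and low swap roles.
        "high": 1.0 / candle["low"],
        "low": 1.0 / candle["high"],
    }


# Illustrative numbers only, not real market data.
eth = {"open": 0.050, "high": 0.055, "low": 0.048, "close": 0.052}
reth = invert_candle(eth)
```

Inverting each field independently, without the swap, would leave the synthetic asset with a "high" below its "low".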

davidsblom commented 6 years ago

I guess a different approach to incorporating short trades (and possibly also leverage) is to change the loss function and the output of the network.

I'd say that you have three outputs per asset: the first output is a boolean (or zero/one) indicating long or short position, a second value which is the portfolio weight for the long position, and a third value for the short position. You would need to write a custom softmax function. The loss function will only add the contribution of either the short or the long trade.
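One possible reading of the three-outputs-per-asset idea, written in plain Python rather than as a network layer. The function name, the hard threshold on the direction output, and the choice to run the softmax over only the selected logits are all assumptions made for illustration; the thread does not specify them.

```python
import math


def long_short_weights(outputs):
    """Turn per-asset (direction, long_logit, short_logit) triples into signed weights.

    Hypothetical interpretation: the direction output picks which of the
    two logits is used, a softmax runs over the picked logits, and the
    sign marks long (+) or short (-). Absolute weights sum to 1.
    """
    signs = []
    selected = []
    for direction, long_logit, short_logit in outputs:
        is_long = direction >= 0.0  # same as sigmoid(direction) >= 0.5
        signs.append(1.0 if is_long else -1.0)
        selected.append(long_logit if is_long else short_logit)

    # Numerically stable softmax over the selected logits.
    peak = max(selected)
    exps = [math.exp(s - peak) for s in selected]
    total = sum(exps)
    return [sign * e / total for sign, e in zip(signs, exps)]


# Asset 0 goes long, asset 1 goes short; the unused logits are ignored.
weights = long_short_weights([(1.0, 0.0, 5.0), (-1.0, 5.0, 0.0)])
```

Note that a hard threshold like this passes no gradient to the direction output, so a trainable version would need a soft gate or some other relaxation.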

dexhunter commented 6 years ago

Yes, what David wrote is a good method. Besides, you can change the activation function to allow negative output.

dexhunter commented 6 years ago

Oh, I forgot to mention that you can use the same variable to control the long/short position. I don't quite understand why there need to be two separate variables for long and short; if there is some reasoning behind it, please share it. Sorry for the additional email, it's hard to edit the comment on my mobile phone.

davidsblom commented 6 years ago

Yeah, you could also use the same variable to control long/short position. I think you should use the tanh activation function, and then an L1 normalization step in the output layer to allow for short trades.
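A minimal sketch of the tanh-plus-L1 output step in plain Python. The function name and the all-zero fallback are assumptions; the thread does not say how that edge case should be handled.

```python
import math


def tanh_l1_weights(raw_outputs, eps=1e-12):
    """Squash raw outputs with tanh, then L1-normalize them.

    Positive entries are long positions, negative entries are shorts,
    and the absolute values sum to 1.
    """
    squashed = [math.tanh(x) for x in raw_outputs]
    norm = sum(abs(s) for s in squashed)
    if norm < eps:
        # Assumption: with no signal at all, hold no position.
        return [0.0 for _ in squashed]
    return [s / norm for s in squashed]


weights = tanh_l1_weights([2.0, -1.0, 0.5])
```

Unlike a softmax, this keeps the sign of each raw output, so a single variable per asset carries both direction and size.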

Do you still need to change the loss function? I don't think so, right?

dexhunter commented 6 years ago

> Do you still need to change the loss function? I don't think so, right?

I think you still need to change the loss function. Besides tanh, you can try other activation functions (full list on Wikipedia).
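The thread does not say how the loss would change, so the following is only one way it might look. It assumes signed weights whose absolute values sum to at most 1, any remainder held in cash, and no transaction, borrowing, or margin costs. Under those assumptions the one-period growth is 1 + sum(w_i * (y_i - 1)), where y_i is the price relative of asset i, and the loss is the negative mean log growth. The function names are hypothetical.

```python
import math


def period_growth(weights, price_relatives):
    """Growth factor of the portfolio over one period.

    A long position (w > 0) earns w * (y - 1); a short position (w < 0)
    earns the opposite. Costs are ignored (assumption).
    """
    return 1.0 + sum(w * (y - 1.0) for w, y in zip(weights, price_relatives))


def negative_mean_log_growth(weight_history, relative_history):
    """Loss to minimize: negative average log growth over all periods."""
    logs = []
    for weights, relatives in zip(weight_history, relative_history):
        growth = period_growth(weights, relatives)
        if growth <= 0.0:
            # A short can lose more than its stake when y > 2.
            raise ValueError("portfolio wiped out; log growth is undefined")
        logs.append(math.log(growth))
    return -sum(logs) / len(logs)


# Long asset 0, short asset 1; asset 0 rises 10%, asset 1 falls 10%.
growth = period_growth([0.5, -0.5], [1.10, 0.90])
```

For long-only weights that sum to 1, this growth factor reduces to the plain dot product of weights and price relatives, so the sketch only departs from a long-only objective when some weights are negative.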

kenuku commented 4 years ago

Calculating rETH = 1/ETH (or, to be precise, rETHBTC = 1/ETHBTC) is the wrong approach. 1/ETHBTC is the price of BTC in ETH, i.e. BTCETH. Holding 1/ETHBTC is NOT equivalent to shorting ETHBTC.
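The gap between the two can be seen in the one-period returns alone. If the price moves by a factor y, a short position returns -(y - 1), while an asset priced at 1/X returns 1/y - 1. A small sketch with hypothetical function names, ignoring fees and funding costs:

```python
def short_return(price_relative):
    """Return of a fully collateralized short when the price moves by factor y."""
    return -(price_relative - 1.0)


def inverse_asset_return(price_relative):
    """Return of holding a synthetic asset priced at 1/X over the same move."""
    return 1.0 / price_relative - 1.0


# Compare both payoffs for a small move and for larger ones.
for y in (1.01, 1.25, 0.80, 3.0):
    print(y, short_return(y), inverse_asset_return(y))
```

The two agree to first order for small moves, which matches the "first-order approximation" remark earlier in the thread. They diverge as the move grows: for a 25% rise the short loses 25% while the inverse asset loses about 20%, and if the price triples the short loses twice its stake while the inverse asset can never lose more than 100%.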

jgerlach93 commented 3 years ago

Hey, I don't know if anyone is still interested in this topic, but shouldn't it be possible to simply replace the softmax layer at the end of the network with a different operation on the weight vector? I am not sure what values the voting layer returns. Are they positive only and then normalized by the softmax, or can they be negative as well? If the latter is true, one could take the "raw" output vector w from the voting layer, containing positive and negative values, and set the final layer to w' = w/sum(abs(w)). This returns a weight vector w' with positive values for long and negative values for short positions, whose absolute values always sum to 1, i.e. sum(abs(w')) = 1. Then the network should learn to make use of negative expected growth as well, right? Or is my understanding of the voting layer wrong, and it is more a "counting" of positive-valued votes from the individual constituents of the ensemble?
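Whatever the voting layer returns, one part of the question has a definite answer: a softmax accepts negative inputs but always emits strictly positive outputs, so it discards the sign, whereas w/sum(abs(w)) preserves it. A small plain-Python comparison; the example scores are made up and are not taken from the network:

```python
import math


def softmax(scores):
    """Standard softmax, shifted by the max for numerical stability."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def l1_normalize(scores):
    """w' = w / sum(abs(w)); assumes at least one score is nonzero."""
    norm = sum(abs(s) for s in scores)
    return [s / norm for s in scores]


raw = [1.5, -2.0, 0.5]           # hypothetical raw scores
long_only = softmax(raw)         # every entry > 0, sums to 1
long_short = l1_normalize(raw)   # [0.375, -0.5, 0.125]
```

The second asset has the largest raw magnitude, yet softmax gives it the smallest weight, while the L1 version turns it into the largest (short) position.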

trevorprater commented 3 years ago

I think the above comment makes a lot of sense, assuming the softmax layer is not performing a necessary function that was omitted above. This would be quite useful, receiving the weighted vector. Ideal for convex portfolio optimization, factor neutrality, and proper hedging, instead of, say, 1/USDT being the only hedge.

N.B. This repo is now quite old, all things considered. If I remember correctly (I was the third person to star it), it came out long before futures markets were relevant or mature in the crypto space, which left 1/USDT as the only available hedge, given that all pairs at the time used BTC as the underlier. In any case, yes, non-positive alpha signals would likely improve my trading pipeline.