kieranjwood / trading-momentum-transformer

This code accompanies the paper Trading with the Momentum Transformer: An Intelligent and Interpretable Architecture (https://arxiv.org/pdf/2112.08534.pdf).
https://kieranjwood.github.io/publication/momentum-transformer/
MIT License

How do you get the ground truth? #3

Closed makovez closed 1 year ago

makovez commented 1 year ago

I am studying your paper and I came across a question. In the paper you mentioned "We also use MACD indicators ..., defining the relationship between a short S and long signal L"

So is yours a classification problem? What are the labels, then? And do you generate the labels (long/short) based on the MACD signal?

MickyDowns commented 1 year ago

One of the things that makes this work important is that it is predicting the best future position or holding (a continuous variable between -1 and 1) while optimizing against Sharpe Ratio (which captures the interplay between return and volatility). So, it's not a MACD classifier. You could use the direction of the position prediction as a 1/-1 classifier. Further, you could use the magnitude of the prediction as a rough proxy for predicted probability, but you'd be missing the power of the model (and underlying finance concept) which is that scaling the size and direction of many bets based on a risk-aware objective function enables you to compound a slight probabilistic advantage over time.

makovez commented 1 year ago

Hi @MickyDowns, thanks for the answer. It ranges from -1 to 1 like the FinRL environments, but my question was: how does the model learn this?

Transformers on their own usually need to be trained with labels, so do you just provide label data (ground truth) as 1 and 0 for up or down? Or does the model learn this automatically, as in RL?

MickyDowns commented 1 year ago

Hey @makovez, it's a good question. While @kieranjwood is predicting the next-period position, he's calculating "captured returns" (the actual return * his position) and feeding the captured return series into a Sharpe Ratio calculation (which is roughly mean return / standard deviation of returns) for evaluation and optimization. He is also volatility-rescaling the training data, but let's ignore that complexity for now.
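To make the answer to the labels question concrete, here is a minimal sketch of the idea described above: no up/down labels are supplied, and the training signal comes from the realized returns through the Sharpe ratio of the captured returns. The function names, the annualization factor of 252, and the sample numbers are assumptions for illustration, not code from the repository, which works on tensors and includes volatility scaling.

```python
import math
from statistics import mean, pstdev


def captured_returns(positions, next_returns):
    """Element-wise product of position and realized next-period return."""
    return [p * r for p, r in zip(positions, next_returns)]


def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio: mean / standard deviation, scaled by sqrt(periods).

    The annualization factor of 252 trading days is an assumption.
    """
    sd = pstdev(returns)
    if sd == 0.0:
        return 0.0  # undefined in theory; 0 keeps this sketch simple
    return math.sqrt(periods_per_year) * mean(returns) / sd


def sharpe_loss(positions, next_returns):
    """Negative Sharpe of captured returns, so that minimizing it maximizes Sharpe."""
    return -sharpe_ratio(captured_returns(positions, next_returns))


# Hypothetical model outputs (positions in [-1, 1]) and realized returns.
positions = [1.0, -1.0, 0.5]
next_returns = [0.01, -0.02, 0.04]

print(captured_returns(positions, next_returns))
print(sharpe_loss(positions, next_returns))
```

In a real training loop the positions would be the network outputs and the loss would be differentiated with respect to the network weights; the list-based version above only shows the arithmetic.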

The Transformer in their Momentum Transformer (TFT) implementation (as distinct from their LSTM-only Momentum w/ Reversion implementation), is choosing what sequences to pay attention to in the lookback window. It's learning this based on an embedding of primarily continuous data in the training data set. They have a four-headed attention mechanism which is, presumably, looking over different time frames.

Hope this helps. I would be interested in seeing an RL implementation if you go that way. Presumably there you'll need to bin the actions and, maybe, stack the network to fully implement the reward function.

makovez commented 1 year ago

Ok, thanks for the explanation.

Sorry for not replying yesterday. I wanted to look more closely at the code first and then reply. So what I understood from looking at the code and your comment is:

Now, assuming what I previously said is correct, I don't clearly understand what captured returns are and how they can be useful. Why would you multiply the actual returns by the model's predictions (positions)?

Also, from what I understand, the model doesn't take risk and the amount to invest into consideration, or does it?

kieranjwood commented 1 year ago

Keeping in mind that this is a portfolio of futures (with volatility scaling):

If the position at time t is +0.5, then holding the position would entail keeping the position as +0.5 at time t+1

The aim is to best size position to maximise Sharpe ratio. If the next-day price movement is 0.1 then the captured return would be 0.5 x 0.1 = 0.05. If the position was a full long, it would have been 1 x 0.1=0.1. The flexibility of allowing the model to take a position anywhere between -1 and +1 allows it to better optimise Sharpe by allowing it to manage volatility (of captured returns) in relation to captured returns.
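The arithmetic above, and the point about managing the volatility of captured returns, can be sketched as follows. The return series and the two position schedules are made-up numbers chosen with hindsight purely for illustration, and the Sharpe ratio here is left unannualized; none of this is taken from the repository.

```python
from statistics import mean, pstdev


def captured(positions, next_returns):
    """Captured return per period: position times realized return."""
    return [p * r for p, r in zip(positions, next_returns)]


def sharpe(returns):
    """Unannualized Sharpe ratio (mean / standard deviation)."""
    sd = pstdev(returns)
    return mean(returns) / sd if sd > 0 else 0.0


# The example from the comment: a next-day move of 0.1.
move = 0.1
half_long = 0.5 * move  # position +0.5
full_long = 1.0 * move  # position +1.0
print(half_long, full_long)

# Hypothetical return series with one large loss on the third day.
next_returns = [0.02, 0.01, -0.06, 0.03]
always_full = [1.0, 1.0, 1.0, 1.0]  # always fully long
sized = [1.0, 1.0, 0.25, 1.0]       # smaller position on the bad day

print(sharpe(captured(always_full, next_returns)))
print(sharpe(captured(sized, next_returns)))
```

With these numbers the sized schedule has the higher Sharpe ratio, because reducing the position on the losing day both raises the mean and lowers the volatility of the captured returns. A model that can only output -1 or +1 has no way to express that.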

makovez commented 1 year ago

"would entail keeping the position as +0.5"

If the position is 0.5, does that refer to both the direction of the trade and the share of the balance to use?

So does 0.5 mean half long, i.e. go long and use 0.5 of the balance?

I still don't quite understand.

aicheung commented 1 year ago

In my trading system using this code base, I set a full-size limit, e.g. 10 contracts for ES. So when the predicted position size is 1.0, I go long 10 ES contracts. If it is 0.5, then 5 contracts. The same applies to shorts (5 short contracts for -0.5, for example).
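A minimal sketch of that mapping, assuming a hypothetical helper `target_contracts` and a full-size limit of 10 contracts. Clipping to [-1, 1] and truncating toward zero are choices made here for illustration; the comment above does not say how fractional contract counts are rounded.

```python
import math


def target_contracts(position, max_contracts=10):
    """Map a model position in [-1, 1] to a signed whole number of contracts.

    Positive means long, negative means short. Truncation toward zero is an
    assumption: it never takes a larger position than the model suggests.
    """
    clipped = max(-1.0, min(1.0, position))  # guard against out-of-range outputs
    return math.trunc(clipped * max_contracts)


for pos in (1.0, 0.5, -0.5):
    print(pos, target_contracts(pos))
```

Here 1.0 maps to 10 contracts long, 0.5 to 5 long, and -0.5 to 5 short, matching the example above.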

makovez commented 1 year ago

@aicheung ok 👍 now it's starting to make sense. Thanks.

This is similar to the FinRL environment.

makovez commented 1 year ago

I think it would be great to use more technical indicators. I have built a project with an LSTM that predicts the next candle's up/down price movement with 70% precision (considering only the average price, (high+low)/2).

I have tested both with just raw OHLCV data and with all the technical indicators plus feature selection on the training period; without technical indicators it doesn't even exceed 50% precision.

If any of you want to chat about this topic, my Telegram is @sbongown

danbo6 commented 7 months ago

Hi @makovez, I was trying to find you on Telegram but couldn't find your account.

makovez commented 7 months ago

@sbongown

danbo6 commented 6 months ago

Hi @makovez, it's the same, I still cannot find your account lol. Can you please add me on Telegram, @danzb0?