hgeorgako / rfortraders

Quantitative Trading with R
MIT License

On page 131, the chunk about "#Generate sell and buy signals" #15

Open tsungwu opened 7 years ago

tsungwu commented 7 years ago

On page 131, the chunk under "# Generate sell and buy signals" reads:

buys <- ifelse(data_out$spread > threshold, 1, 0)
sells <- ifelse(data_out$spread < -threshold, -1, 0)

Shouldn't it be:

sells <- ifelse(data_out$spread > threshold, -1, 0)
buys <- ifelse(data_out$spread < -threshold, 1, 0)
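A minimal sketch of the proposed convention on a made-up spread vector. The `spread` values and `threshold` are illustrative assumptions, not data from the book, and combining the two vectors by summation is also an assumption about how the signal is built.

```r
# Toy spread values (hypothetical, for illustration only)
spread <- c(0.2, 1.5, -0.3, -1.8, 0.9, 2.1)
threshold <- 1  # assumed threshold, e.g. one standard deviation

# Proposed convention: short the spread when it is high, long when it is low
sells <- ifelse(spread > threshold, -1, 0)
buys  <- ifelse(spread < -threshold, 1, 0)

# At most one of the two is nonzero at any index, so the sum is the signal
signal <- buys + sells
print(signal)
```

With this convention a value of -1 marks the points where the spread is above the threshold and 1 marks the points where it is below the negative threshold.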

linus791025 commented 7 years ago

Hi tsungwu,

I also noticed this issue. It seems more reasonable to short (long) the spread when it rises above (falls below) the threshold. Furthermore, to keep the code consistent, the code on page 133 would have to be reversed as well. Otherwise, you will get the opposite result.

Originally, the code on p.133 is:

for(i in 1:length(signal)) {
  if(signal[i] == 1 && position == 0) {
    # buy the spread
    prev_x_qty <- round(beta[i] * trade_size)
    qty_x[i] <- -prev_x_qty
    qty_y[i] <- trade_size
    position <- 1
  }

  if(signal[i] == -1 && position == 0) {
    # sell the spread initially
    prev_x_qty <- round(beta[i] * trade_size)
    qty_x[i] <- prev_x_qty
    qty_y[i] <- -trade_size
    position <- -1
  }

  if(signal[i] == 1 && position == -1) {
    # we are short the spread and need to buy
    qty_x[i] <- -(round(beta[i] * trade_size) + prev_x_qty)
    prev_x_qty <- round(beta[i] * trade_size)
    qty_y[i] <- 2 * trade_size
    position <- 1
  }

  if(signal[i] == -1 && position == 1) {
    # we are long the spread and need to sell
    qty_x[i] <- round(beta[i] * trade_size) + prev_x_qty
    prev_x_qty <- round(beta[i] * trade_size)
    qty_y[i] <- -2 * trade_size
    position <- -1
  }
}

I think it should be:

for(i in 1:length(signal)) {
  if(signal[i] == 1 && position == 0) {
    # buy the spread
    prev_x_qty <- round(beta[i] * trade_size)
    qty_x[i] <- prev_x_qty
    qty_y[i] <- -trade_size
    position <- 1
  }

  if(signal[i] == -1 && position == 0) {
    # sell the spread initially
    prev_x_qty <- round(beta[i] * trade_size)
    qty_x[i] <- -prev_x_qty
    qty_y[i] <- trade_size
    position <- -1
  }

  if(signal[i] == 1 && position == -1) {
    # we are short the spread and need to buy
    qty_x[i] <- (round(beta[i] * trade_size) + prev_x_qty)
    prev_x_qty <- round(beta[i] * trade_size)
    qty_y[i] <- -2 * trade_size
    position <- 1
  }

  if(signal[i] == -1 && position == 1) {
    # we are long the spread and need to sell
    qty_x[i] <- -(round(beta[i] * trade_size) + prev_x_qty)
    prev_x_qty <- round(beta[i] * trade_size)
    qty_y[i] <- 2 * trade_size
    position <- -1
  }
}

Linus

hgeorgako commented 7 years ago

Thank you for this clarification. Can you please open a pull request on my GitHub repository for this fix?

linus791025 commented 7 years ago

Hi Harry,

My apologies: your original code on p.133 and p.134 is correct after all. I will look into this example and test it with more pairs.

The first time I tried your method on different pairs of assets, I got nice results with the code I modified yesterday, which gave me the illusion that I might be right. I now see that my logic was wrong. Let me explain my understanding of the example shown on p.127 to p.136.

We are going to trade the spread between an ETF and a stock, namely SPY and AAPL. The regression model looks like this:

y_t = beta_{t-1} * x_t + e_t

where y_t is the price of SPY, x_t is the price of AAPL, e_t is an error term, which should be normally distributed if the pair is cointegrated, and beta_{t-1} is the lagged coefficient on x_t, estimated in this example by a rolling regression.

I have a question here: the beta in this example is calculated by the code below.

beta_out_of_sample <- rolling_beta(diff(dF), 10)

The spread looks much closer to normally distributed when I estimate the rolling beta on price levels, without differencing the prices. Maybe you can check this part.
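To make the comparison concrete, here is a rough sketch of a trailing-window regression slope fitted either on price levels or on price differences. `rolling_slope` is a hypothetical helper written for this illustration and is not the book's `rolling_beta`; the simulated prices are made up as well.

```r
# Hypothetical helper: slope of y on x (with intercept) over a trailing window.
# This is NOT the book's rolling_beta(); it only illustrates the idea.
rolling_slope <- function(x, y, window) {
  n <- length(x)
  out <- rep(NA_real_, n)
  for (i in window:n) {
    idx <- (i - window + 1):i
    out[i] <- cov(x[idx], y[idx]) / var(x[idx])
  }
  out
}

set.seed(1)
x <- 100 + cumsum(rnorm(200))        # simulated price of asset x
y <- 2 * x + rnorm(200, sd = 0.5)    # y co-moves with x, true slope 2

beta_levels <- rolling_slope(x, y, 10)              # fitted on price levels
beta_diffs  <- rolling_slope(diff(x), diff(y), 10)  # fitted on price changes

# Inspect how each estimate is distributed around the true slope of 2
print(summary(beta_levels))
print(summary(beta_diffs))
```

Plotting a histogram of the resulting spread under each estimate (for example with `hist()`) is one way to compare the two approaches on real data.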

Back to the original issue. In this example, the spread is the error term. When the spread falls below -threshold (one standard deviation here), we long the spread, which means we long SPY and short beta * AAPL.

buys <- ifelse(data_out$spread < -threshold, 1, 0)

if(signal[i] == 1 && position == -1) {
  # we are short the spread and need to buy
  qty_x[i] <- -(round(beta[i] * trade_size) + prev_x_qty)  # short beta * AAPL
  prev_x_qty <- round(beta[i] * trade_size)
  qty_y[i] <- 2 * trade_size  # long SPY
  position <- 1
}

When the spread rises above the threshold (one standard deviation here), we short the spread, which means we short SPY and long beta * AAPL.

sells <- ifelse(data_out$spread > threshold, -1, 0)

if(signal[i] == -1 && position == 1) {
  # we are long the spread and need to sell
  qty_x[i] <- round(beta[i] * trade_size) + prev_x_qty  # long beta * AAPL
  prev_x_qty <- round(beta[i] * trade_size)
  qty_y[i] <- -2 * trade_size  # short SPY
  position <- -1
}

The same logic can be applied to the other two rules.

if(signal[i] == 1 && position == 0) {
  # buy the spread
  prev_x_qty <- round(beta[i] * trade_size)
  qty_x[i] <- -prev_x_qty
  qty_y[i] <- trade_size
  position <- 1
}

if(signal[i] == -1 && position == 0) {
  # sell the spread initially
  prev_x_qty <- round(beta[i] * trade_size)
  qty_x[i] <- prev_x_qty
  qty_y[i] <- -trade_size
  position <- -1
}
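For anyone who wants to check the four rules on a small input, here is a sketch that wraps them in a function. The function name `compute_quantities`, the use of `else if` (equivalent here to the separate `if` blocks, since at most one rule can fire per step), and the toy inputs are all assumptions made for illustration.

```r
# Sketch of the four p.133 rules, wrapped in a hypothetical function
compute_quantities <- function(signal, beta, trade_size) {
  n <- length(signal)
  qty_x <- rep(0, n)
  qty_y <- rep(0, n)
  position <- 0
  prev_x_qty <- 0

  for (i in seq_len(n)) {
    if (signal[i] == 1 && position == 0) {
      # flat -> long the spread: buy y, sell beta * x
      prev_x_qty <- round(beta[i] * trade_size)
      qty_x[i] <- -prev_x_qty
      qty_y[i] <- trade_size
      position <- 1
    } else if (signal[i] == -1 && position == 0) {
      # flat -> short the spread: sell y, buy beta * x
      prev_x_qty <- round(beta[i] * trade_size)
      qty_x[i] <- prev_x_qty
      qty_y[i] <- -trade_size
      position <- -1
    } else if (signal[i] == 1 && position == -1) {
      # short -> long: close the old x leg and open the new one in one trade
      qty_x[i] <- -(round(beta[i] * trade_size) + prev_x_qty)
      prev_x_qty <- round(beta[i] * trade_size)
      qty_y[i] <- 2 * trade_size
      position <- 1
    } else if (signal[i] == -1 && position == 1) {
      # long -> short: close the old x leg and open the new one in one trade
      qty_x[i] <- round(beta[i] * trade_size) + prev_x_qty
      prev_x_qty <- round(beta[i] * trade_size)
      qty_y[i] <- -2 * trade_size
      position <- -1
    }
  }
  data.frame(qty_x = qty_x, qty_y = qty_y)
}

# Toy inputs (hypothetical): enter long, flip short, flip long again
res <- compute_quantities(signal = c(0, 1, 0, -1, 1),
                          beta = c(1.5, 1.5, 1.6, 2, 1),
                          trade_size = 100)
print(res)
print(cumsum(res$qty_y))  # running holding in y
```

The running sum of `qty_y` alternates between +trade_size and -trade_size, which is what "long the spread" and "short the spread" mean for the y leg under the original rules.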

So the original code looks reasonable. Only the definitions of the buy and sell signals need to be revised.

In my opinion, this strategy is profitable if we can find a cointegrated pair. In other words, we need to be able to estimate the model well and get a normally distributed spread. Otherwise, if the spread between the two assets doesn't converge in the long run, we will face a big loss.

I will look into this example and post a pull request to share what I get. I really appreciate your quick response. This book is really helpful for me to clarify some concepts.

Furthermore, here is a paper on pairs trading that I read recently and found to be good supplementary material: "Statistical arbitrage pairs trading strategies: Review and outlook" (https://ideas.repec.org/p/zbw/iwqwdp/092015.html)

Thank you! Linus

tsungwu commented 7 years ago

Dear Linus,

One more problem. At the bottom of page 135, the last line of the function compute_equity_curve() defines equity as

equity <- cumulative_sell - cumulative_buy + position * price

Because "sell" means "short", shouldn't it be

equity <- cumulative_buy - cumulative_sell + position * price

instead?

linus791025 commented 7 years ago

Hi tsungwu,

The original code looks fine to me. When we calculate profit and loss, it is always based on selling price (market price) minus buying price (cost).

As a result, I think the original code for this part is correct. I have already tested this part.
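A small worked example may help here. This is only a sketch for a single asset with made-up trades; apart from the names in the quoted line (`cumulative_sell`, `cumulative_buy`, `position`, `price`, `equity`), everything is an assumption and not the book's `compute_equity_curve()`.

```r
# Toy trades in one asset: positive qty = buy, negative qty = sell (or short)
qty   <- c(10, 0, -10, -5, 5)
price <- c(100, 110, 120, 115, 105)

cumulative_buy  <- cumsum(ifelse(qty > 0,  qty * price, 0))  # cash paid out
cumulative_sell <- cumsum(ifelse(qty < 0, -qty * price, 0))  # cash received
position <- cumsum(qty)                                      # units held

# Cash received minus cash paid, plus the open position marked to market
equity <- cumulative_sell - cumulative_buy + position * price
print(equity)
```

The first round trip (buy at 100, sell at 120) and the later short (sell at 115, cover at 105) both add to equity, so the short side needs no extra sign flip when the formula works on price levels.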

Two other questionable things in this chapter are the use of a rolling beta and the way the beta is calculated. When I calculated the betas without differencing the prices, the spread looked more normally distributed. However, a trading strategy based on a rolling beta does not seem robust to me if beta changes drastically.

For example, suppose you buy the spread on day 1 with beta equal to 1.5 and get a sell signal on day 10. If the rolling beta has become 3 by day 10, you will face a big loss, because the spread doesn't converge. Maybe my explanation is not that clear, but I think it is worth thinking about.

Anyway, the concept in this chapter is still helpful. But if we don't have sufficient background knowledge in this field, it is not easy to understand the logic behind the code. The best way to gain more pair trading ideas is to read some papers in this field. There are different ways to fit the model such as state space model, Kalman filter, and PCA.

tsungwu commented 7 years ago

I got it. I had been calculating net profit from price differences, so a short position had to take a negative sign to accumulate revenue. An equity curve, as a portfolio-level view, usually uses movements in price levels to show profit and loss.

For me, the rolling regression is only a signal for going long or short; I usually keep the hedge ratio constant. When a signal appears, I never follow the time-varying betas to adjust the position. For example, I usually take a hedge ratio of 0.6 and hold a 10:6 position, 10 units of Y against 6 units of X: Spread = 10*Y - 6*X.
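A minimal sketch of the fixed-ratio spread described above. The prices for `Y` and `X` are made up, and the variable names are assumptions for illustration.

```r
# Hypothetical prices for the two legs
Y <- c(50.0, 50.5, 51.2, 50.8)
X <- c(80.0, 80.4, 81.9, 82.0)

hedge_ratio <- 0.6                 # held constant, not re-estimated
units_y <- 10
units_x <- units_y * hedge_ratio   # 6 units of X per 10 units of Y

# Value of the fixed 10:6 portfolio at each point in time
spread <- units_y * Y - units_x * X
print(spread)
```

Because the quantities never change, the spread being monitored is always the value of the portfolio actually held, which avoids the mismatch that a drifting rolling beta can introduce.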