Open tsungwu opened 7 years ago
Hi tsungwu,
I also noticed this issue. It seems more reasonable to short (long) the spread when it goes higher (lower) than the threshold. Furthermore, to keep the code consistent, the code on page 133 should be reversed. Otherwise, you will get the opposite result.
Originally, the code on p.133 is:
for(i in 1:length(signal)) {
  if(signal[i] == 1 && position == 0) {
    # buy the spread
    prev_x_qty <- round(beta[i] * trade_size)
    qty_x[i] <- -prev_x_qty
    qty_y[i] <- trade_size
    position <- 1
  }
  if(signal[i] == -1 && position == 0) {
    # sell the spread initially
    prev_x_qty <- round(beta[i] * trade_size)
    qty_x[i] <- prev_x_qty
    qty_y[i] <- -trade_size
    position <- -1
  }
  if(signal[i] == 1 && position == -1) {
    # we are short the spread and need to buy
    qty_x[i] <- -(round(beta[i] * trade_size) + prev_x_qty)
    prev_x_qty <- round(beta[i] * trade_size)
    qty_y[i] <- 2 * trade_size
    position <- 1
  }
  if(signal[i] == -1 && position == 1) {
    # we are long the spread and need to sell
    qty_x[i] <- round(beta[i] * trade_size) + prev_x_qty
    prev_x_qty <- round(beta[i] * trade_size)
    qty_y[i] <- -2 * trade_size
    position <- -1
  }
}
I think it should be:
for(i in 1:length(signal)) {
  if(signal[i] == 1 && position == 0) {
    # buy the spread
    prev_x_qty <- round(beta[i] * trade_size)
    qty_x[i] <- prev_x_qty
    qty_y[i] <- -trade_size
    position <- 1
  }
  if(signal[i] == -1 && position == 0) {
    # sell the spread initially
    prev_x_qty <- round(beta[i] * trade_size)
    qty_x[i] <- -prev_x_qty
    qty_y[i] <- trade_size
    position <- -1
  }
  if(signal[i] == 1 && position == -1) {
    # we are short the spread and need to buy
    qty_x[i] <- (round(beta[i] * trade_size) + prev_x_qty)
    prev_x_qty <- round(beta[i] * trade_size)
    qty_y[i] <- -2 * trade_size
    position <- 1
  }
  if(signal[i] == -1 && position == 1) {
    # we are long the spread and need to sell
    qty_x[i] <- -(round(beta[i] * trade_size) + prev_x_qty)
    prev_x_qty <- round(beta[i] * trade_size)
    qty_y[i] <- 2 * trade_size
    position <- -1
  }
}
Linus
Thank you for this clarification. Can you please post a pull request on my github account for this fix?
(In reply to linus791025's comment of Feb 7, 2017, 6:12 AM, shown in full above.)
Hi Harry,
My apologies: your original code on p.133 and p.134 appears to be correct. I will look into this example and test it with more pairs.
The first time I tried your method on a different pair of assets, I got nice results with the code I modified yesterday, which gave me the illusion that I might be correct. I have now found that my logic was wrong. Let me explain my understanding of the example shown from p.127 to p.136.
We are going to trade the spread between an ETF and a stock, namely SPY and AAPL. The regression model looks like this:
y_t = beta_{t-1} * x_t + e_t
where y_t is the price of SPY, x_t is the price of AAPL, e_t is an error term, which should be normally distributed if the pair is cointegrated, and beta_{t-1} is the lagged coefficient on x_t, estimated by a rolling regression in this example.
beta_out_of_sample <- rolling_beta(diff(dF), 10)
The spread looks much closer to normally distributed when I estimate the rolling beta without differencing the prices. Maybe you can check this part.
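To make the comparison above easier to reproduce, here is a minimal sketch of a rolling, no-intercept regression beta computed on price levels and on differenced prices. The helper rolling_beta_sketch, the simulated series, and the window of 10 are assumptions for illustration only; this is not the book's rolling_beta().

```r
# Hypothetical helper: trailing-window OLS slope of y on x with no intercept,
# matching the model y_t = beta * x_t + e_t. Not the book's rolling_beta().
rolling_beta_sketch <- function(x, y, width) {
  n <- length(x)
  beta <- rep(NA_real_, n)
  for (i in seq(width, n)) {
    idx <- (i - width + 1):i
    beta[i] <- sum(x[idx] * y[idx]) / sum(x[idx]^2)
  }
  beta
}

set.seed(1)
x <- cumsum(rnorm(200)) + 100   # simulated price of the x asset
y <- 2 * x + rnorm(200)         # simulated price of the y asset (true slope 2)

beta_levels <- rolling_beta_sketch(x, y, 10)              # on price levels
beta_diffs  <- rolling_beta_sketch(diff(x), diff(y), 10)  # on differenced prices

# Lag beta by one step so the spread at time t only uses data up to t - 1.
beta_lag <- c(NA, head(beta_levels, -1))
spread <- y - beta_lag * x
```

Plotting hist(spread) for each choice of beta is one way to compare how the two spreads are distributed.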
Back to the original issue. In this example, the spread is the error term. When the spread falls below -threshold (the threshold is one std here), we long the spread, which means we long SPY and short beta * AAPL.
# buys <- ifelse(data_out$spread < -threshold, 1, 0)
if(signal[i] == 1 && position == -1) {
  qty_x[i] <- -(round(beta[i] * trade_size) + prev_x_qty)  # short beta * AAPL
  prev_x_qty <- round(beta[i] * trade_size)
  qty_y[i] <- 2 * trade_size  # long SPY
  position <- 1
}
When the spread rises above the threshold (one std here), we short the spread, which means we short SPY and long beta * AAPL.
# sells <- ifelse(data_out$spread > threshold, -1, 0)
if(signal[i] == -1 && position == 1) {
  qty_x[i] <- round(beta[i] * trade_size) + prev_x_qty  # long beta * AAPL
  prev_x_qty <- round(beta[i] * trade_size)
  qty_y[i] <- -2 * trade_size  # short SPY
  position <- -1
}
The same logic can be applied to the other two rules.
if(signal[i] == 1 && position == 0) {
prev_x_qty <- round(beta[i] * trade_size)
qty_x[i] <- -prev_x_qty
qty_y[i] <- trade_size
position <- 1
}
if(signal[i] == -1 && position == 0) {
prev_x_qty <- round(beta[i] * trade_size)
qty_x[i] <- prev_x_qty
qty_y[i] <- -trade_size
position <- -1
}
So the original code looks reasonable; only the definitions of the buy and sell signals need to be revised.
In my opinion, this strategy is profitable if we can find a cointegrated pair. In other words, we can estimate the model well and get a normally distributed spread. Otherwise, if the spread between the two assets doesn't converge in the long run, we will face a big loss.
I will look into this example and post a pull request to share what I get. I really appreciate your quick response. This book is really helpful for me to clarify some concepts.
Furthermore, this is what I read recently regarding pair trading and I found it is a good supplementary material: "Statistical arbitrage pairs trading strategies: Review and outlook" (https://ideas.repec.org/p/zbw/iwqwdp/092015.html)
Thank you! Linus
Dear Linus: One more problem. At the bottom of page 135, in the function compute_equity_curve(), the last line defines equity as "equity <- cumulative_sell - cumulative_buy + position * price". Because "sell" means "short", shouldn't it be "equity <- cumulative_buy - cumulative_sell + position * price"?
Hi tsungwu,
The original code looks fine to me. When we calculate profit and loss, it is always based on selling price (market price) minus buying price (cost).
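A small sketch may help show why the sign convention works for both long and short trades. Only the last line of the function follows the formula quoted from p.135; the definitions of cash_buy and cash_sell and the name equity_curve_sketch are my assumptions, not a quote from the book.

```r
# Sketch only: cash_buy and cash_sell are assumed definitions.
equity_curve_sketch <- function(qty, price) {
  cash_buy  <- ifelse(qty > 0,  qty * price, 0)   # cash paid out on buys
  cash_sell <- ifelse(qty < 0, -qty * price, 0)   # cash received on sells
  position <- cumsum(qty)                         # units currently held
  cumulative_buy  <- cumsum(cash_buy)
  cumulative_sell <- cumsum(cash_sell)
  # Cash received minus cash paid, plus the open position marked to market.
  cumulative_sell - cumulative_buy + position * price
}

# Long trade: buy 10 at 100, hold at 105, sell at 110.
long_equity <- equity_curve_sketch(c(10, 0, -10), c(100, 105, 110))
print(long_equity)   # 0 50 100

# Short trade: sell 10 at 100, hold at 95, buy back at 90.
short_equity <- equity_curve_sketch(c(-10, 0, 10), c(100, 95, 90))
print(short_equity)  # 0 50 100
```

In both cases the equity rises with the profit, because a short shows up as cash received first and a negative position marked at the current price.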
Two other questionable things in this chapter are the rolling beta and the way beta is calculated. When I calculated the betas without differencing the prices, the spread looked more normally distributed. However, the trading strategy using a rolling beta seems non-robust to me if beta changes drastically!
For example, suppose you buy a spread on day 1 when beta equals 1.5 and get a sell signal on day 10. If the rolling beta has become 3 by day 10, you will face a big loss because the spread doesn't converge. Maybe my explanation is not that clear, but I think it is worth thinking about.
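One way to read this concern is to trace the quantities that the position rules from p.133 would generate. The sketch below assumes a trade_size of 100 (a made-up number) and uses the betas from the example above.

```r
trade_size <- 100    # assumed size, for illustration only
beta_day1  <- 1.5
beta_day10 <- 3

# Day 1: buy signal while flat, so long y and short beta * x.
prev_x_qty <- round(beta_day1 * trade_size)
qty_x_day1 <- -prev_x_qty
qty_y_day1 <- trade_size

# Day 10: sell signal while long the spread, so reverse both legs.
qty_x_day10 <- round(beta_day10 * trade_size) + prev_x_qty
qty_y_day10 <- -2 * trade_size

net_x <- qty_x_day1 + qty_x_day10   # net x position after the flip
net_y <- qty_y_day1 + qty_y_day10   # net y position after the flip
print(c(net_x = net_x, net_y = net_y))
```

The position held from day 1 to day 10 was 100 y against -150 x, so its profit or loss tracks y - 1.5 * x, while the sell signal on day 10 comes from a spread built with beta = 3. The signal spread reaching its threshold therefore says little about whether the held position has converged.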
Anyway, the concept in this chapter is still helpful. But without sufficient background knowledge in this field, it is not easy to understand the logic behind the code. The best way to gain more pair trading ideas is to read some papers in this field. There are different ways to fit the model, such as state space models, the Kalman filter, and PCA.
I got it. I calculate net profit by differencing prices, so a short position has to take a negative sign to accumulate revenue. For the equity "curve", a portfolio approach usually uses price-level movements to show profit and loss.
For me, the rolling regression is only a signal for going long or short; I usually keep the hedge ratio constant. When a signal appears, I never follow the time-varying betas to adjust the position. For example, I usually take a hedge ratio of 0.6 and a 10:6 position, i.e. 10 units of Y against 6 units of X: Spread = 10Y - 6X.
On page 131, in the chunk "#Generate sell and buy signals":
buys <- ifelse(data_out$spread > threshold, 1, 0)
sells <- ifelse(data_out$spread < -threshold, -1, 0)
Shouldn't it be:
sells <- ifelse(data_out$spread > threshold, -1, 0)
buys <- ifelse(data_out$spread < -threshold, 1, 0)
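The convention proposed here, which the replies above also arrive at (buy the spread when it is below -threshold, sell it when it is above threshold), can be sketched as follows. The simulated spread stands in for data_out$spread, and combining the two vectors into signal is an assumption about how they are used downstream.

```r
set.seed(42)
spread <- rnorm(250)       # simulated stand-in for data_out$spread
threshold <- sd(spread)    # one standard deviation, as in the thread

buys  <- ifelse(spread < -threshold,  1, 0)   # spread is low: buy it
sells <- ifelse(spread >  threshold, -1, 0)   # spread is high: sell it
signal <- buys + sells                        # 1 = buy, -1 = sell, 0 = no signal

print(table(signal))
```

With this definition, a signal of 1 means long y and short beta * x, which is what the position rules on p.133 assume.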