Closed Thie1e closed 1 year ago
Hi Christian,
I am afraid the constraints you describe are not supported in trackingPortfolio.
But a similar example is described in the NMOF book, in the chapter on portfolio optimization: minimize variance of a portfolio with a cardinality constraint. (If you can access sciencedirect, you can get the chapter from there. The relevant section is called A simple hybrid: Local Search and QP). The code is in the GitLab repository, starting here: https://gitlab.com/NMOF/NMOF2-Code/-/blob/master/14_Portfolio_optimization/R/Portfolio_Optimization.R#L912
In fact, the code in the vignette should get you started, too: as you describe it, your problem becomes straightforward once the assets are selected. So I'd run a local-search algorithm [TAopt would be my preferred choice] that selects assets, and then use the selected assets as inputs for trackingPortfolio. Essentially, it means to call trackingPortfolio in the objective function. If you don't want zero-weights, you can enforce a reasonable minimum weight.
Hope that helps, Enrico
Hi Enrico,
thank you for your quick and helpful answer. I will go through the chapter you mentioned. The example application there (the hybrid) looks indeed quite similar to mine.
The second approach (TAopt and trackingPortfolio with non-zero weights) should work, too. Let me play around with these approaches and I will then get back to you here.
Best, Christian
Hi Enrico,
here is my solution using TAopt and trackingPortfolio in the objective function. Does this look OK? I could share some of the raw data privately but can't post it here.
library(NMOF)

# Data (a list with y, X and p), wmin, wmax and n_stocks are assumed
# to be defined beforehand; the raw data cannot be posted here.

# Objective function:
# Tracking error after calculating weights for tracking a benchmark
OF_track <- function(x, Data) {
  # first column is the benchmark, the rest are the selected stocks
  returns <- cbind(Data$y, Data$X[, x])
  cov_ret <- cov(returns)
  sol.ls <- trackingPortfolio(var = cov_ret, R = returns,
                              wmax = wmax, wmin = wmin, method = "ls")
  port_ret <- Data$X[, x] %*% sol.ls
  return(sd(Data$y - port_ret)) # Tracking Error
}

# keep cardinality (number of stocks) constant:
# swap one selected stock for one unselected stock
neighbour_cardi <- function(x, Data) {
  Ts <- which(x)
  Fs <- which(!x)
  lenTs <- length(Ts)
  O <- sample.int(lenTs, 1L)
  I <- sample.int(Data$p - lenTs, 1L)
  x[c(Fs[I], Ts[O])] <- c(TRUE, FALSE)
  x
}

# Generate a random solution for fixed number of stocks and then run TAopt
x0 <- c(rep(FALSE, Data$p - n_stocks), rep(TRUE, n_stocks))
x0 <- sample(x0, replace = FALSE)
algo <- list(nT = 7L,   ## number of thresholds
             nS = 30L,  ## number of steps per threshold
             nD = 200L, ## number of random steps to compute thresholds
             neighbour = neighbour_cardi,
             x0 = x0,
             printBar = TRUE)
message("Starting TAopt...")
sol1 <- TAopt(OF_track, algo = algo, Data = Data)

# Solution:
print(paste("Best solution from TAopt:", round(sol1$OFvalue, 4)))
which(sol1$xbest) ## the selected regressors

# Calculate weights using trackingPortfolio for best solution
returns <- cbind(Data$y, Data$X[, sol1$xbest])
cov_ret <- cov(returns)
sol.ls <- trackingPortfolio(var = cov_ret, R = returns,
                            wmax = wmax, wmin = wmin, method = "ls")
To test the tracking with that solution, I ran a time series cross validation from 2001 - 2021 for the Russell 1000. The index value should be tracked by a basket of 100 stocks. The stock universe consisted of the stocks that were part of the Russell 1000 at the respective times (no survivorship bias, hopefully).
The training set was a moving window of the last 500 trading days. Each portfolio was tested on the subsequent 40 trading days, and the window was then moved forward by 40 days. Over the period from roughly 2001 to 2021, this resulted in 112 training 'slices'.
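The moving-window scheme described above could be sketched roughly as follows. This is only an illustration in base R: the helper name make_slices and the toy number of observations are assumptions, not the code used for the backtest.

```r
# Hypothetical helper: index ranges for rolling train/test windows.
# train_len and test_len follow the description above (500 and 40 days).
make_slices <- function(n_obs, train_len = 500L, test_len = 40L) {
  starts <- seq(1L, n_obs - train_len - test_len + 1L, by = test_len)
  lapply(starts, function(s) {
    list(train = s:(s + train_len - 1L),
         test  = (s + train_len):(s + train_len + test_len - 1L))
  })
}

# Toy example with 1000 observations (an assumption, not the real data)
slices <- make_slices(n_obs = 1000L)
length(slices)           # number of train/test slices
range(slices[[1]]$test)  # first test window
```

Each element of slices holds the row indices for one training window and the test window that follows it, so the selection and weighting step can be run per slice.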
Using trackingPortfolio in the objective function is of course relatively slow. With the lowered numbers in algo the cross validation still took around 6 hours.
This is what the result looks like:
The backtest excludes transaction costs.
What about unadjusted prices - I am not sure if the Russell 1000 includes dividends or not. I assume it does not, so I used unadjusted prices. The tracking seems to be better when using unadjusted prices. But in reality, the investor would of course receive those dividends, so the backtest with adjusted prices should be the more 'realistic' one, right? Of course after also considering costs.
Before I discovered NMOF or TAopt, I used glmnet to regress the returns of the single stocks onto the index returns. With alpha = 1 (lasso regression) I can then pick the penalty parameter such that I get the desired number of stocks in the stock portfolio.
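A rough sketch of this lasso-based selection, assuming the index returns are the response and the stock returns are the predictors. The data here are simulated, and the names X, y, target and idx are made up for illustration; the choice of intercept = FALSE is also an assumption.

```r
library(glmnet)

set.seed(1)
n <- 500; p <- 50
# Simulated stock returns and a hypothetical index built from 10 of them
X <- matrix(rnorm(n * p, sd = 0.02), n, p)
y <- as.vector(X[, 1:10] %*% rep(0.1, 10)) + rnorm(n, sd = 0.001)

# Lasso path (alpha = 1); fit$df holds the number of non-zero
# coefficients for each value of the penalty parameter lambda
fit <- glmnet(X, y, alpha = 1, intercept = FALSE, nlambda = 200)

target <- 10                              # desired number of stocks
idx <- which.min(abs(fit$df - target))    # lambda whose df is closest to target
selected <- which(fit$beta[, idx] != 0)   # indices of the selected stocks
fit$lambda[idx]
```

The path only contains a finite grid of lambda values, so the number of selected stocks may not hit the target exactly; a finer grid (larger nlambda) usually helps.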
The tracking with this method is comparable (?) to the tracking by TAopt and it runs much faster, of course. The above simulation with TAopt took about 6 hours (single threaded). With glmnet it finishes in about 15 minutes. Result using glmnet:
Regarding the tracking error (TE): If I calculate the tracking error as the standard deviation of index returns minus portfolio returns based on the end-of-year values in my simulation, I get:
TE with TAopt (same simulation as the chart above): 6.2%
Ann. return with TAopt: 10.6% (benchmark 10.3%)
TE with glmnet: 3.57%
Ann. return with glmnet: 9.5% (benchmark 10.3%)
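For reference, the tracking-error calculation described above could look roughly like this in base R. The function name tracking_error and the end-of-year values are made up for illustration and are not the backtest results.

```r
# Hypothetical helper: tracking error as the standard deviation of
# index returns minus portfolio returns, computed from value series
tracking_error <- function(port_values, index_values) {
  port_ret  <- diff(port_values)  / head(port_values, -1)
  index_ret <- diff(index_values) / head(index_values, -1)
  sd(index_ret - port_ret)
}

# Made-up end-of-year values (not the backtest data)
index_values <- c(100, 110, 99, 118.8)       # returns: +10%, -10%, +20%
port_values  <- c(100, 112, 103.04, 121.5872) # returns: +12%,  -8%, +18%
te <- tracking_error(port_values, index_values)
te
```

Note that with yearly values the sample is small (about 20 observations for 2001 to 2021), so the estimate is fairly noisy.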
Another observation from my backtests is that the portfolios selected by TAopt seem to be less stable than the ones selected by glmnet. Again, the portfolios consisted of 100 stocks that should track the Russell 1000. With TAopt I get a median of 85 differing tickers between time slices in the simulation (so nearly the whole portfolio gets exchanged every 40 days); with glmnet I get a median of 20 differing tickers. This more stable result from glmnet would of course be important in practice because of transaction costs.
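Counting the differing tickers between consecutive slices can be sketched like this. The tickers and the helper count_changed are hypothetical; the real portfolios hold 100 stocks each.

```r
# Hypothetical helper: number of tickers in the current portfolio
# that were not in the previous one
count_changed <- function(prev, curr) length(setdiff(curr, prev))

# Made-up portfolios for three consecutive slices
portfolios <- list(c("AAA", "BBB", "CCC", "DDD"),
                   c("AAA", "BBB", "CCC", "EEE"),
                   c("AAA", "FFF", "GGG", "EEE"))

# Compare each slice with the one before it
changes <- mapply(count_changed,
                  head(portfolios, -1), tail(portfolios, -1))
changes
median(changes)
```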
Another difference between the models is the distribution of portfolio weights. I put trackingPortfolio into the objective function with a minimum weight of 0.001. This results in the following distribution of portfolio weights (all CV slices combined):
So it seems to me that TAopt sets a lot of weights to the minimum to filter out these stocks as far as possible.
The distribution of weights from glmnet is much smoother:
Sorry that this response has become so long and thank you very much, again, for your answers. So to sum up, my questions are:
[...] TAopt with trackingPortfolio correct?
[...] TAopt, since the tracking error is higher than with glmnet?
Several years ago by now, I also studied econometrics, but I was never concerned with these types of portfolio optimizations. I simply have the feeling that there must be a better way than my glmnet method that I have just 'made up' (although there are some papers using glmnet for portfolio optimization, I think).
Best, Christian
Edit: I have updated the figure and numbers for TAopt. I think I reported the results with adjusted instead of unadjusted prices before. The results are from a backtest with a window of 150 days; let me rerun the test with a window of 40 days overnight...
Here's the backtest based on TAopt and a CV-window of 40 days (trainset still a moving window of the last 500 days):
This time I get a tracking error of 4.7% and a slightly lower ann. return. It seems to me that these differences are somewhat random, caused by the different CV-window lengths.
1) I had a quick look, and it seems okay. (But to be sure, I'd need a code example I can really run.) Is there any particular reason why you used method "ls" in trackingPortfolio? This will be much slower than "qp".
2) Unadjusted prices are what you trade. So it is fine to use them.
3) There is no single answer to that question; but a difference in tracking errors of three percentage points (as between TAopt and glmnet) is huge (provided that the glmnet answers do not violate any constraints). So my guess is that TAopt would need more time to find a good solution. But it's hard to say without a reproducible example.
4) TAopt is a stochastic method, and so repeated runs might have different results. However, this randomness can be made very small, by allowing more iterations. This is discussed a lot in the NMOF book, and also in https://link.springer.com/article/10.1007/s10732-010-9138-y or https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1140655 .
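On point 1), a minimal sketch of the objective function with method "qp" instead of "ls", run on simulated data so that it is reproducible. The simulated returns, the name OF_track_qp and the fields Data$wmin and Data$wmax are assumptions for illustration; as far as I know, the "qp" method also requires the quadprog package to be installed.

```r
library(NMOF)  # method = "qp" is assumed to need the quadprog package

set.seed(42)
# Simulated returns (assumption): column 1 is the benchmark,
# the remaining 30 columns are the candidate stocks
R_all <- randomReturns(na = 1 + 30, ns = 250, sd = 0.02, rho = 0.5)
Data <- list(y = R_all[, 1], X = R_all[, -1], p = 30L,
             wmin = 0.001, wmax = 0.3)

# Tracking error for a selection x (logical vector of length p),
# with the weights computed by quadratic programming
OF_track_qp <- function(x, Data) {
  returns <- cbind(Data$y, Data$X[, x])
  w <- trackingPortfolio(var = cov(returns),
                         wmin = Data$wmin, wmax = Data$wmax,
                         method = "qp")
  sd(Data$y - Data$X[, x] %*% w)
}

# Random starting selection with 10 of the 30 stocks
x0 <- sample(rep(c(TRUE, FALSE), c(10, 20)))
te0 <- OF_track_qp(x0, Data)
te0
```

Such a function could be passed to TAopt in place of the "ls" version; because each evaluation is cheaper, more iterations fit into the same time budget, which also relates to point 4).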
Hi Enrico,
thank you for answering again.
I think I have a better understanding of methods for index tracking now and since I got trackingPortfolio to work in TAopt I am going to close this issue now. Thanks again for your help.
Hi Enrico,
I took some time to go through the help pages as well as the vignettes for NMOF, but I am still not sure if my use case is supported or not.
I am trying to understand how to build a portfolio out of stocks that tracks an index / a benchmark while holding only a certain number of stocks. For example: "Track the S&P 500 as closely as possible using 50 stocks at any time."
With trackingPortfolio I can get the weights, but only after selecting the stocks. Also, some weights will be zero, so that I don't have control over the number of assets in my portfolio. NMOF has a vignette for asset selection, but it assumes equal weighting.
Is there a way to define the number of assets for index tracking? Maybe you could guide me in the right direction.
Thank you very much in advance.
Best, Christian