tomwenseleers opened this issue 5 years ago
Yes, it's funny you brought this up just now. I have a big revision of these notes almost ready that has been lingering around for ages.
As you point out the starting values are a big issue, also related to convergence problems in some cases.
I expect the updated notes to get posted soon, perhaps today...
On 8/8/18, Tom Wenseleers notifications@github.com wrote:
I noticed that all algorithms listed are significantly slower than the inbuilt glm.fit. Why is this? Does glm.fit use better starting values perhaps? Or is this due to the use of .Call(C_Cdqrls, ...) in glm.fit as opposed to the use of solve() in the irls code?
-- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/bwlewis/GLM/issues/4
Many thanks for this - I found your notes to be really helpful!
Interestingly though, for a very small problem I was testing just now (3 variables, 1000 cases, Poisson error) I found the irls function still to be about 4x slower than the inbuilt glm.fit, even if I specify identical sensible starting values for both. I thought this speed difference could come from the difference between
x = solve(crossprod(A, W*A), crossprod(A, W*z), tol=2*.Machine$double.eps)
in the irls function vs.
x = .Call(C_Cdqrls, A * W, z * W, min(1e-7, control$epsilon/1000), check=FALSE)
in glm.fit, but I'm not sure (I'm running Microsoft Open R 3.5.0 on Windows with the Intel MKL optimized BLAS loaded)... So although differences in the choice of starting values may explain some of the discrepancy in timings, that's probably still not the whole story...
Looking forward to your update!
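Incidentally, since glm.fit itself accepts etastart/mustart arguments, the effect of starting values can be isolated directly. A quick sketch (the data and iteration counts here are illustrative only):

```r
# Compare glm.fit's default family-based initialization against a cruder
# user-supplied starting point; the fitted coefficients should agree.
set.seed(7)
X <- cbind(1, matrix(rnorm(1000 * 2), 1000))
y <- rpois(1000, exp(X %*% c(0.5, -0.3, 0.2)))
f1 <- glm.fit(X, y, family = poisson())                           # default start
f2 <- glm.fit(X, y, family = poisson(), etastart = rep(0, 1000))  # crude start
c(default_iters = f1$iter, crude_iters = f2$iter)   # iteration counts may differ
max(abs(f1$coefficients - f2$coefficients))          # but the fit is the same
```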
I don't have time to finish the new write up today, but try this new code attached (as a gzipped file, that's all that GitHub would accept somehow). It's a very nearly complete new reference implementation using the SVD approach. On modern systems linked to a decent BLAS library it should be faster than glm.fit for large problems. For example, on my quad-core AMD home PC I ran this quick and dirty test:
source("glm.svd.r")
x = matrix(rnorm(10000 * 1000), 10000)
y = sample(c(0,1), 10000, replace=TRUE)
system.time({m=glm.fit(x,y,family=binomial())})
# user system elapsed
# 40.588 0.120 40.687
system.time({m1=glm.svd(x,y,family=binomial())})
# user system elapsed
# 17.724 0.968 5.335
This code will be posted to the repository as soon as I can finish the new write up, which is quite involved...
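The core idea of an SVD-based step, as I understand it, is to solve each weighted least squares problem through the singular value decomposition of the row-weighted design. A hedged sketch of one such step (the actual glm.svd code may organize the computation differently, e.g. to reuse work across iterations):

```r
# Sketch of solving a single IRLS weighted least squares step via the SVD
# (illustrative only; not taken from glm.svd itself).
set.seed(3)
A <- cbind(1, matrix(rnorm(500 * 4), 500))   # synthetic design
z <- rnorm(500)                              # synthetic working response
W <- runif(500)                              # working weights for this iteration
sw <- sqrt(W)
s <- svd(A * sw)                             # thin SVD of the row-weighted design
x_svd <- s$v %*% (crossprod(s$u, z * sw) / s$d)   # V diag(1/d) U' (sqrt(W) z)
x_ref <- solve(crossprod(A, W * A), crossprod(A, W * z))
max(abs(x_svd - x_ref))                      # identical WLS solution up to rounding
```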
Ha that's great - many thanks for that! I see you use the same initialization strategy now as in R's implementation, using eval(family$initialize)! On my 4 core laptop with Intel MKL loaded this is indeed 6 times faster than glm.fit (finishing in 4s vs 25s for glm.fit)! That's really great - thanks a lot!
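For anyone wondering what eval(family$initialize) actually does: the family object carries an unevaluated initialize expression that expects y, nobs, weights and etastart to exist in the calling scope, and creates mustart (and n) there. A small sketch with a Poisson family:

```r
# The initialize expression is evaluated for its side effects in this scope.
fam <- poisson()
y <- rpois(100, lambda = 3)
nobs <- length(y)             # the initialize expression looks these up...
weights <- rep(1, nobs)
etastart <- NULL
eval(fam$initialize)          # ...and assigns mustart (and n) here
head(mustart)                 # for the Poisson family this is y + 0.1
```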
Might be worth changing the irls function that you lay out to something like this, i.e. with correct initialization:
irls = function(X, y, weights=rep(1,nrow(X)), family=poisson(log), maxit=25, tol=1e-16) {
  if (!is(family, "family")) family = family()
  variance = family$variance
  linkinv = family$linkinv
  mu.eta = family$mu.eta
  etastart = NULL
  nobs = nrow(X)    # needed by the initialize expression below
  nvars = ncol(X)   # needed by the initialize expression below
  eval(family$initialize)   # initializes n and the fitted values mustart
  eta = family$linkfun(mustart)   # we then initialize eta with this
  dev.resids = family$dev.resids
  dev = sum(dev.resids(y, linkinv(eta), weights))
  devold = 0
  beta_old = rep(1, nvars)
  for (j in 1:maxit)
  {
    mu = linkinv(eta)
    varg = variance(mu)
    gprime = mu.eta(eta)
    z = eta + (y - mu) / gprime   # potentially -offset if you would have an offset argument as well
    W = weights * as.vector(gprime^2 / varg)
    beta = solve(crossprod(X,W*X), crossprod(X,W*z), tol=2*.Machine$double.eps)
    eta = X %*% beta   # potentially +offset if you would have an offset argument as well
    dev = sum(dev.resids(y, linkinv(eta), weights))   # deviance at the updated eta
    if (abs(dev - devold) / (0.1 + abs(dev)) < tol) break
    # or: if (norm(beta - beta_old, "2") < tol) break
    devold = dev
    beta_old = beta
  }
  list(coefficients=beta, iterations=j)
}
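And a quick self-contained sanity check of this scheme against glm.fit on simulated Poisson data (the loop above condensed inline, with the Poisson mustart = y + 0.1 initialization hard-coded in place of eval(family$initialize)):

```r
# Condensed IRLS run with family-based initialization; the final coefficients
# should match glm.fit to near machine precision on this simulated data.
set.seed(42)
X <- cbind(1, matrix(rnorm(1000 * 2), 1000))
y <- rpois(1000, exp(X %*% c(0.5, -0.3, 0.2)))
fam <- poisson()
eta <- fam$linkfun(y + 0.1)   # Poisson initialization, hard-coded
devold <- 0
for (j in 1:25) {
  mu <- fam$linkinv(eta)
  gprime <- fam$mu.eta(eta)
  z <- eta + (y - mu) / gprime
  W <- as.vector(gprime^2 / fam$variance(mu))
  beta <- solve(crossprod(X, W * X), crossprod(X, W * z))
  eta <- X %*% beta
  dev <- sum(fam$dev.resids(y, fam$linkinv(eta), rep(1, 1000)))
  if (abs(dev - devold) / (0.1 + abs(dev)) < 1e-12) break
  devold <- dev
}
max(abs(c(beta) - glm.fit(X, y, family = poisson())$coefficients))
```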