Closed katzfuss closed 5 years ago
hm... it seems to me this is already happening. https://github.com/katzfuss-group/GPvecchia/blob/25a7b6b/src/Matern.cpp is there something that makes you think it doesn't work?
Is there a minus sign in lines 36 and 47?
Did you test whether it actually speeds things up? With, say n=40,000 and m=20, createU should be much faster for smoothness = 0.5 or 1.5 than for, say smoothness=.51 or 1.51.
@dzilber : Please also make sure that this speeds things up for VL.
Minus sign fixed. I did test it and it seems much faster, as expected. Here is the code
#### Matern.cpp timing test
library(Rcpp)
sourceCpp('/home/marcin/GPvecchia/GPvecchia/src/Matern.cpp')
set.seed(1988)
spatial.dim=2 # number of spatial dimensions
n=10000 # number of observed locs
# simulate locations
locs = cbind(runif(n),runif(n))
# define cov. function
covfun <- function(locs, covparms){
sig2*MaternFun(fields::rdist(locs),covparms)
}
# quick evaluation
sig2=1; range=.1; smooth=1.5
me.var = 1e-8
covparms =c(sig2,range,smooth)
t0 = proc.time()
Sig = covfun(locs, covparms)
fast = proc.time() - t0
cat("fast version: \n")
print(fast)
# slow evaluation
sig2=1; range=.1; smooth=1.2
me.var = 1e-8
covparms =c(sig2,range,smooth)
t0 = proc.time()
Sig = covfun(locs,covparms)
slow = proc.time() - t0
cat("slow version: \n")
print(slow)
and here is the output:
> source('~/GPvecchia/GPvecchia/tests/Matern-timing.r')
fast version:
user system elapsed
9.102 0.791 9.894
slow version:
user system elapsed
145.254 0.814 145.920
The minus signs were in place because the code was not giving the correct result.
source("../GPvecchia/server/importer.R")
#Generate 1D data
n = 100
locs=matrix(runif(n),ncol=1)
covparms=c(1,.2,1.5)
Om0 <- Matern(fields::rdist(locs),
range=covparms[2],
smoothness=covparms[3])
y=as.numeric(t(chol(Om0))%*%rnorm(n))+1
nuggets=rep(.1,n)
z = rnorm(n,y,sqrt(nuggets))
# Approximate posterior
m=10
vecchia.approx = vecchia_specify(locs,m)
preds.1p5 = vecchia_prediction(z,vecchia.approx,covparms,nuggets)
covparms=c(1,.2,1.499)
preds.1p499 = vecchia_prediction(z,vecchia.approx,covparms,nuggets)
# Plot of predictions
loc_ord=order(locs)
par(mfrow=c(1,2))
plot(locs, z, main = "Predictions")
lines(locs[loc_ord], preds.1p5$mu.obs[loc_ord], type="l", col=3)
lines(locs[loc_ord], preds.1p499$mu.obs[loc_ord], type="l", col=2)
legend("topright", legend=c("z", "1.5", "1.499"),col = c(1,3,2),
lty = c(NA,1,1), pch = c(1, NA, NA))
@dzilber I remember I fixed this minus (and indeed it's not there) so can we close this? On the other hand The section U_NZentries
corresponding to the Matern function has been commented out. Did you do it? If not are you aware of the reason why this is commented out? If neither, I will go ahead and uncomment it
Something is still not right with smooth=1.5. See the following code. Approximation is great for any smooth value except for 1.5:
# parameters
smooth=1.5
covparms=c(1,.3,smooth) # variance,range,smoothness
nugget=1^2
# locations
n=10^2
grid.oneside=seq(0,1,length=sqrt(n))
locs=as.matrix(expand.grid(grid.oneside,grid.oneside))
# vecchia approximation
m=30
vecchia.approx=vecchia_specify(locs,m,cond.yz='y')
U.obj=createU(vecchia.approx,covparms,nugget)
U=U.obj$U
Sig.y.inv.hat=as.matrix(Matrix::tcrossprod(U[U.obj$latent,U.obj$latent]))
# exact covariance
locs.ord=vecchia.approx$locsord
Sig.y=covparms[1]*Matern(rdist(locs.ord),range=covparms[2],smoothness=covparms[3])
Sig.y.inv=solve(Sig.y)
# plot
plot(as.numeric(Sig.y.inv),as.numeric(Sig.y.inv.hat))
abline(c(0,1))
So I looked at the source code of fields and there seems to be a discrepancy between the formulas found in Stein and the ones used by fields
. I verified that for smoothness=1.5 the functions are proportional to each other but not equal. I tried reconciling it using other sources but each one seems to give a slightly different version of the function.
However, I wonder if we should follow what fields
does (see here: https://github.com/NCAR/fields/blob/master/fields/R/Matern.R) and always use the Bessel function. Admittedly, it is somewhat slower, but even for a 1000 x 1000 matrix, the overall time is not overwhelming either way.
nrep = 100
covparms[3] = 1.3
D = matrix(runif(1000000), ncol=1000)
start = proc.time()
for(rep in seq(nrep)){
MaternFun(D, covparms)
}
bessel.time = (proc.time() - start)/nrep
covparms[3] = 1.5
start = proc.time()
for(rep in seq(nrep)){
MaternFun(D, covparms)
}
poly.time = (proc.time() - start)/nrep
print(paste("Avg. time using the full formula: ", bessel.time[3], sep=""))
print(paste("Avg. time using the polynomial formula: ", poly.time[3], sep=""))
The output is
[1] "Avg. time using the full formula: 1.66491"
[1] "Avg. time using the polynomial formula: 0.115870000000004"
I will keep trying to find the right formula though
Let's discuss during our next meeting. But please note that the Bessel formula is more than 10 times slower, which is huge!!
alright, I was able to figure out the constants and tested it using the code provided. It seems to work now. @katzfuss if you have a spare moment, could you please verify on your end? Thanks!
You should think about what the use case is for having the matern function go faster for specific values of the smoothness. You can always just write separate covariance functions for the special cases.
On Thu, Jul 11, 2019 at 12:18 PM Marcin Jurek notifications@github.com wrote:
alright, I was able to figure out the constants and tested it using the code provided. It seems to work now. @katzfuss https://github.com/katzfuss if you have a spare moment, could you please verify on your end? Thanks!
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/katzfuss-group/GPvecchia/issues/56?email_source=notifications&email_token=ABWF7G3NHGOULEGQV4PC5RDP65MM7A5CNFSM4HM5WWA2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODZXHB7Q#issuecomment-510554366, or mute the thread https://github.com/notifications/unsubscribe-auth/ABWF7G6NAKGSE6URRZWNZ3DP65MM7ANCNFSM4HM5WWAQ .
We should ensure that the Bessel evaluations are avoided if Matern is used with smoothness=1.5, see e.g. here https://en.wikipedia.org/wiki/Mat%C3%A9rn_covariance_function#Simplification_for_%CE%BD_half_integer
We need to figure out the equivalent parameterization.