Open lea-c opened 6 months ago
Hi, I've analysed the problem in more depth and it seems to stem from a projection error. The points in val_sp are projected too far away from the points used to fit the variogram, and this may cause a memory overload? That is my hypothesis, at least. Unfortunately, I still have not been able to solve the issue.
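Roughly, this is the kind of check I have in mind for that hypothesis (just a sketch, using the awc_sem and val_sp objects from the minimal example at the end of this thread, and assuming the fitting locations can be pulled out of the gstat object as awc_sem$data[[1]]$data):
library(sp)
# Distance from each prediction point to its nearest fitting location
obs <- awc_sem$data[[1]]$data                        # assumed location of the fitting points inside the gstat object
d <- spDists(coordinates(val_sp), coordinates(obs))  # full distance matrix; fine for a few thousand points
summary(apply(d, 1, min))                            # very large values would support the "too far away" idea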
Fyi, here is a dataset for which the 300 simulations can be carried out: https://filesender.renater.fr/?s=download&token=c23e2359-6334-42ae-b425-2127bd302b63 It contains many more points than val_sp, which is consistent with the number of points in val_sp not being the cause here.
This dataset has the same CRS as val_sp and coordinates of similar magnitude, but their extents do not seem to overlap. Weirdly, when I plot grd first and then val_sp on top, they look like they are in the exact same projection.
grd <- readRDS("./fine_data.Rdata")
plot(grd)
plot(val_sp, add = TRUE)
But when I plot val_sp before grd, an inconsistency appears, and the plots don't overlap.
plot(val_sp)
plot(grd, add = TRUE)
I've dealt with many CRS issues before but I cannot figure this one out. I've reprojected the datasets over and over to make sure the CRSs match, but without success. Could you just help me make sure that val_sp and grd align? This would certainly solve the issue. Thank you very much for your help! And again, I remain available if you have any questions.
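Concretely, the kind of alignment check I keep trying looks like this (a sketch, assuming grd and val_sp as loaded above; depending on the sp version, spTransform for Spatial* objects may need rgdal):
library(sp)
# Reproject the points into the grid's CRS and compare the extents numerically
val_sp_aligned <- spTransform(val_sp, CRS(proj4string(grd)))
bbox(grd)
bbox(val_sp_aligned)
# Overlay the reprojected points on the grid
plot(grd)
plot(val_sp_aligned, add = TRUE)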
Thanks for the update; I've looked at the issue and can reproduce the error you're getting. I can see where it happens when run in a debugger, but haven't been able to identify its cause.
Thank you very much for looking into this problem! Here is another update: actually, I don't think this is a CRS issue. I found a dataset very similar to mine, with the same CRS, which does not cause any issue. You can download the new data here... https://filesender.renater.fr/?s=download&token=7bc3178c-2c7a-4fd7-89b4-b15e2afcb419 ...and run the following code:
library(sp)
library(gstat)
library(raster)   # extent() below comes from raster
# Load variogram
variogram <- readRDS("./variogram.Rdata")
# Load the two datasets
spatial_points_problem <- readRDS("./spatial_points_problem.Rdata")
spatial_points_working <- readRDS("./spatial_points_working.Rdata")
# Compare CRS
proj4string(spatial_points_problem)
proj4string(spatial_points_working)
# Compare extents
plot(extent(spatial_points_problem@bbox), col = "red")
plot(extent(spatial_points_working@bbox), col = "blue", add = TRUE)
# Check for duplicates
zerodist(spatial_points_problem, zero = 0.9)   # pairs closer than 0.9 map units
zerodist(spatial_points_working, zero = 0.9)
# Compare number of points - the only difference here but again, it should not cause any issue
print(nrow(spatial_points_problem))
print(nrow(spatial_points_working))
# Plot the two datasets
plot(spatial_points_problem, col = "red")
plot(spatial_points_working, col = "blue", add = TRUE)
# Works (a warning appears because I had to modify the dataset a bit, but I don't think that matters)
gstat:::predict.gstat(variogram, newdata = spatial_points_working, nsim=10)
# Crashes (also with various maxdist and nmax values)
gstat:::predict.gstat(variogram, newdata = spatial_points_problem, nsim=10)
This is very weird...
Hi again, I tried to go further in diagnosing the problem and identified the points that may be responsible for the error. Below is a figure where the points returned by
raster::intersect(spatial_points_problem, spatial_points_working)
are shown in green and the remaining points of spatial_points_problem in red. The predict.gstat function runs fine with nsim > 0 on the green points; however, it crashes as soon as a red point is added.
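For reference, the figure can be reproduced roughly like this (a sketch; raster is only used here for its intersect() method on the two point sets):
library(sp)
library(raster)
# Green: points present in both datasets; red: remaining points of spatial_points_problem
common <- raster::intersect(spatial_points_problem, spatial_points_working)
plot(spatial_points_problem, col = "red")
plot(common, col = "green", add = TRUE)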
Also, I have tried to debug the function myself but I have not been able to run it outside the package, because of the C calls.
I hope this additional information is helpful.
Best regards
This might be a clue:
> zerodist2(spatial_points_problem, variogram$data$awc_0_30_m$data)
     [,1] [,2]
[1,]  519 1711
> zerodist2(spatial_points_working, variogram$data$awc_0_30_m$data)
     [,1] [,2]
In other words, point 519 of spatial_points_problem coincides with observation location 1711 of the data used to fit the model, whereas spatial_points_working has no such coincidence. Dropping that point makes the simulation run:
g = gstat:::predict.gstat(variogram, newdata = spatial_points_problem[-519,], nsim=10)
works!
Sorry for my late reply. It works, thank you so much for your time and effort!
Thanks; note that this is only a work-around: the bug itself is still present.
Yes, I've noticed: for now I have just removed tens of points from my dataset (which is larger than the one I sent you). I would be curious to understand what is going on.
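In case it helps anyone else, this is roughly what removing those points looks like in code (just a sketch, assuming as above that the observation locations used to fit the model sit in variogram$data$awc_0_30_m$data; it only sidesteps the bug, it does not fix it):
library(sp)
library(gstat)
# Drop prediction locations that coincide with observation locations before simulating
obs <- variogram$data$awc_0_30_m$data
dup <- zerodist2(spatial_points_problem, obs)   # column 1: offending indices in the prediction set
newdata_clean <- if (nrow(dup) > 0) spatial_points_problem[-dup[, 1], ] else spatial_points_problem
g <- gstat:::predict.gstat(variogram, newdata = newdata_clean, nsim = 10)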
Hi,
I've been encountering a problem I don't understand with the predict.gstat function.
I would like to use it to interpolate residuals with a cokriging model. Running the function with nsim=0 works fine, but as soon as I ask for simulations (even a small number like 10, although I will eventually need nsim=300) it systematically causes the following error, even on computers with a lot of computing power and memory:
*** caught segfault *** address 0x38, cause 'memory not mapped'
Could you help me identify the issue? I have checked for duplicate coordinates with zerodist and for mismatched CRSs, and I have tried many values of maxdist and nmax as well as fewer points, so I'm at a loss here.
Below is a minimal example, and here is the link to download the data: https://filesender.renater.fr/?s=download&token=911d7d4c-6bf5-4bec-a316-8adc5417b9d8 Thank you very much in advance. I remain available for any additional information.
library(sp)
library(gstat)
# Load the fitted gstat (cokriging) object, saved as cv.fit
load("./var_awc.RData")
awc_sem <- cv.fit
# Load data to predict
val_sp <- readRDS("./point_data.Rdata")
# Works fine (nsim = 0, i.e. no simulations)
gstat:::predict.gstat(awc_sem, newdata = val_sp)
# Crashes as soon as simulations are requested
gstat:::predict.gstat(awc_sem, newdata = val_sp, nsim=10, maxdist=500)
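For completeness, the checks mentioned above, written out against this minimal example (a sketch; the awc_sem$data[[1]]$data path assumes the fitting locations are stored in the gstat object the same way as variogram$data$awc_0_30_m$data earlier in this thread):
library(sp)
# Duplicate coordinates within the prediction locations
zerodist(val_sp)
# CRS of the prediction locations vs. the locations used to fit the model
proj4string(val_sp)
proj4string(awc_sem$data[[1]]$data)   # assumed structure of the gstat object
# Prediction locations that coincide with fitting locations
zerodist2(val_sp, awc_sem$data[[1]]$data)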