C-C-A-A / CologneProtocol-R

Research compedium for "Approaching Prehistoric Demography: Proxies, Scales and Scope of the Cologne Protocol in European contexts" including a manual of the Cologne Protocol
8 stars 4 forks source link

Kriging #5

Open Mono2 opened 4 years ago

Mono2 commented 4 years ago

How to deal with the following settings of the krige() function? nmax, nmin and maxdist?

Mono2 commented 4 years ago

Problem solved: An advice is given in the publication (nmin = 3, nmax = 10, maxdist = boundingbox diagonal/2) nmin and nmax have been changed. maxdist has to be changed (ideally it is automated)

Mono2 commented 4 years ago

If the above mentioned values are used the kriging results seem to be weird, at least for the Rössen data. I will prepare an image illustrating the problem. Anyhow, the values are still included!

Mono2 commented 4 years ago

Here is a picture of the kriging result of the sample data:

edge_effect

The picture shows the 3.5 km Isolinie calculated with the R script( blue) and the 3.5 km isoline of Wendt et al. (CRC database) in red. On the left side of the picture you can see a weird edge effect resulting in a large increase in included area (blue area). This is somehow related to nmin, nmax and maxdist in the krige function. If these values are not defined, this edge effect does not occur.

However, in the rest of the working area both 3.5 isolines fit reasonably well!

VicluUiB commented 4 years ago

Hi Mono,

I see that this issue was raised a little while ago so I might be a little late. I had a discussion with Robin about the fact that the kriging results (as far as my own data sets display) tend to exhibit nasty "spikes" (for lack of a better word) in various directions. I will try and summarize some examples. But just out of curiosity I have two questions. Do you know if the isolines by wendt et al are the original ones, or might they have been edited at a later stage with the help of the variance raster? Second, do you have any examples to show what the results look like in your case when you leave the specifications for nmin, nmax and maxdist?

Also, would you be able to show what the script should look like when excluding the parameters? As of now, I can run "00_LEC" without any problems when simply removing the values for nmin and nmax. However, I get an error during "01_KrigingR" when I simply delete the "bbox_diag/2," The error reads: "Error: unexpected '=' in: " maxdist = debug.level =""

Mono2 commented 4 years ago

Hi, I hope you are doing well! Unfortunately, I have no information about possible post-processing of the isoline shape-file by Wendt et al. But I do know that they did face this problem.

Basically, if you specify nmin, nmax and maxdist in the gstat::krige function, you choose to do a so-called local kriging. That means, as far as I understand, that for the prediction of a radius of LEC at a specific location a minimum of nmin and a maximum of nmax observations which have a maximum distance to that location of maxdist are used. If you do not use local kriging, the isolines will "grow like an onion" after most of your sites are within the isoline. This will avoid "edge effects" as shown above (but it increases the time of calculation). However there may be a certain isoline, the first one after 100 % of your sites are included, which will show a massive increase in area. This was at least the case with my data. But this is not a problem and it does not affect the results.

Technically, you should change within 01_Kriging.R the lines 51-58 to:

LEC_kriged <- gstat::krige(radiusLEC~1,
                           vertices_spdf,
                           grid,
                           model = vertices_vario_fit,
                           # nmin = your_nmin,
                           # nmax = your_nmax,
                           # maxdist = bbox_diag/2,
                           debug.level = -1)

So you add a # before nmin, nmax and maxdist. You don't have to change anything in 00_LEC.R.

The "spiky" appearance of your isolines may also related to a low number of sites. Try to plot selected isolines together with the voronoi diagram as I have done it below:

Spike

It can easily be seen that the "growing" of isolines is determined by the voronoi diagram. See for example the spike on the right or the border between two isolines on the left. Does your results look like this? Let me know if this was a solution to your "spikes" :-)

@RobinCGN Maybe we should add the option of using local kriging or not? Though local kriging is the default of the Cologne Protocol. What do you think?

VicluUiB commented 4 years ago

Hi Manuel, thanks for getting back to me this quick.

I gave the local kriging a try. Judging from the Kriging raster and the isolines, this already brought with it a large improvement. The kriging raster is much more smooth now and the isolines both taper around the site locations much closer, and also they do not stretch out into areas with high kriging variance. For example, if I generate the isolines in QGIS and display them as overlapping with the kriging variance raster then most, if not all isolines, do not go beyond the 2nd quantile of the variance raster.

However, this has brought with it the problem that I can no longer use the plot "increase of area per equidistance" to determine what might be the ODI. This is because the plot is almost completely flat, and it remains flat until it gets to the isoline that encloses 100% of the sites for which there is then a massive spike.

I dont have time to look into this more today, but I will make sure to provide some screenshots of what it looks like in the coming days.

Best,

Mono2 commented 4 years ago

Yeah you have to change the plot range. In 00_LEC.R restrict your_limit and your_breaks to isolines of interest, which do not include the one with the heavy increase of area. Additionally, add the following to the code of the 03_Visualisation_Export.R-file: ggplot2::scale_y_continuous(limit = SOME_NUMBER, breaks = SOME_OTHER_NUMBER). This will change the plot range of the y-axis. Replace SOME_NUMBER by an appropriate value for maximum increase of area (you may have a look at the Isolines_Stat.csv to choose the number. SOME_OTHER_NUMBER will change tick marks of the y-axis

# Increase of area
plot_incr_area <-
Isolines_stats %>%
  ggplot2::ggplot(ggplot2::aes(x = km_isoline, y = increase_Area)) +
  ggplot2::geom_line() +
  ggplot2::geom_point() +
  ggplot2::labs(title = "Increase of area per equidistance",
                x = "[km]",
                y = "Area [km²]") +
  ggplot2::scale_x_continuous(limit = your_limit,
                              breaks = your_breaks) +
  ggplot2::scale_y_continuous(limit = SOME_NUMBER,
                              breaks = SOME_OTHER_NUMBER) +
  ggplot2::theme_bw() +
  ggplot2::theme(plot.title = ggplot2::element_text(hjust = 0.5),
                 axis.text.x = ggplot2::element_text(angle = 90))
RobinCGN commented 4 years ago

Just to check, if i am getting this right:

Local Kriging (default in the CP)

LEC_kriged <- gstat::krige(radiusLEC~1,
                           vertices_spdf,
                           grid,
                           model = vertices_vario_fit,
                           nmin = your_nmin,
                           nmax = your_nmax,
                           maxdist = bbox_diag/2,
                           debug.level = -1)

Nmin, nmax, maxdist are defined!

Ordinary Kriging

LEC_kriged <- gstat::krige(radiusLEC~1,
                           vertices_spdf,
                           grid,
                           model = vertices_vario_fit,
                           # nmin = your_nmin,
                           # nmax = your_nmax,
                           # maxdist = bbox_diag/2,
                           debug.level = -1)