geocompx / geocompr

Geocomputation with R: an open source book
https://r.geocompx.org/

Reviews - edition 2, round 2, part 2 #911

Closed Robinlovelace closed 1 year ago

Robinlovelace commented 1 year ago

Hot on the heels of https://github.com/geocompx/geocompr/issues/898

Clearly it does. We have created a placeholder for another foreword for the 2nd edition. We plan to wait until the manuscript is finished, or at least very close, before tackling this.

We agree this was not well written. Fixed. The relevant section now reads:

The wider context and motivations underlying this book are covered in Chapter 1.

Agreed. The sentence now reads:

The seed_generation tool takes a raster dataset as its first argument (features); optional arguments include band_width, which specifies the size of the initial polygons.

The link2GI package simply makes it easy to initiate a GRASS session from within R without the need to fully grasp how GRASS works in the background. However, for interested readers and GRASS power users, we have added a link to the GRASS help pages, which show step by step how to do so. Please note that we have deleted the appendix showing the same instructions in favor of the GRASS help pages.

In the case of SAGA and GDAL, link2GI searches the system for the corresponding command line utilities and adds their paths to the PATH variable for the current R session. This is unnecessary in the case of qgisprocess, since qgisprocess itself checks, when it is attached, that a working QGIS installation is available on the system.
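To illustrate what "adding a path to PATH for the current R session" means, here is a minimal base R sketch. It is not link2GI's actual implementation, and `saga_dir` is a hypothetical directory used only for illustration:

```r
# Hypothetical directory standing in for a SAGA installation
# (an assumption for illustration, not a real install location)
saga_dir <- file.path(tempdir(), "saga")

# Prepend the directory to PATH; the change only affects the
# current R session and processes started from it
old_path <- Sys.getenv("PATH")
Sys.setenv(PATH = paste(saga_dir, old_path, sep = .Platform$path.sep))

# Check that the directory is now on the search path
path_entries <- strsplit(Sys.getenv("PATH"), .Platform$path.sep, fixed = TRUE)[[1]]
saga_dir %in% path_entries
```

Restarting R, or calling `Sys.setenv(PATH = old_path)`, restores the original value.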

Please see previous reply and reply after the next.

You are right that terra is, of course, predominantly a raster processing package. However, it also supports vector features, and rgrass expects terra::vect() objects as input.

We agree that the section in question is demanding and probably more suitable for experienced (GRASS) GIS users. The reasoning behind this is as follows:

In any case, we now warn the reader before jumping into the code as follows:

Please note that the code instructions in the following paragraphs might be hard to follow when using GRASS for the first time but by running through the code line-by-line and by examining the intermediate results, the reasoning behind it should become even clearer.

Thanks for noting, the description was indeed misleading. We have updated the corresponding sections after thoroughly reviewing what GRASS is actually doing in the background (see also https://github.com/geocompx/geocompr/issues/412).

Agreed. See https://github.com/geocompx/geocompr/commit/97edb6804b84aa73c16181b05819e499cbf747dc for the fix.

My point here was to emphasize that you cannot do statistical inference with ML, but I see why one can misinterpret the sentence. Thinking about it, the inference material does not add much value here and is clearly distracting. Therefore, we have removed it.

Secondly, I agree that the Bayesian approach to modeling is quite interesting. However, it is beyond the scope of the book, and there are already books out there presenting it in much greater detail than this book ever could. Still, we have updated the section on including spatial autocorrelation in models as follows:

Here, when making predictions we neglect spatial autocorrelation since we assume that on average the predictive accuracy remains the same with or without spatial autocorrelation structures. However, it is possible to include spatial autocorrelation structures in models as well as in predictions. Though this is beyond the scope of this book, we give the interested reader some pointers on where to look it up:

  1. Regression kriging combines the predictions of a regression with the kriging of the regression's residuals [@goovaerts_geostatistics_1997; @hengl_practical_2007; @bivand_applied_2013].
  2. One can also add a spatial correlation (dependency) structure to a generalized least squares model [nlme::gls(); @zuur_mixed_2009; @zuur_beginners_2017].
  3. One can also use mixed-effect modeling approaches. Basically, a random effect imposes a dependency structure on the response variable, which in turn allows for observations of one class to be more similar to each other than to those of another class [@zuur_mixed_2009]. Classes can be, for example, bee hives, owl nests, vegetation transects or an altitudinal stratification. This mixed modeling approach assumes normally and independently distributed random intercepts. This can even be extended by using a random intercept that is normal and spatially dependent. For this, however, you will most likely have to resort to Bayesian modeling approaches, since frequentist software tools are rather limited in this respect, especially for more complex models [@blangiardo_spatial_2015; @zuur_beginners_2017].
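As a rough illustration of the second pointer, the sketch below fits a generalized least squares model with an exponential spatial correlation structure using nlme::gls(). The data are simulated and the column names (`x`, `y`, `elev`, `resp`) are assumptions made for this example only; it is not taken from the book.

```r
library(nlme)

set.seed(42)
n <- 150
# Simulated point locations and one predictor (illustrative only)
d <- data.frame(x = runif(n), y = runif(n), elev = rnorm(n))

# Simulate errors whose correlation decays exponentially with distance
dist_mat <- as.matrix(dist(d[, c("x", "y")]))
sigma <- exp(-dist_mat / 0.3)
err <- drop(t(chol(sigma)) %*% rnorm(n))
d$resp <- 1 + 0.5 * d$elev + err

# Model that ignores spatial autocorrelation
m_iid <- gls(resp ~ elev, data = d)

# Model with an exponential correlation structure on the coordinates
m_sp <- gls(resp ~ elev, data = d, correlation = corExp(form = ~ x + y))

# Both models share the same fixed effects, so their REML fits can be compared
AIC(m_iid, m_sp)
```

Other correlation structures shipped with nlme, such as corGaus() or corSpher(), can be swapped in the same way.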

In the statistical learning chapter we focus on performance estimation. The big advantage of using mlr3 is that one can compare dozens or even hundreds of learners, resampling strategies and tasks using the same interface. If the learner in question does not yet exist, it should be fairly easy to implement it in the mlr3extralearners package. Please also refer to our reply to comment Pages 40ff.

I get the point. However, I have to admit that, as far as I know, the term "spatial prediction" is not reserved for modeling techniques that incorporate the spatial structure in one form or another into the model itself. In any case, wherever possible we replaced "spatial prediction" with predictive mapping or spatial distribution.

At the beginning of the spatial CV with mlr3 section we point out why we are going to the trouble of learning the mlr3 syntax as follows:

There are dozens of packages for statistical learning, as described for example in the CRAN machine learning task view. Getting acquainted with each of these packages, including how to undertake cross-validation and hyperparameter tuning, can be a time-consuming process. Comparing model results from different packages can be even more laborious. The mlr3 package and ecosystem were developed to address these issues.

Secondly, spatial cross-validation is by no means a standard tool in R packages; only random cross-validation is. Finally, regarding your suggestion to explain Bayesian spatial models, please refer again to our reply to comment Page 32, 3rd para.
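To make the difference between random and spatial cross-validation concrete, here is a minimal base R sketch of how the two assign observations to folds. It uses k-means clustering on simulated coordinates as one possible way to build spatially compact folds; this is an illustrative assumption, not the mlr3 code used in the book.

```r
set.seed(1)
n <- 200
k <- 5
# Simulated point coordinates (illustrative only)
coords <- data.frame(x = runif(n), y = runif(n))

# Random partitioning: fold membership ignores location
random_fold <- sample(rep(seq_len(k), length.out = n))

# Spatial partitioning: nearby points end up in the same fold
spatial_fold <- kmeans(coords, centers = k, nstart = 10)$cluster

# Mean pairwise distance between points of the same fold, averaged over folds
fold_spread <- function(fold) {
  mean(sapply(split(coords, fold), function(p) mean(dist(p))))
}

fold_spread(random_fold)   # folds span the whole study area
fold_spread(spatial_fold)  # folds are spatially compact
```

With spatially compact folds, test points lie further from the training points, which is why spatial cross-validation tends to give less optimistic performance estimates when the data are spatially autocorrelated.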

Robinlovelace commented 1 year ago
  • [ ] Foreword obviously needs to be rewritten - much has changed in the geospatial R world since 2018

Taking a look at this one...

Robinlovelace commented 1 year ago

This seems fixed to me :tada: