r-spatial / spdep

Spatial Dependence: Weighting Schemes and Statistics
https://r-spatial.github.io/spdep/

Factor perm #99

Closed rsbivand closed 1 year ago

rsbivand commented 1 year ago

@JosiahParry please review the changes to moran_bv() and localmoran_bv(). The extraction of the shared conditional permutation elements to the head of R/lisa_perm.R seems to have gone OK. The local join count sketches are much poorer than the existing global measures: they should take factors rather than logical/integer vectors, and should accommodate a choice of level rather than impose 0/1 only. Please see https://doi.org/10.1016/j.jas.2020.105306 and https://github.com/rsbivand/LICD_article, based on and extending https://doi.org/10.1016/j.spasta.2017.03.003. If we would like to follow up the LICD approach, I could ask my co-authors for help, as we had intended to work up the prototypes used in the 2017 and 2021 articles for addition to spdep.

JosiahParry commented 1 year ago

Thanks! I'll do my best to review this today and tomorrow morning.

JosiahParry commented 1 year ago

@rsbivand, can you confirm the permutation approach taken in this PR? My personal notes are below.

rsbivand commented 1 year ago

> parallel_setup() is used in any function that uses conditional permutation for p-value calculation

No, it just determines whether parallel processing is used and, if so, how (snow on Windows, multicore elsewhere, and how many cores), based on local option settings in the package.

> run_perm is a generalization that lets you run a function over indexes using bootstrap resampling

Yes and no; the resampling is more like a permutation bootstrap than a parametric bootstrap. Its arguments include the function to be called and an environment containing the required objects (environments are passed by reference, which avoids copying and so is faster than regular pass-by-value).
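
To make the distinction concrete, here is a minimal sketch in Python, chosen only for illustration; it is not spdep code, and the function names permutation_draw and parametric_draw are hypothetical. A permutation-style draw reuses the observed values, while a parametric draw simulates new values from a fitted distribution (assumed normal here).

```python
import random
import statistics


def permutation_draw(values, k, rng):
    """Draw k of the observed values without replacement (permutation-style)."""
    return rng.sample(values, k)


def parametric_draw(values, k, rng):
    """Draw k new values from a normal distribution fitted to the data.

    The normal model is an assumption made for this illustration only.
    """
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)
    return [rng.gauss(mu, sigma) for _ in range(k)]


rng = random.Random(1)
observed = [2.0, 3.5, 1.0, 4.2, 2.8, 3.1]

perm = permutation_draw(observed, 3, rng)  # always a subset of `observed`
para = parametric_draw(observed, 3, rng)   # new values, generally not in `observed`
```

Every value in perm is one of the observed values, which is what makes the reference distribution conditional on the data actually seen.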

> probs_lut creates a probability lookup table based on the number of permutations done, which helps calculate p-values

Yes, in the [0,1] interval from punif().
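
As a rough illustration of the lookup-table idea, the sketch below (Python, not spdep code) precomputes one probability per possible rank and then indexes it by the rank of the observed statistic among the simulated ones. The r / (nsim + 1) form, the one-sided "greater" alternative, and the names rank_probabilities and pseudo_p_value are assumptions made for this example; the thread only states that the table holds values in [0,1].

```python
def rank_probabilities(nsim):
    """Return a table with entry r - 1 equal to r / (nsim + 1), r = 1..nsim + 1."""
    return [r / (nsim + 1) for r in range(1, nsim + 2)]


def pseudo_p_value(observed, simulated, table):
    """One-sided ('greater') pseudo p-value looked up by rank.

    The rank counts the observed value itself plus every simulated
    value that is at least as large.
    """
    rank = 1 + sum(1 for s in simulated if s >= observed)
    return table[rank - 1]


table = rank_probabilities(99)                 # 100 possible ranks
p_extreme = pseudo_p_value(5.0, [1.0] * 99, table)  # observed exceeds every draw
p_typical = pseudo_p_value(1.0, [1.0] * 99, table)  # observed ties with every draw
```

Because the table depends only on the number of simulations, it can be built once and reused for every observation.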

> functions for each measure are created such as perm[measure]_int() which takes an index position and can be applied across threads using {parallel}

Yes. If the local options are set, it will use parallel; see inst/tinytest/test_lisa_perm.R for usage of the local options.

Matrix multiplication goes straight to BLAS compiled code, and we know that the ordering of the samples makes no difference for the draw, so using lag.listw() is wasteful (and we are only looking at one observation in each call to the *_int() function).
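
The per-observation step can be sketched as follows, in Python for illustration only; this is not the spdep implementation, and conditional_lags is a hypothetical name. For observation i, its own value is held fixed, k values are drawn without replacement from the remaining observations, and the simulated spatial lag is the dot product of the neighbour weights with the drawn values. Since each draw is random, which drawn value is paired with which neighbour does not affect the distribution of the result.

```python
import random


def conditional_lags(z, i, weights, nsim, rng):
    """Simulate nsim spatial lags for observation i under conditional permutation."""
    others = z[:i] + z[i + 1:]  # z[i] is held fixed, so it is never drawn
    k = len(weights)            # number of neighbours of observation i
    lags = []
    for _ in range(nsim):
        draw = rng.sample(others, k)  # k values without replacement
        lags.append(sum(w * v for w, v in zip(weights, draw)))
    return lags


rng = random.Random(42)
z = [0.5, -1.2, 0.3, 1.9, -0.7, -0.8]  # illustrative centred values
weights = [1 / 3, 1 / 3, 1 / 3]        # row-standardised weights, 3 neighbours
lags = conditional_lags(z, 0, weights, 199, rng)

# A local Moran-like statistic for each simulation: own value times simulated lag.
local_stats = [z[0] * lag for lag in lags]
```

Stacking the nsim draws into an nsim-by-k matrix turns the loop into a single matrix-vector product with the weights, which is the kind of operation that can be handed to BLAS.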