Closed JosiahParry closed 2 years ago
I've adjusted the localC()
and localC_perm()
functions to support a multivariate local geary measure.
This PR is now ready for review!
Thanks, very useful. I'll get to this maybe next week. I have other work in progress on permutations, which are generally a bit odd in PySAL and GeoDa (I think they imply a two sided 10% not 5% because the fold the ends from [0, 1] to [0, 0.5], and take 0.05 as the 5% cutoff, but I'm still checking slowly). Please also de-roxygenize the code and create a manual Rd file, spdep will never be roxygenized. Consider also using the African conflicts data set (in spData with neighbour object), and maybe a one-sided formula for the multivariate case.
Sounds good!
Given your feedback on the multivariate case, would you be okay with making localC
an S3 generic with the methods
Regarding the calculated p-values, would it make more sense to take advantage of the ecdf? For example an implementation could look like
library(spdep)
alternative <- "two.sided"
listw <- nb2listw(spData::africa.rook.nb)
x <- spData::afcon$totcon
obs <- localG(x, listw)
reps <- replicate(299, localG(sample(x), listw))
ecdfs <- apply(reps, 1, ecdf)
left_p <- mapply(function(ecdf, ob) ecdf(ob), ecdfs, obs)
switch(alternative,
less = left_p,
greater = 1 - left_p,
two.sided = pmin(1 - left_p, left_p) * 2)
I might even argue that moving towards https://doi.org/10.1016/j.tree.2021.10.009 and the legend on p.4 could be a way forward, or using the alpha channel for an HH/LL/LH/HL cluster map might be worth looking at. I'm just thinking aloud; including a shiny FDR style chooser map, default no multiple comparisons corrections, then the p.adjust()
alternatives? Might someone be interested - is it wporth pursuing? As soon as the split with spatialreg finalizes (could be very soon), spdep needs moving into the current century, doesn't it?
Thank you for sharing the article! I think this could be an interesting idea and absolutely worth exploring. I unfortunately am not as well versed in p adjustment methods and how they relate to the false discovery rate. If you have any suggested readings, I'll be willing to dive into that.
Regarding the PR: I've implemented the changes requested. I've tested them as well (what about using testthat
for package test in the modernization effort?) and the results from each of the methods match each other.
Changes submitted:
alternative
to localC_perm()
with possible values "less"
and "greater"
.localC.R
spData
I created a localC
branch in from main
with the aim of merging from there when checked. Did you addd an Rd file?
The errors are self-evident, missing NAMESPACE entries, so I'll merge the PR so far, work on the localC
branch, then merge that branch to main
if successful.
main
for further analysis and testing, including vulnerabilities for large counts of neighbours and large numbers of observations, as well as examples/testing across methods.Appreciate your assistance and guidance with this!
Thanks @rsbivand :)
@geofly1985 that change is not part of this pull request. Can you please make an issue instead?
This pull request creates two functions
localC
andlocalC_perm()
. This implements the local geary statistics as described by Anselin 1995.I confirmed the calculation of the Local Geary with
rgeoda
.localC_perm()
calculates a pseudo p value as described by the Geoda Center (source). To note the pseudo p values didnt as neatly align with geoda. However I am rather (though no completely) confident in my calculation of the p values.I've also done my best to align the output object with other functions that have been implemented by spdep utilizing attributes.