pysal / spaghetti

SPAtial GrapHs: nETworks, Topology, & Inference
http://pysal.org/spaghetti/
BSD 3-Clause "New" or "Revised" License

Interpretation of Local Moran's I #527

Open anitagraser opened 3 years ago

anitagraser commented 3 years ago

I'm having trouble wrapping my head around the LISA results. I've created a point pattern which is very localized. Large parts of the network are not covered by any points. However, LL arcs only appear here and there - not nearly as contiguously as I would expect. The same (but maybe not as noticeable) seems to apply to HH arcs.

Why do the results look like this? If there is some statistical reason for this, I'd appreciate any pointers to related reading material.

Here's the full notebook to reproduce this effect: https://github.com/anitagraser/spaghetti/blob/ab16652b78c5bc576a9c3377563510dfc152cf21/notebooks/debug-counts-on-split-arcs.ipynb

image

jGaboardi commented 3 years ago

@anitagraser Thanks for this report and the notebook. It does seem that this boils down to the issue you noticed in #526, and there may be something funky going on in Network.split_arcs. Once #526 gets worked out, we can revisit this.

anitagraser commented 3 years ago

@jGaboardi thank you! As far as I can tell, split_arcs is not used (at least not called directly by the user) in this case.

jGaboardi commented 3 years ago

> As far as I can tell, split_arcs is not used (at least not called directly by the user) in this case.

Indeed, you are correct. I will give this my fullest attention as soon as I can. (PySAL is not my day job... unfortunately... 😆). In the meantime, would you be able to share the clustered point pattern you created for the debug notebook?

anitagraser commented 3 years ago

> would you be able to share the clustered point pattern you created for the debug notebook?

Sure, I've added it to my fork: https://github.com/anitagraser/spaghetti/tree/master/notebooks/data

jGaboardi commented 3 years ago

Anita, you were correct: the issue with arc splitting (#526 --> #535) has nothing to do with the result you are seeing here. In fact, after further review, the LISA plots do appear to be a valid result. Perhaps @slumnitz can chime in here with a better explanation of the LISA plotting and interpretation (also see Exploratory Analysis of Spatial Data: Visualizing Spatial Autocorrelation with splot and esda). As another example, see this gist. There are two large synthetic clusters in it, and I also increased the permutations (see cell 9).

I also plan to add more to the network spatial autocorrelation notebook when I get a chance.

anitagraser commented 3 years ago

Thanks James! Indeed, it would be interesting to discuss the consequences this has for the interpretation of LISA results for point patterns. If a count of zero points per edge is too close to the expected value to be reliably classified as a cold spot, then it is surprising to see LL edges scattered all over. Any hints would be highly appreciated. (cc @slumnitz)
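As a toy illustration of the question above, the following sketch estimates a conditional-permutation pseudo p-value for a zero-count segment whose neighbors all have zero counts. It is not taken from the notebook: the counts, the neighbor numbers `k`, and the helper name `ll_pseudo_p` are all assumptions made for illustration, and the folding of the p-value only mimics the general approach of permutation-based LISA rather than reproducing any library's code. It shows one mechanism that might make LL labels patchy: when many segments are empty, drawing all-empty neighbors by chance is common, so an empty segment with few neighbors is hard to distinguish from randomness, while one with many neighbors is easier.

```python
import random


def ll_pseudo_p(counts, k, permutations=999, seed=0):
    """Pseudo p-value for a zero-count segment with k zero-count neighbors.

    The focal value is held fixed and k neighbor values are redrawn from
    the remaining segments. Because the focal deviation is negative and
    the observed lag is the smallest possible, a permuted local Moran's I
    is at least as large as the observed one exactly when all k drawn
    neighbors are also zero.
    """
    rng = random.Random(seed)
    focal = counts.index(0)  # any zero-count segment serves as the focal one
    others = counts[:focal] + counts[focal + 1:]
    as_extreme = sum(
        1 for _ in range(permutations) if sum(rng.sample(others, k)) == 0
    )
    # Fold to the smaller tail, as is common for permutation-based LISA.
    larger = min(as_extreme, permutations - as_extreme)
    return (larger + 1) / (permutations + 1)


# Synthetic data (assumption): 30 empty segments and 30 with 3 points each.
counts = [0] * 30 + [3] * 30

p_few = ll_pseudo_p(counts, k=2)   # e.g. a segment in the middle of a long street
p_many = ll_pseudo_p(counts, k=6)  # e.g. a segment touching busy intersections
print(f"k=2: p={p_few:.3f}   k=6: p={p_many:.3f}")
```

With these made-up numbers, the segment with two neighbors stays well above a 0.05 threshold while the one with six neighbors falls below it, even though both sit in equally empty surroundings. Whether this effect explains the plots in this issue has not been checked.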

anitagraser commented 3 years ago

This also relates to the following new discussion: https://github.com/pysal/spaghetti/discussions/547

slumnitz commented 3 years ago

Hello @jGaboardi and @anitagraser, sorry for the long silence. I am happy to give this a look but need a couple of days to get into the problem. A couple of questions to better understand what is going on here: @anitagraser, can you point me to the notebook where you do this analysis? I'd like to understand your data and your way of building weights better first. Also, which output would you be expecting?

jGaboardi commented 3 years ago

Since this is more of a discussion than an actual bug, I vote we move this to the Discussions board. Any objections @anitagraser @slumnitz?

slumnitz commented 3 years ago

@jGaboardi and @anitagraser, I am not sure yet if this is a bug or not. splot.esda.lisa_cluster has only ever been tested on polygons, not lines. I'd actually be curious to see the Moran_scatterplot that goes with the indicated plot. I'd like to know if the LL street parts have different values from the ns (not significant) street parts. If they do not have different values, there might be a bug in splot assigning bins/labels/colours when the gdf does not contain polygons. If there are different values, we should investigate why they are different, i.e. assigning points to lines, calculating weights, etc. But intriguing :)
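The check described above could be sketched roughly as follows: label each segment from its value, its spatial lag, and its pseudo p-value, then group the raw values by label so that the LL and ns groups can be compared. The helper names `label_segments` and `values_by_label` are hypothetical, the numbers are invented, and row-standardized weights are assumed, so that a lag above the overall mean corresponds to a positive lag of the standardized values. The quadrant logic follows the usual HH/LH/LL/HL convention, with a segment counted as significant only when its p-value is below `alpha`.

```python
from collections import defaultdict


def label_segments(values, lags, p_values, alpha=0.05):
    """Assign HH/LH/LL/HL to significant segments and "ns" to the rest."""
    mean = sum(values) / len(values)
    labels = []
    for value, lag, p in zip(values, lags, p_values):
        if p >= alpha:
            labels.append("ns")  # not significant at the chosen level
        elif value > mean:
            labels.append("HH" if lag > mean else "HL")
        else:
            labels.append("LH" if lag > mean else "LL")
    return labels


def values_by_label(values, labels):
    """Group raw segment values by their LISA label for comparison."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    return dict(groups)


# Invented example (assumption): five segments with made-up lags and p-values.
values = [0, 0, 0, 6, 8]
lags = [0.0, 0.0, 3.0, 7.0, 6.0]
p_values = [0.01, 0.30, 0.20, 0.02, 0.04]

labels = label_segments(values, lags, p_values)
groups = values_by_label(values, labels)
print(labels)
print(groups)
```

Because the p-values here are made up, the sketch says nothing about the real data. It only shows how the comparison could be set up once the values, lags, and p-values from the actual analysis are available.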

jGaboardi commented 3 years ago

Ah, this is very interesting! I will leave this as an issue for now then. Thanks for looking into this @slumnitz!

slumnitz commented 3 years ago

To add, though this might not be the issue here: so far 'splot.esda.lisa_cluster' has only been tested when the weights, Moran statistics, etc. were calculated from the same gdf that is used for plotting. So if the weights/Moran statistics are calculated for points and then plotted with lines, the assignment will likely not match.
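A quick way to rule out such a mismatch is to compare, position by position, the ids the statistics were computed for with the ids of the rows being plotted. The sketch below uses plain lists of arc ids (tuples of vertex ids) as stand-ins. The helper name `find_misaligned` is hypothetical, and in a real session the two lists would have to come from the weights object (for example its id order) and from the index or id column of the plotted gdf.

```python
from itertools import zip_longest


def find_misaligned(stat_ids, plot_ids):
    """Return (position, stat_id, plot_id) wherever the two orders differ.

    zip_longest pads the shorter list with None, so a length mismatch is
    reported as well.
    """
    return [
        (pos, stat_id, plot_id)
        for pos, (stat_id, plot_id) in enumerate(zip_longest(stat_ids, plot_ids))
        if stat_id != plot_id
    ]


# Stand-in ids (assumption): three arcs, with the last two swapped for plotting.
stat_ids = [(0, 1), (1, 2), (2, 3)]
plot_ids = [(0, 1), (2, 3), (1, 2)]

mismatches = find_misaligned(stat_ids, plot_ids)
print(mismatches)
```

An empty result means every label would land on the row it was computed for. Any entry in the list points to a row that would be coloured with another segment's result.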

anitagraser commented 3 years ago

@slumnitz thank you for your insights! The plots above are taken from the following notebook: https://github.com/anitagraser/spaghetti/blob/master/notebooks/debug-counts-on-split-arcs.ipynb

I expected that the LL edges would be rather contiguous, instead of being sprinkled all over the place. I actually can generate something that looks more like what I expected, by splitting the network into 200ft segments:

image

anitagraser commented 3 years ago

@slumnitz have you had a chance to look at the example above to determine if it's a bug or expected behavior?

jGaboardi commented 3 years ago

@anitagraser I closed this too early! Sorry!