pysal / esda

statistics and classes for exploratory spatial data analysis
https://pysal.org/esda
BSD 3-Clause "New" or "Revised" License
206 stars 53 forks source link

[WIP] add spatial correlogram function #259

Open knaaptime opened 11 months ago

knaaptime commented 11 months ago

add first draft of correlogram

knaaptime commented 11 months ago

supercedes #253

knaaptime commented 11 months ago

there's nothing harder than pulling off a successful rebase. I'll die on this hill.

jGaboardi commented 11 months ago

there's nothing harder than pulling off a successful rebase. I'll die on this hill. 5589648

knaaptime commented 11 months ago

this should be just about good to go, but it's still failing for stuff like join counts because the class stashes all sorts of things as attriubutes (like the W and its adjacency list) that cant get serialized by joblib, and LOSH/Spatial_Pearson which have a different signature that needs w.sparse

knaaptime commented 11 months ago

Smart. The issue I was runnning into was that w.to_adjlist sorts the neighbors, so once you convert and try to subset the W, it and df are no longer aligned, which is a mess

--

Elijah Knaap, Ph.D. Senior Research Scientist & Associate Director Center for Open Geographical Science San Diego State University knaaptime.com | @knaaptime On Aug 6, 2023 at 1:47 AM -0700, Martin Fleischmann @.***>, wrote:

@martinfleis commented on this pull request. In esda/correlogram.py:

  • with NotImplementedError as e:
  • raise e("Only I, G, and C statistics are currently implemented")
  • if distance_type == "band":
  • W = DistanceBand
  • elif distance_type == "knn":
  • if max(distances) > gdf.shape[0] - 1:
  • with ValueError as e:
  • raise e("max number of neighbors must be less than or equal to n-1")
  • W = KNN
  • else:
  • with NotImplementedError as e:
  • raise e("distance_type must be either band or knn ")
  • should be able to build the tree once and reuse it?

  • but in practice, im not seeing any real difference from starting a new W from scratch each time

    To keep the structure, you can just assign weight=0 to everything above the set distance threshold. — Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.Message ID: @.***>

codecov[bot] commented 8 months ago

Codecov Report

Merging #259 (c643524) into main (f6a8732) will decrease coverage by 0.2%. The diff coverage is 55.0%.

Impacted file tree graph

@@           Coverage Diff           @@
##            main    #259     +/-   ##
=======================================
- Coverage   73.4%   73.2%   -0.2%     
=======================================
  Files         24      25      +1     
  Lines       3285    3325     +40     
  Branches     518     527      +9     
=======================================
+ Hits        2411    2433     +22     
- Misses       708     721     +13     
- Partials     166     171      +5     
Files Coverage Δ
esda/__init__.py 100.0% <100.0%> (ø)
esda/correlogram.py 53.8% <53.8%> (ø)
knaaptime commented 7 months ago

what other tests do folks think are necessary for this?

ljwolf commented 5 months ago

I think at least:

  1. verify that the documented association measures work as expected
  2. verify that a simple callable completes its computation

Then I think it's done!

knaaptime commented 5 months ago

cool. I think 1 is covered by the existing test using I? just need to add a generic callable test for 2?

ljwolf commented 5 months ago

covered by the test using I

Ah ok, in light of the previous discussion, fine! I meant that we should test all of the estimators we specify to ensure correctness.

Yes on the generic callable test!

Then, I think the last bit is just the test failures?