Closed u3ks closed 1 month ago
Just to add some context to this. As we are refactoring momepy, we realised that we rely very often on this internal function, which is fairly generic and should be tied directly to Graph.
The idea behind q limiting the range comes from morphology. We often want to get some sort of spatial average, but given the high likelihood of outliers (think of a church in the middle of a neighborhood), we can't include all the values within each neighborhood.
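To illustrate the outlier problem, here is a minimal sketch in plain NumPy rather than the Graph API. The function name `limited_mean`, the `q` tuple format, and the example heights are hypothetical and only stand in for the idea of averaging values that fall within a percentile range.

```python
import numpy as np


def limited_mean(values, q=(25, 75)):
    """Mean of the values lying within the [q[0], q[1]] percentile range."""
    values = np.asarray(values, dtype=float)
    lower, upper = np.percentile(values, q)
    # Keep only the values inside the percentile bounds (inclusive).
    mask = (values >= lower) & (values <= upper)
    return values[mask].mean()


# Building heights in one neighbourhood; 60.0 plays the role of the church.
heights = [9.0, 10.0, 10.0, 11.0, 12.0, 60.0]
plain = float(np.mean(heights))                     # pulled up by the outlier
limited = float(limited_mean(heights, q=(10, 90)))  # outlier filtered out
```

With these numbers the plain mean is dominated by the single tall building, while the limited mean stays close to the typical height of the rest.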
Attention: Patch coverage is 97.82609% with 2 lines in your changes missing coverage. Please review.
Project coverage is 85.1%. Comparing base (bcabdbc) to head (879f3f5). Report is 18 commits behind head on main.
i think i would call this describe_cardinalities or something, because "Graph.describe() to describe neighbourhood values" implies we're looking at the neighbor values
But this is not describing cardinalities, no? Cardinality is the number of elements in a set. The method describes the distribution of values within a neighbourhood.
oh i see. It was this note on line 2014 that tripped me up:
'Weight values do not affect the calculations, only adjacency does.'
This PR adds a method to the graph API which takes an array of values and calculates descriptive statistics within each neighborhood. Optionally, some neighbors can be filtered out based on the percentiles of the passed values. The supported stats are "count", "mean", "median", "std", "min", "max", "sum", "nunique" and "mode".
The method is similar to .apply, but all values are calculated in one grouping operation and all functions are jitted.
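As a rough illustration of "one grouping operation", the sketch below uses a plain pandas groupby. It is not the implementation in this PR (which is jitted), and the `adjacency` table layout, column names, and values are assumptions made up for the example. Only adjacency is used, no weights, and "mode" is left out for brevity.

```python
import pandas as pd

# Hypothetical adjacency table: one row per (focal, neighbor) pair.
adjacency = pd.DataFrame(
    {"focal": [0, 0, 1, 1, 1, 2], "neighbor": [1, 2, 0, 2, 3, 0]}
)
# Hypothetical values, indexed by observation id.
values = pd.Series([10.0, 20.0, 30.0, 40.0], index=[0, 1, 2, 3])

# Attach each neighbor's value, then compute every statistic in one groupby.
neighbor_values = adjacency["neighbor"].map(values)
stats = neighbor_values.groupby(adjacency["focal"]).agg(
    ["count", "mean", "median", "std", "min", "max", "sum", "nunique"]
)
```

Each row of `stats` then describes the distribution of values among the neighbors of one focal observation, which is the shape of result the description above refers to.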