georust / geo

Geospatial primitives and algorithms for Rust
https://crates.io/crates/geo

Add MeanCenter trait #1166

Open JosiahParry opened 6 months ago

JosiahParry commented 6 months ago

This PR adds a new trait, MeanCenter, which calculates the Euclidean mean center of a set of coordinates. It is implemented for all geometry types.

lnicola commented 6 months ago

I think this is normally called a centroid.

JosiahParry commented 6 months ago

There are subtle differences between centroids and the mean center of point sets. I've added some tests that compare the centroid for each geometry with the mean center. There are differences, notably for polygons.
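To make the polygon case concrete, here is a minimal sketch in plain Rust, with no dependency on geo. The function names `vertex_mean` and `area_centroid` are hypothetical and are not the trait or tests from this PR; the sketch only assumes that "mean center" means the arithmetic mean of a geometry's coordinates, and compares that with the standard area-weighted (shoelace) centroid.

```rust
/// Arithmetic mean of a set of coordinates; `None` for an empty slice.
/// (Hypothetical helper, not part of geo.)
fn vertex_mean(coords: &[(f64, f64)]) -> Option<(f64, f64)> {
    if coords.is_empty() {
        return None;
    }
    let n = coords.len() as f64;
    let (sx, sy) = coords
        .iter()
        .fold((0.0, 0.0), |(sx, sy), &(x, y)| (sx + x, sy + y));
    Some((sx / n, sy / n))
}

/// Area-weighted centroid of a simple polygon ring via the shoelace formula.
/// The ring is given without a repeated closing vertex.
/// (Hypothetical helper, not part of geo.)
fn area_centroid(ring: &[(f64, f64)]) -> Option<(f64, f64)> {
    if ring.len() < 3 {
        return None;
    }
    let mut twice_area = 0.0;
    let mut cx = 0.0;
    let mut cy = 0.0;
    for i in 0..ring.len() {
        let (x0, y0) = ring[i];
        let (x1, y1) = ring[(i + 1) % ring.len()];
        let cross = x0 * y1 - x1 * y0;
        twice_area += cross;
        cx += (x0 + x1) * cross;
        cy += (y0 + y1) * cross;
    }
    if twice_area == 0.0 {
        return None; // degenerate ring with no area
    }
    // Centroid = sum / (6 * area) = sum / (3 * twice_area).
    Some((cx / (3.0 * twice_area), cy / (3.0 * twice_area)))
}
```

For a 2 x 2 square with one extra vertex on its bottom edge, `[(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]`, the area centroid stays at the middle of the square, while the vertex mean is pulled toward the edge that carries the extra vertex.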

Here's a blog post that (briefly) describes its utility. I was planning on adding a fn weighted_mean_center() as well, as a follow-up. I can add that to this PR if you think it would make the trait more motivating.

urschrei commented 6 months ago

> There are subtle differences between centroids and the mean center of point sets. I've added some tests that compare the centroid for each geometry with the mean center. There are differences, notably for polygons.
>
> Here's a blog post that (briefly) describes its utility. I was planning on adding a fn weighted_mean_center() as well, as a follow-up. I can add that to this PR if you think it would make the trait more motivating.

It's obviously an automatic approval from me because the blog post uses Irish CSO data, but I'd love to see the weighted mean centre method too.

JosiahParry commented 6 months ago

🤣 heard. I've already pushed the weighted_mean_center() method and doc changes. I just need to add tests which I'll find time for tonight or tomorrow. 🫡

JosiahParry commented 6 months ago

> I think this is normally called a centroid.

Yup, you're right! OOPS! 🙈

Where this differs is the mean center for MultiLineString, MultiPolygon, and GeometryCollection. Since this is a point pattern technique, there must be one point per feature, so each member geometry is represented by its centroid. The mean center (centroid) of those points is then used.
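The two-step reduction described above can be sketched in plain Rust for a MultiLineString-like input. This is an illustration under stated assumptions, not the implementation in this PR: `Coord`, `line_centroid`, and `mean_center_of_parts` are hypothetical names, each line's centroid is taken as the length-weighted mean of its segment midpoints, and parts with zero length are simply skipped.

```rust
type Coord = (f64, f64);

/// Length-weighted centroid of one polyline; `None` if it has no length.
/// (Hypothetical helper, not part of geo.)
fn line_centroid(line: &[Coord]) -> Option<Coord> {
    let (mut len, mut sx, mut sy) = (0.0, 0.0, 0.0);
    for seg in line.windows(2) {
        let ((x0, y0), (x1, y1)) = (seg[0], seg[1]);
        let seg_len = (x1 - x0).hypot(y1 - y0);
        len += seg_len;
        // Each segment contributes its midpoint, weighted by its length.
        sx += seg_len * (x0 + x1) / 2.0;
        sy += seg_len * (y0 + y1) / 2.0;
    }
    if len > 0.0 {
        Some((sx / len, sy / len))
    } else {
        None
    }
}

/// One centroid per part, then the unweighted mean of those centroids.
/// Skipping parts without a centroid is an assumption of this sketch.
fn mean_center_of_parts(lines: &[Vec<Coord>]) -> Option<Coord> {
    let centroids: Vec<Coord> = lines.iter().filter_map(|l| line_centroid(l)).collect();
    if centroids.is_empty() {
        return None;
    }
    let n = centroids.len() as f64;
    let (sx, sy) = centroids
        .iter()
        .fold((0.0, 0.0), |(sx, sy), &(x, y)| (sx + x, sy + y));
    Some((sx / n, sy / n))
}
```

With a long line from (0, 0) to (10, 0) and a short one from (0, 2) to (2, 2), the per-part centroids are (5, 0) and (1, 2), so the mean center is (3, 1). A length-weighted centroid of the whole collection would instead sit much closer to the long line, which is the difference being described.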

This technique is most useful for feature collections. For example, finding the weighted mean center of 311 reports grouped by category (one MultiPoint per category), where the weight would be days since the report, estimated cost to fix, etc.
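As a rough illustration of the weighted variant, the sketch below computes sum(w_i * p_i) / sum(w_i) over plain coordinate tuples. The free function `weighted_mean_center_sketch` and its signature are hypothetical and chosen only for this example; they are not the weighted_mean_center() method added in this PR, whose exact signature is not shown in this thread.

```rust
/// Weighted mean center: sum(w_i * p_i) / sum(w_i).
/// Returns `None` if the slices differ in length or the weights sum to zero.
/// (Hypothetical free function, not the PR's trait method.)
fn weighted_mean_center_sketch(points: &[(f64, f64)], weights: &[f64]) -> Option<(f64, f64)> {
    if points.len() != weights.len() {
        return None;
    }
    let (mut sw, mut sx, mut sy) = (0.0, 0.0, 0.0);
    for (&(x, y), &w) in points.iter().zip(weights) {
        sw += w;
        sx += w * x;
        sy += w * y;
    }
    if sw == 0.0 {
        None
    } else {
        Some((sx / sw, sy / sw))
    }
}
```

For three reports at (0, 0), (10, 0), and (0, 10) with weights 1, 1, and 2 (for instance, days since each report), the result is (2.5, 5.0), pulled toward the most heavily weighted point.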

Related doc from the ArcGIS implementation

I've updated the trait doc and implementation to reflect this.

I will need to rewrite the tests, though! Thank you for your help :)