georust / geo

Geospatial primitives and algorithms for Rust
https://crates.io/crates/geo
Other
1.51k stars 195 forks source link

Initial geo-traits implementation #1011

Closed kylebarron closed 1 year ago

kylebarron commented 1 year ago

As discussed on the discord, in https://github.com/georust/geo/discussions/838, and originally in https://github.com/georust/geo/issues/67, this is an initial bare-bones implementation of geometry access traits for discussion. As @frewsxcv suggested, for now this is a separate crate in the workspace to let us iterate on some ideas.

Future work

Minimize cloning in the geo-types implementation

I got lost in lifetime errors so to get things to compile I added cloning to the geo-types implementations. I.e. for MultiPoint each point is cloned when accessed.

michaelkirk commented 1 year ago

Thank you for moving the traits discussion forward with some concrete code! I think these proposed trait implementations looks reasonable in the abstract. I'd love to see how some algorithms can be implemented in terms of these traits, and what kind of changes/concessions/benefits arise when rubber meets the road.

I think seeing how they can be used will be a necessary measure of how "good" these trait definitions are.

kylebarron commented 1 year ago

I'd love to see how some algorithms can be implemented in terms of these traits, and what kind of changes/concessions/benefits arise when rubber meets the road.

I agree things will become clearer with code implementation. Should I make a new stacked PR to iterate or add edits to this PR?

frewsxcv commented 1 year ago

My vote would be to merge as-is, and then iterate on it with subsequent PRs. Then it's easier for others to collaborate too

kylebarron commented 1 year ago

Any changes I should make in this PR? Unclear what should be discussed further; maybe add rect/line/triangle?

frewsxcv commented 1 year ago

Let's wait to hear what @michaelkirk thinks! If he approves of the plan then yeah we can merge it

michaelkirk commented 1 year ago

My concern is that we may be putting the cart before the horse a bit. My understanding of this work is that ultimately the purpose of it is to be able to define algorithms in terms of these traits. geo-types would be one implementor, but people might want to bring their own types which are better suited for their use cases for whatever reasons, and write their own impl geo_traits::* for MyGeometryType - like an Arrow backed geometry. Right?

I don't think we need to come up with a perfect and complete design before merging. I'm saying "show me that the code you want to merge works at all". Yes it compiles, but if the point of these traits is to be able to implement algorithms, then lets see what it looks like to implement an algorithm with them.

My hope is that if this is a reasonably good design for geometry traits, that this will not be an unreasonably hard task. If it turns out to be hard to do, then maybe this is not a good design for a geometry traits and we all will learn something.

frewsxcv commented 1 year ago

@michaelkirk For the purposes of a proof-of-concept, do you see the implementations you desire living in geo or geo-traits?

michaelkirk commented 1 year ago

I was assuming our existing algorithms in geo would be modified to work in terms of geo-traits where possible.

Did you have something else in mind?

michaelkirk commented 1 year ago

For the purposes of a proof-of-concept

I guess I'm a little uncertain about the usage of "proof of concept". I think that if there is code in georust/geo we should expect that people will use it.

If we want a place to experiment with crazy stuff that we don't want people to use I don't think it should live in this repository.

frewsxcv commented 1 year ago

I think that if there is code in georust/geo we should expect that people will use it.

Curious where this concern comes from! And how them using unpublished code would affect the lives of us as maintainers. As long as we are unpaid volunteers supporting their efforts, we don't owe them any guarantees about how we run our work. Unlike many technical decisions, I feel pretty strongly about unpaid labor haha

frewsxcv commented 1 year ago

I was assuming our existing algorithms in geo would be modified to work in terms of geo-traits where possible.

Did you have something else in mind?

I ask because if they live in the geo/ directory then that muddies the water in terms of separating our draft code and keeping everything in the geo/ directory deployable.

michaelkirk commented 1 year ago

I agree that we should keep geo/ deployable at all times (modulo a version bump in Cargo.toml or whatever).

Can you elaborate on how you'd like to see geo-traits development proceed @frewsxcv? All I can tell is that you seem to not be quite on board with my vision for it. 😅

frewsxcv commented 1 year ago

I see us iterating on it in a separate crate directory akin to the other crate directories. If a pull request change to geo-traits is an improvement, we approve it and merge it in. There's no need to increase the barrier to contribution with a separate repository or ensuring the full vision is there. We can do that when we're ready to publish.

b4l commented 1 year ago

Alternatively, we could agree to work on a traits branch analogous to a development branch for a while until we figure out if it is the way to go. There we would also have the liberty to directly refactor the existing algorithms and put the code in the right place and have a nice diff. Mostly working with 3d data nowadays and the stalled effort to introduce another coordinate dimension in geo-types prevents me from using it which I find quite unfortunate. Just providing options here and hope we can find space to be bold at times and traits may as well offer some possibility to support somehow arbitrary dimension support.

Edit: Also +1 for merging the coord and point traits as long as they don't diverge and what about using std::ops::Index instead of 'point(i: usize)` and the likes to access individual members of multi types?

michaelkirk commented 1 year ago

Another concern I have is less objective, but subjectively, I suspect my position as a maintainer gets easier (and thus more delightful) when there are clear delineations between "this code is thought to work" vs. "this code is a prototype, and I'm not sure to what extent it is intended to work". I'm willing to accept that this is an idiosyncrasy founded on my own personal trauma doing open source development on The Internet (🤣😭) and may not be relevant to other people, but it's still my preference.

I see us iterating on it in a separate crate directory akin to the other crate directories.

Ok, I'm with you here, but more specifically, if we merged these trait definitions, what's next? I'm assuming implementing some algorithms with these traits would be the next step (right?).

So how would we go about that? My best guess is you mean we'd be duplicating the existing algorithms from geo into geo-traits and editing that copy to work with geo-traits instead of geo-types? Or did you intend something else?

we could agree to work on a traits branch analogous to a development branch for a while until we figure out if it is the way to go.

(Thanks for chiming in @b4l!)

I think long running branches can be fraught, typically because they can get out of sync, but I do think it's preferable to duplicating/re-implementing the algorithms in geo-traits. At least with a long running branch, version control can give us some hints as to where conflicts arise (and hopefully it won't be that long running anyway, right? 😬). The alternative of duplicating the implementations means we are relying on our own fragile minds to reason about/remember when conflicts occur.

There we would also have the liberty to directly refactor the existing algorithms and put the code in the right place and have a nice diff.

👍👍 Having a diff against the current geo algorithm implementation is the primary thing I'm looking for when trying to evaluate how "good" these trait implementations are.


What do other people think about merging traits work into a traits branch until we're comfortable releasing something usable and relevant to geo?

I'd especially like to hear from @kylebarron, who seems most affected by this decision in the near term.

kylebarron commented 1 year ago

I'm happy to have this merge onto a traits branch, and ongoing PRs can have that as the base branch. We could also have an ongoing open draft PR from the traits branch to the main branch to make it easy for people to see and comment on the state of the two branches at large

rmanoka commented 1 year ago

This an excellent contribution; thanks @kylebarron ! I'm also quite in favour of exploring the trait-based approach for geo. Specifically w.r. to this PR, I would also prefer we merge this as-is/quickly into a separate branch until we identify a plan to switch to it (either incrementally or in one release). This would help have smaller PRs to help push this out.

Particularly we should try to understand:

  1. Port effort. Is the changes needed per algo. straight-forward or needs knowledge of algo? Do we need to port all algos in one release or can it be done incrementally?
  2. Dependencies. geo-traits abstracts over geo-types, and hence the new dependency chain is: geo -> geo-traits -> geo-types. In the long run, geo won't even directly depend on geo-types. But, are there any dependency issues to understand? Will every breaking bump of geo-traits need bump of geo? This is fine, but would we then have to impl. the algos for different versions of geo-traits via complicated macros?
  3. Performance. Would be good to benchmark some algos on this approach to ensure the compiler understands that we're just wrapping around [f64; 2] or similar.

Pls. add more, and we could track this in the PR that merges the traits branch to main. Reg. @michaelkirk valid concern on the feature branch diverging: since the traits are just wrapping types, at least until we finalize the details of the traits, only changes in geo-types should affect the difference between the branches. So, may be worth the risk?

urschrei commented 1 year ago

(apologies for the drive-by comment) I also prefer the idea of a separate branch for now. Yes, there's some overhead to keeping it in sync, but as has been pointed out it hopefully won't be too long-running.

frewsxcv commented 1 year ago

I have a preference for merging into main, but I am also okay with a separate traits branch.

frewsxcv commented 1 year ago

I just updated the base branch of this pull request to be a new traits branch. Anyone else have blockers before merging?

kylebarron commented 1 year ago

Seems fine on my side; thanks for making the branch!

frewsxcv commented 1 year ago

bors r+

🎉 🎉 🎉 🎉 🎉

bors[bot] commented 1 year ago

Build succeeded!

The publicly hosted instance of bors-ng is deprecated and will go away soon.

If you want to self-host your own instance, instructions are here. For more help, visit the forum.

If you want to switch to GitHub's built-in merge queue, visit their help page.

michaelkirk commented 1 year ago

🥳 Thanks for your flexibility @frewsxcv.