google / s2geometry

Computational geometry and spatial indexing on the sphere
http://s2geometry.io/
Apache License 2.0
2.29k stars 302 forks source link

S2RegionTermIndexer: Add access to coverer #299

Closed MBkkt closed 3 months ago

jmr commented 1 year ago

Can you explain more about why you want this? I don't quite understand.

MBkkt commented 1 year ago

@jmr

Here I need to create external S2RegionCoverer:

https://github.com/arangodb/arangodb/blob/d37f3d17f69dd388464cd509b749c771f01e1c55/arangod/IResearch/GeoAnalyzer.h#L73

Because for some S2Region derived class (our own, not from s2) I know faster way to compute covering.

https://github.com/arangodb/arangodb/blob/d37f3d17f69dd388464cd509b749c771f01e1c55/arangod/IResearch/GeoAnalyzer.cpp#L389

jmr commented 1 year ago

Exposing the mutable coverer definitely makes me nervous. Is this necessary at all? Could you get the options and make a coverer from that?

MBkkt commented 1 year ago

Could you get the options and make a coverer from that?

Of course I can, I send code where I do it.

But it's worse, instead of using internal S2RegionCoverer I need to maintain external with same settings.

And it's even worse if I want to use both

jmr commented 1 year ago

Do you need mutable_coverer? Having only coverer access doesn't seem that bad, but we'd want to consider adding to Java for consistency. @TorreyHoffman @eengle

Aren't we really just talking about the difference between something(indexer.coverer()) and something(S2RegionCoverer(indexer.options()))?

MBkkt commented 1 year ago

Aren't we really just talking about the difference between something(indexer.coverer()) and something(S2RegionCoverer(indexer.options()))

@jmr

Not really? Because get covering is not const.

My main example is "set" of polylines.

It's completely simpler make covering for each polyline, and then canonicalize it.

Instead of implement all S2Region virtual methods for this "set" effectively (probably needs to create shape index, which creation is very expensive)

Of course results can differs if we strict number of cells but a little, and it's ok for me, it's just difference about priority of cells.

But it a lot of faster and simpler.

In general I'm talking about get access to internal Indexer Coverer instead of make and maintain external with same settings.

I think api not perfect here, but this change allow to use field in S2RegionIndexer, which otherwise will be completely unused

In general if it will be possible I think better to have ability to create Indexer without coverer

MBkkt commented 1 year ago

In general if it will be possible I think better to have ability to create Indexer without coverer

It can be done with making S2RegionIndexer template (implementation still can be in cpp)

jmr commented 1 year ago

Instead of implement all S2Region virtual methods for this "set" effectively (probably needs to create shape index, which creation is very expensive)

This sounds like it might be a good option. Most of the implementations should be simple. Why do you need a shape index?

MBkkt commented 1 year ago

This sounds like it might be a good option. Most of the implementations should be simple. Why do you need a shape index?

How to implement MayIntersect good for set of polylines (in general geo json multipolyline)?

jmr commented 1 year ago

How to implement MayIntersect good for set of polylines (in general geo json multipolyline)?

return any_of(polylines, S2Polyline::MayIntersect);?

MBkkt commented 1 year ago

It's linear

MBkkt commented 1 year ago

Hmm, maybe it's ok, because S2Polyline:: MayIntersects also linear.

Btw what about set of points (GeoJson multipoints), I think creating cells from points and then Canonicalize it better than run algorithm which needs to make linear search for cells?

jmr commented 1 year ago

It's linear

Yes. Are you implying this will be too slow? It's hard to know from a two-word answer.

How many polylines are we talking about? What's the time difference between covering N polylines and one multi-polyline with N polylines when implemented this way?

It will be helpful you explain what you're trying to do, what alternatives you considered, and why this is the best alternative.

https://google.github.io/eng-practices/review/developer/cl-descriptions.html

MBkkt commented 1 year ago

It will be helpful you explain what you're trying to do, what alternatives you considered, and why this is the best alternative

Or you can just make termindexer more flexible.

It's designed not perfect, as I wrote it contains field which used only in single overload from three.

Same about index in S2Polygon/S2Loop

So it has two option how to improve, get access to the field (make it same as Polygon/Loop .index())

Or (which better) Allow to create Indexer without S2RegionCoverer

What about

S2TermIndexerBase?

Which will be completely same, but without RegionCoverer and two methods: GetIndex/QueryTerms(Region)?

Current S2TermIndexer can be derived from it

So in general it's just splitting the code

MBkkt commented 1 year ago

@jmr What do you think about separation to two classes, one derived another?

jmr commented 1 year ago

@jmr What do you think about separation to two classes, one derived another?

What's the problem you're trying to solve? Cover multiple polylines efficiently? Did you try S2RegionUnion?

MBkkt commented 1 year ago

@jmr I already wrote, sometimes I need both indexer and coverer. I can do it now, but I also will have additional, completely unnecessary coverer in indexer. I think will be nice, not really required, to avoid having it. Solution with inheritance exactly and simple solve this issue.

jmr commented 1 year ago

@jmr I already wrote, sometimes I need both indexer and coverer.

Can you explain it all in one place? Easiest would probably be if you replaced No description provided in https://github.com/google/s2geometry/pull/299#issue-1525962182 with your proposal including what you're trying to do, why you need both an indexer and coverer with the same options, alternatives considered and why they're worse, etc. Showing code will help.

MBkkt commented 3 months ago

s2 made GetCellUnionBound virtual and it fixed this issue for me. Thanks