georust / geo

Rust geospatial primitives & algorithms
https://crates.io/crates/geo
Other
1.56k stars 199 forks source link

Implement TryFrom<String> and TryFrom<&str> for Well-known Text for all types #820

Open EvanCarroll opened 2 years ago

EvanCarroll commented 2 years ago

It seems to me that most obvious choice of a default implementation of TryFrom will go Well-known text.

I would suggest accepting a syntax such that one could do

Point::try_from( "POINT(105.85662961006165 21.022381663919713)" )?;

And get a respective point. This would make it a lot easier to integrate with third parties without having to learn about the ecosystem. If the type of the Well-known text doesn't match the strict type, we can just return an invalid input error. Could also implement,

Geometry::try_from( wkt )?;

And only reject if the wkt was invalid or we didn't yet support some subset of the types that can be represented in WKT.

urschrei commented 2 years ago

Perhaps relevant to your interests @EvanCarroll: https://github.com/georust/wkt/pull/95

michaelkirk commented 2 years ago

easier to integrate with third parties without having to learn about the ecosystem.

I agree that eco system discovery is a bit of a challenge.

I wonder if having Wkt as the "blessed" string representation of geometries might be confusing though. There are other equally popular alternatives.

Another option would be to build in some common serializations into geo (perhaps behind feature flags?, perhaps enabled by default?).

Point::try_from_wkt_str("POINT(105.85662961006165 21.022381663919713)")
Point::try_from_geojson_str(r#"{ "type": "Point", "coordinates": [125.6, 10.1] }"#)
michaelkirk commented 2 years ago

Also maybe related: #564

urschrei commented 2 years ago

I wonder if having Wkt as the "blessed" string representation of geometries might be confusing though. There are other equally popular alternatives.

I would go slightly further and say that I don't think its appropriate for Wkt to be privileged in what is ostensibly intended to be a set of high-performance geometry libraries.

I think having common serialisations available by default (in the case of Wkt the dependency overhead is minimal) and some optional (geojson's are heavier) is definitely worth thinking about and discussing, though.

EvanCarroll commented 2 years ago

@urschrei why would it impact performance to implement a trait? You realize if you don't use it, it doesn't even get included in the final build artifact. IMHO whether or not Wkt should be privileged should be a function of it's popularity. It's the only format supported everywhere.

michaelkirk commented 2 years ago

@urschrei didn't assert that it would impact performance to implement a trait. And he can correct me if I'm wrong, but if his concern is similar to mine, it's a concern about introducing an API that implies that WKT is the way to represent a geo-type.

Specifically (and I'm repeating myself),

Rather than:

Polygon::try_from("POLYGON((0 0, 1 1, 2 2, 0 0))")

I'd prefer to have something like:

Polygon::try_from_wkt_str("POLYGON((0 0, 1 1, 2 2, 0 0))")

Do you see a problem with that @EvanCarroll?

EvanCarroll commented 2 years ago

No, it wouldn't be my personal preference but it would work. I don't understand the push back: GeoJSON is not an alternative to WKT, it supports only subset of geographies and it's not supported everywhere. It also requires wrapping the geometry and supports things like properties (which have nothing to do with the point). I think it would be much better to have all the geographies .as_string() to wkt, and TryFrom to pull them in.

Not to put too much pressure on it, but most systems provide this implicit cast (postgresql)

# SELECT 'POINT(0 0)'::geography;
                     geography                      
----------------------------------------------------
 0101000020E610000000000000000000000000000000000000
(1 row)

And, MySQL where it's explicit just knows it as text.

SELECT PointFromText('POINT(0 1)');

It's also the display format for GeoPandas.

michaelkirk commented 2 years ago

One of my reservations is that Wkt and geo-types do not map one to one.

For example, how do you represent a geo-types::Triangle and geo-types::Rect as WKT? In the wkt crate, we treat them as a polygon. This works for geometric processing, but I think it's undesirable that "the canonical string representation" of geo-types would be lossy in this way.

The inverse is also true.POINT EMPTY, which is valid WKT, cannot be expressed as a Geometry<T>.

These are a couple of reasons why I don't think WKT makes sense to be the blessed string representation of geo-types.

edit: WKT actually supports Triangles (but not Rect)! (edit edit: triangle support is mixed)

urschrei commented 2 years ago

You realize if you don't use it, it doesn't even get included in the final build artifact.

Please assume a minimum of knowledge on my part about what rustc does. It will make for higher-quality discussion.

it's a concern about introducing an API that implies that WKT is the way to represent a geo-type

Correct!

but most systems provide this implicit cast (postgresql)

GEOS (which geo-types is far closer to than PostGIS et al) provides the WKTWriter interface if you want the textual representation of a geometry. We lose nothing by implementing wkt methods – we can always have to_string defer to them down the line if a convincing consensus emerges in favour of that.

IMHO whether or not Wkt should be privileged should be a function of it's popularity. It's the only format supported everywhere.

I think we all agree about its popularity, but adopting it as the default textual representation doesn't follow from that. All I – and I think @michaelkirk – are proposing here is a slightly conservative approach, since the problem space hasn't been fully explored yet. It's an approach that allows us to make an additive, non-breaking change down the line if we decide that WKT is the correct blessed textual format.

rmanoka commented 2 years ago

A couple of pros to adopting wkt for Display/Debug on the geoms (mostly personal opinions):

  1. WKT (for geoms) seems to be a a unique readable representation that's solely focussed on the geometry, and not extra properties, CRS, etc.
  2. currently our debug output of the geoms is very complicated to read/understand. Even for a simple Triangle line-string.
  3. The name clearly says it is "Well Known" :)
michaelkirk commented 2 years ago

Upon reflection, it probably does make more sense to just use WKT as our debug format rather than inventing our own weird similar but different thing like I proposed in #564.

but I'm a little loathe to add the WKT crate as an external dependency for anybody using geo-types. I'm not 100% allergic, but feeling a little cautious about adding that baggage/churn to everybody.

A couple of options that come to mind:

michaelkirk commented 2 years ago

re-implement WKT writing directly in geo-types (outputting WKT is simpler than parsing it), without read support as our debug format.

Another thing to consider is that currently the WKT crate concerns itself with the commonly compatible subset of WKT. E.g. a geo_types::Line is rendered by the wkt crate as a LineString: https://github.com/georust/wkt/blob/main/src/geo_types_to_wkt.rs#L73

Similarly the WKT serializes Rect and Triangle as POLYGON(...).

I think that's a reasonable compromise for a crate intended for interop with other spatial libs, but I don't think that would be acceptable for Debug output.