prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0

Geometry should be a first-class data type #19449

Open vcschapp opened 1 year ago

vcschapp commented 1 year ago

What?

Please expand the Presto SQL language Data Types to include a native geometry type.

Why?

Background

Some of the engines Presto connects to do, or could, support geometry as a first-class type, e.g. Amazon Redshift (data types). PostgreSQL does if PostGIS is installed (see issue #15326, "Add WKB Support for PostGIS Geometry Columns"). BigQuery appears to have something equivalent, as do MySQL (data types), SQL Server (data types), and likely other underlying engines.

In addition, some systems that do not support geometry types today, such as Iceberg and Hive, will likely support them in the future as the GeoParquet format gains in popularity. (OK, hopefully gains in popularity!)

Use Cases Enabled

  1. As a Presto user, I want to be able to query the underlying engine's native geometry column without having to wrap the column in a spatio-temporal (ST_*) function to convert it to geometry.
  2. As a Presto user making queries to a data lake based on Hive, Iceberg, or similar, I want the query optimizer to be able to use geospatial metadata in columnar object files (like GeoParquet files) to intelligently skip objects that don't have any rows which match my spatial query.
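To make the second use case concrete, here is a minimal sketch of bounding-box pruning, assuming each data file carries a bounding box that covers all of its geometries. The `DataFile` class, the file names, and the `prune` helper are hypothetical illustrations, not Presto or GeoParquet APIs.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (min_x, min_y, max_x, max_y); a hypothetical shape for per-file metadata.
BBox = Tuple[float, float, float, float]


@dataclass
class DataFile:
    path: str   # hypothetical object key
    bbox: BBox  # box covering every geometry stored in the file


def intersects(a: BBox, b: BBox) -> bool:
    """Return True if two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def prune(files: List[DataFile], query: BBox) -> List[DataFile]:
    """Keep only the files whose bounding box could contain matching rows."""
    return [f for f in files if intersects(f.bbox, query)]


files = [
    DataFile("part-0.parquet", (-10.0, -10.0, 0.0, 0.0)),
    DataFile("part-1.parquet", (5.0, 5.0, 15.0, 15.0)),
    DataFile("part-2.parquet", (100.0, 40.0, 120.0, 60.0)),
]

# Only part-1 overlaps the query window, so the other two files are never read.
survivors = prune(files, (8.0, 8.0, 12.0, 12.0))
print([f.path for f in survivors])
```

A file that survives pruning may still contain no matching rows, so the exact spatial predicate would still have to be evaluated on whatever is read.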

Related

tdcmeehan commented 1 year ago

We could start by trying to include support in JDBC connectors that do support this, including Postgres with PostGIS. Please let me know if you would like to work on this.

Note that there is a Geometry data type; we just don't have Presto configured to read it directly from data sources. It first needs to be translated from binary or text via one of the relevant geospatial functions. This should probably be updated in our documentation, and we would welcome a pull request that adds it.

vcschapp commented 1 year ago

Thanks for the reply!

I'm currently "spending my OSS credits" on building an implementation of FlatGeobuf for GoLang (very much under construction), but assuming I can complete that little project with sanity intact, I'd be interested in contributing something here that helps get the ball rolling.

majetideepak commented 1 year ago

CC: @mbasmanova

mmerdes commented 3 months ago

Any news on this?

tdcmeehan commented 3 months ago

@mmerdes Presto already has support for Geometry types and can map to and from WKB and WKT. In a sense, it's already a first-class data type. What's missing are native integrations to underlying connectors. Is there a particular connector you would like to see this in?
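For readers unfamiliar with the two encodings, the sketch below shows what mapping between WKB and WKT involves for the simplest case, a 2D point. It is only an illustration of the formats, not Presto's implementation; the helper names `point_to_wkb` and `wkb_point_to_wkt` are hypothetical, and inside Presto this kind of conversion is done by geospatial functions such as `ST_GeomFromBinary` and `ST_AsText`.

```python
import struct


def point_to_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB.

    Layout: 1 byte for byte order (1 = little-endian), a uint32 geometry
    type (1 = Point), then x and y as 8-byte doubles.
    """
    return struct.pack("<BIdd", 1, 1, x, y)


def wkb_point_to_wkt(wkb: bytes) -> str:
    """Decode a 2D WKB point and render it as WKT."""
    order = "<" if wkb[0] == 1 else ">"
    (geom_type,) = struct.unpack_from(order + "I", wkb, 1)
    if geom_type != 1:
        raise ValueError("only 2D points are handled in this sketch")
    x, y = struct.unpack_from(order + "dd", wkb, 5)
    return f"POINT ({x:g} {y:g})"


wkb = point_to_wkb(1.0, 2.0)
print(wkb.hex())              # the 21-byte binary form, shown as hex
print(wkb_point_to_wkt(wkb))  # the text form
```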