prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0
15.76k stars 5.29k forks

One connector to read Delta Lake and Hive tables #17674

Open dnskr opened 2 years ago

dnskr commented 2 years ago

Problem

I use Presto to read Hive tables (Parquet, CSV, etc.) and Delta tables that are registered in one Hive metastore instance. To read all of the data, I need both the Hive and Delta Lake connectors in Presto, because the Hive connector does not support Delta tables and the Delta Lake connector cannot read Hive tables.

Desired behavior

It would be very convenient and useful to be able to use a single connector for both table types.

Possible solutions

  1. The Delta Lake connector delegates work to the Hive connector when a non-Delta table is being read
  2. Build Delta Lake support into the Hive connector and use it for Delta tables
  3. Implement an umbrella connector that delegates the work to the appropriate Hive or Delta Lake connector

vkorukanti commented 2 years ago

@dnskr We could add support for redirecting from the Delta Lake connector to the Hive connector when the table is not a Delta Lake table. This would be transparent to the user. Trino already has this feature (docs here)
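
To illustrate the redirect idea, here is a minimal sketch in Python rather than actual Presto code. It assumes a Delta table can be recognized by the `spark.sql.sources.provider` table parameter in the Hive metastore, which is what Spark typically writes for Delta tables, but that is an assumption here. `TableInfo`, `is_delta_table`, `resolve_catalog`, and the catalog names `delta` and `hive` are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field

# Assumption: Delta tables registered in the Hive metastore carry this
# table parameter with the value "delta" (case-insensitive).
PROVIDER_KEY = "spark.sql.sources.provider"


@dataclass
class TableInfo:
    """Hypothetical stand-in for a table entry fetched from the Hive metastore."""
    schema: str
    name: str
    parameters: dict = field(default_factory=dict)


def is_delta_table(table: TableInfo) -> bool:
    """Return True if the metastore entry looks like a Delta Lake table."""
    return table.parameters.get(PROVIDER_KEY, "").lower() == "delta"


def resolve_catalog(table: TableInfo,
                    delta_catalog: str = "delta",
                    hive_catalog: str = "hive") -> str:
    """Pick the catalog that should serve the table.

    Delta tables stay with the Delta Lake catalog; everything else is
    redirected to the Hive catalog.
    """
    return delta_catalog if is_delta_table(table) else hive_catalog


# Example: two tables registered in the same metastore.
orders = TableInfo("sales", "orders", {PROVIDER_KEY: "delta"})
events = TableInfo("sales", "events", {"EXTERNAL": "TRUE"})

print(resolve_catalog(orders))  # delta
print(resolve_catalog(events))  # hive
```

The same check could back any of the three proposed solutions; only the component that performs the lookup and the delegation differs.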

dnskr commented 2 years ago

The following is also related: https://docs.starburst.io/starburst-galaxy/sql/great-lakes.html

pratyakshsharma commented 1 year ago

cc @agrawalreetika