prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0

Integrate Glue Data Catalog with Hive connector #8786

Closed mbeitchman closed 6 years ago

mbeitchman commented 7 years ago

I would like to contribute the integration with the Glue Hive Metastore compatible service.

http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/glue/AWSGlue.html
https://aws.amazon.com/glue/faqs/

Should this be an implementation of the following interface?

https://github.com/prestodb/presto/blob/master/presto-hive/src/main/java/com/facebook/presto/hive/metastore/ExtendedHiveMetastore.java

swaranga commented 7 years ago

@mbeitchman were you able to get this done?

mbeitchman commented 7 years ago

@swaranga It's almost done. Should be submitting it soon.

mostafazh commented 7 years ago

Would love to see this merged too @mbeitchman ... If there is anything I can help with, please let me know :smile:

dmmiller612 commented 7 years ago

@mbeitchman In the same boat. I can help with anything if you would like to distribute the work.

rentaow commented 7 years ago

The next EMR release will have Glue support for Presto. In the meantime, I will try to submit a design/PR in the near future.

mostafazh commented 7 years ago

As @rentaow promised :+1: http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-presto.html#emr-presto-glue :tada:

rentaow commented 6 years ago

I've linked my doc outlining the proposed changes for contributing this feature. I'd appreciate it if I could get some feedback. I will work on submitting a PR if we are on the same page with this.

https://docs.google.com/document/d/1Q0GwDsauqzGei_TD5fL4yvwK1knCMtMOvHknut668Co/edit?usp=sharing
https://groups.google.com/forum/#!topic/presto-users/zA93pHYDs7c

rentaow commented 6 years ago

Addressed by #9934

mbeitchman commented 6 years ago

Awesome work @rentaow! Let's get a committer to take a look at the PR. Have you sent a message on the Slack channel?

findepi commented 6 years ago

https://github.com/prestodb/presto/pull/9934 is merged.

scraly commented 5 years ago

Have you tested this Glue integration with a Presto deployment outside of an AWS EMR cluster? Thanks.

findepi commented 5 years ago

@scraly you can use Glue for example with Starburst Presto (https://www.starburstdata.com/presto-aws-cloud/), which is not based on AWS EMR.

scraly commented 5 years ago

Thanks, I am looking at it in some slides right now. I'll take a look.

scraly commented 5 years ago

This Starburst Presto version seems to apply only if we want to deploy Presto in AWS, but what about deploying Presto in a Kubernetes cluster (in a Docker container)?

Thanks

findepi commented 5 years ago

@scraly that would work too -- just as with any other Presto build. Starburst Presto is in no way limited to AWS.