trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
https://trino.io
Apache License 2.0
10.18k stars 2.93k forks source link

Add support for YugabyteDB connector #5352

Open ebyhr opened 3 years ago

ebyhr commented 3 years ago

Yugabyte offers JDBC drivers, but it has BETA label for now. https://docs.yugabyte.com/latest/reference/drivers/yugabytedb-jdbc-driver/

Another option is using Java client driver that is based on DataStax's one (similar to the current Cassandra connector). https://docs.yugabyte.com/latest/reference/drivers/ycql-client-drivers/#java

Related to #4836 #4838

Reported issue:

tedyu commented 3 years ago

Suggested change for presto-cassandra-driver to connect to Yugabyte DB:

diff --git a/pom.xml b/pom.xml
index c0c71c2..ebf25d9 100644
--- a/pom.xml
+++ b/pom.xml
@@ -41,11 +41,17 @@

     <dependencies>
         <dependency>
-            <groupId>com.datastax.cassandra</groupId>
+            <groupId>com.yugabyte</groupId>
             <artifactId>cassandra-driver-core</artifactId>
-            <version>3.6.0</version>
+            <version>3.2.0-yb-19</version>
             <scope>runtime</scope>
         </dependency>
+            <dependency>
+                <groupId>com.google.guava</groupId>
+                <artifactId>guava</artifactId>
+                <version>19.0</version>
+              <scope>runtime</scope>
+            </dependency>
     </dependencies>

     <build>
@@ -87,7 +93,7 @@
                             <promoteTransitiveDependencies>true</promoteTransitiveDependencies>
                             <artifactSet>
                                 <includes>
-                                    <include>com.datastax.cassandra:cassandra-driver-core</include>
+                                    <include>com.yugabyte:cassandra-driver-core</include>
                                     <include>com.google.guava:guava</include>
                                     <include>org.ow2.asm:*</include>
                                     <include>com.github.jnr:*</include>
ebyhr commented 3 years ago

@tedyu I don't think we can replace the dependency. We will need to create a new connector.

tedyu commented 3 years ago

Agreed to creating new connector for Presto.

The Yugabyte JDBC driver covers both YCQL (for Cassandra) and YSQL (for Postgres). The Beta label is for some YSQL feature to be added, not for YCQL.

For YCQL, Presto should be able to use Yugabyte JDBC driver.

tedyu commented 3 years ago

@ebyhr Do you mind creating a new connector for Yugabyte DB (via Yugabyte JDBC driver) ?

Thanks

findepi commented 3 years ago

Do you mind creating a new connector for Yugabyte DB (via Yugabyte JDBC driver) ?

Since Yugabyte is different from Cassandra, yes we should have dedicated connector. Does Yugabyte aims for maintaining JDBC and behavior compatibility with Cassandra (also for future Cassandra changes) If yes, we could perhaps consider creating Yugabyte connector which extends Cassandra connector. We should also consider copying the connector -- as we did for eg MemSQL (https://github.com/prestosql/presto/pull/1906), since MemSQL does not apparently aim at maintaining compat with MySQL.

@tedyu do you want to work on this?

@ebyhr @tedyu @electrum let's have agreement before commencing the work.

tedyu commented 3 years ago

Yugabyte does plan to maintain JDBC and behavior compatibility with Cassandra.

The suggested change in https://github.com/prestosql/presto/issues/5352#issuecomment-701493224 references cassandra-driver-core maintained by yugabyte. This is for the Presto Yugabyte connector (to be created in new repo).

tedyu commented 3 years ago

@ebyhr @findepi

Given the change in: https://github.com/prestosql/presto/issues/5352#issuecomment-701493224

Can you help create repository for Presto connector ?

Thanks

findepi commented 3 years ago

@tedyu i am not sure what kind of repository this would be. All the supported connectors are directly in this repository, as maven modules. (https://github.com/prestosql/presto-cassandra-driver exists for technical reasons, this is not a Presto plugin)

tedyu commented 3 years ago

I was expecting https://github.com/prestosql/presto-yugabytedb-driver or something like that to be created.

Thanks

findepi commented 3 years ago

@tedyu presto-cassandra-driver exists because we need to re-package the driver, shading-in some dependencies. Is it also the case for Yugabyte driver? what are its dependencies?

tedyu commented 3 years ago

yb-cassandra-dep.txt

Please see the attached file (up to line 76)

findepi commented 3 years ago

is Yugabyte driver a fork of cassandra driver? i think the guava (v 19) dep can be a problem. OK, i created a https://github.com/prestosql/presto-yugabyte-driver repo.

findepi commented 3 years ago

@tedyu can you please created a companion PR here adding the new module? also, please make sure to sign the CLA

tedyu commented 3 years ago

For the companion PR, wouldn't it be dependent on the availability of presto-yugabyte-driver in maven ?

Signature has been sent to cla@prestosql.io. Is there more to be done in this regard ?

findepi commented 3 years ago

Signature has been sent to cla@prestosql.io. Is there more to be done in this regard ?

cc @martint

For the companion PR, wouldn't it be dependent on the availability of presto-yugabyte-driver in maven ?

the PR won't build on CI until presto-yugabyte-driver is published, but will still allow a reviewer to build both and verify that what we have in https://github.com/prestosql/presto-yugabyte-driver/pull/1 is sufficient

the next step would be to merge https://github.com/prestosql/presto-yugabyte-driver/pull/1, release do maven, etc.

tedyu commented 3 years ago

The following would allow building against presto-yugabyte-driver :

diff --git a/pom.xml b/pom.xml
index a275578596..f3e9794cc7 100644
--- a/pom.xml
+++ b/pom.xml
@@ -1727,6 +1727,18 @@
                 </exclusions>
             </dependency>

+            <dependency>
+                <groupId>com.facebook.presto.yugabyte</groupId>
+                <artifactId>yugabyte-driver</artifactId>
+                <version>3.6.0-2-SNAPSHOT</version>
+                <exclusions>
+                    <exclusion>
+                        <groupId>io.netty</groupId>
+                        <artifactId>*</artifactId>
+                    </exclusion>
+                </exclusions>
+            </dependency>
+
             <dependency>
                 <groupId>com.facebook.presto.pinot</groupId>
                 <artifactId>pinot-driver</artifactId>
diff --git a/presto-cassandra/pom.xml b/presto-cassandra/pom.xml
index ecd26c5512..ae4678dca4 100644
--- a/presto-cassandra/pom.xml
+++ b/presto-cassandra/pom.xml
@@ -17,8 +17,8 @@

     <dependencies>
         <dependency>
-            <groupId>com.facebook.presto.cassandra</groupId>
-            <artifactId>cassandra-driver</artifactId>
+            <groupId>com.facebook.presto.yugabyte</groupId>
+            <artifactId>yugabyte-driver</artifactId>
         </dependency>

         <dependency>

Do you have suggestion other than copying the whole presto-cassandra subproject ?

Thanks

findepi commented 3 years ago

Do you have suggestion other than copying the whole presto-cassandra subproject ?

Let's start simpler:

tedyu commented 3 years ago

yuga-module.txt

See if the patch is on right track.

Currently trying to figure out which Enforcer rule fails:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:3.0.0-M2:enforce (default) on project presto-yugabyte: Some Enforcer rules have failed. Look above for specific messages explaining why the rule failed. -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:3.0.0-M2:enforce (default) on project presto-yugabyte: Some Enforcer rules have failed. Look above for specific messages explaining why the rule failed.
findepi commented 3 years ago

yuga-module.txt

See if the patch is on right track.

instead of a diff, why don't you make a PR? it will be much easier to work with, comment, etc.

tedyu commented 3 years ago

Even with -X option, I didn't get more clue on which enforcer rule was broken.

Once the compilation passes, I would send out the PR.

Meanwhile, suggestion on how to fix compilation is welcome.

findepi commented 3 years ago

Meanwhile, suggestion on how to fix compilation is welcome.

let's see how it looks like in the PR.

tedyu commented 3 years ago

https://github.com/prestodb/presto/pull/15357

ebyhr commented 3 years ago

@tedyu It seems you sent PR to different repository. Could you resend to prestosql/presto?

tedyu commented 3 years ago

https://github.com/prestosql/presto/pull/5708

ebyhr commented 3 years ago

Yugabyte (CQL) doesn't support setting a comment in both CREATE TABLE and ALTER TABLE. https://docs.yugabyte.com/latest/api/ycql/ddl_create_table/#table-properties-1

This limitation breaks INSERT statement and affects to some operations because Cassandra connector uses the comment to manage hidden columns.

tedyu commented 3 years ago

I logged https://github.com/yugabyte/yugabyte-db/issues/9902 for comment enhancement in YCQL.