linkedin / transport

A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Apache Hive, and Presto.
BSD 2-Clause "Simplified" License
291 stars 72 forks source link

Upgrade Transport to use API from Trino v406 #128

Closed yiqiangin closed 1 year ago

yiqiangin commented 1 year ago
weijiii commented 1 year ago

Thanks for working on this!

aastha25 commented 1 year ago

could you also add in the testing section that ./gradlew build is successful from the project's root directory and from transportable-udfs-example-udfs subdirectory.

yiqiangin commented 1 year ago

could you also add in the testing section that ./gradlew build is successful from the project's root directory and from transportable-udfs-example-udfs subdirectory.

Done.

weijiii commented 1 year ago

Can you also update StdUdfUtils#RESERVED_KEYWORDS ? This is within scope of upgrading to Trino 406. The new ones should be

private static final Set[String](https://app.slack.com/client/T06BYN8F7/String) RESERVED_KEYWORDS =
      ImmutableSet.of("ALTER", "AND", "AS", "BETWEEN", "BY", "CASE", "CAST", "CONSTRAINT", "CREATE", "CROSS", "CUBE",
          "CURRENT_CATALOG", "CURRENT_DATE", "CURRENT_PATH", "CURRENT_ROLE", "CURRENT_SCHEMA", "CURRENT_TIME",
          "CURRENT_TIMESTAMP", "CURRENT_USER", "DEALLOCATE", "DELETE", "DESCRIBE", "DISTINCT", "DROP", "ELSE", "END",
          "ESCAPE", "EXCEPT", "EXECUTE", "EXISTS", "EXTRACT", "FALSE", "FOR", "FROM", "FULL", "GROUP", "GROUPING",
          "HAVING", "IN", "INNER", "INSERT", "INTERSECT", "INTO", "IS", "JOIN", "JSON_ARRAY", "JSON_EXISTS", "JSON_OBJECT", 
          "JSON_QUERY", "JSON_VALUE", "LEFT", "LIKE", "LISTAGG", "LOCALTIME", "LOCALTIMESTAMP", "NATURAL", "NORMALIZE", "NOT", 
          "NULL", "ON", "OR", "ORDER", "OUTER", "PREPARE", "RECURSIVE", "RIGHT", "ROLLUP", "SELECT", "SKIP", "TABLE", "THEN", 
          "TRIM", "TRUE", "UESCAPE", "UNION", "UNNEST", "USING", "VALUES", "WHEN", "WHERE", "WITH");

I think we should expedite this release so the subsequent testing of our production UDFs can be initiated. (cc Dali reviewers @ljfgem @aastha25 ) Thanks.

yiqiangin commented 1 year ago

Can you also update StdUdfUtils#RESERVED_KEYWORDS ? This is within scope of upgrading to Trino 406. The new ones should be

private static final Set[String](https://app.slack.com/client/T06BYN8F7/String) RESERVED_KEYWORDS =
      ImmutableSet.of("ALTER", "AND", "AS", "BETWEEN", "BY", "CASE", "CAST", "CONSTRAINT", "CREATE", "CROSS", "CUBE",
          "CURRENT_CATALOG", "CURRENT_DATE", "CURRENT_PATH", "CURRENT_ROLE", "CURRENT_SCHEMA", "CURRENT_TIME",
          "CURRENT_TIMESTAMP", "CURRENT_USER", "DEALLOCATE", "DELETE", "DESCRIBE", "DISTINCT", "DROP", "ELSE", "END",
          "ESCAPE", "EXCEPT", "EXECUTE", "EXISTS", "EXTRACT", "FALSE", "FOR", "FROM", "FULL", "GROUP", "GROUPING",
          "HAVING", "IN", "INNER", "INSERT", "INTERSECT", "INTO", "IS", "JOIN", "JSON_ARRAY", "JSON_EXISTS", "JSON_OBJECT", 
          "JSON_QUERY", "JSON_VALUE", "LEFT", "LIKE", "LISTAGG", "LOCALTIME", "LOCALTIMESTAMP", "NATURAL", "NORMALIZE", "NOT", 
          "NULL", "ON", "OR", "ORDER", "OUTER", "PREPARE", "RECURSIVE", "RIGHT", "ROLLUP", "SELECT", "SKIP", "TABLE", "THEN", 
          "TRIM", "TRUE", "UESCAPE", "UNION", "UNNEST", "USING", "VALUES", "WHEN", "WHERE", "WITH");

I think that we should expedite this release so the subsequent testing of our production UDFs can be initiated. (cc Dali reviewers @ljfgem @aastha25 ) Thanks.

I have updated the list per Erik's review before. Just double check the list in the code is the same as the list shown here.

yiqiangin commented 1 year ago

why do we have 4 JAR files checked in? (e.g. transport-udf-1-trino-dist-thin.jar)

Those JARs contains UDF classes which are actually loaded in TransportPluginTest to verify the loading of UDF classes by TransportUDFClassLoader during the initialization of TransportPlugin

xkrogen commented 1 year ago

why do we have 4 JAR files checked in? (e.g. transport-udf-1-trino-dist-thin.jar)

Those JARs contains UDF classes which are actually loaded in TransportPluginTest to verify the loading of UDF classes by TransportUDFClassLoader during the initialization of TransportPlugin

Can we put them inside of a test/resources folder to make it more clear that these are testing resources?

yiqiangin commented 1 year ago

why do we have 4 JAR files checked in? (e.g. transport-udf-1-trino-dist-thin.jar)

Those JARs contains UDF classes which are actually loaded in TransportPluginTest to verify the loading of UDF classes by TransportUDFClassLoader during the initialization of TransportPlugin

Can we put them inside of a test/resources folder to make it more clear that these are testing resources?

It has been done in https://github.com/linkedin/transport/pull/133