linkedin / coral

Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages.
BSD 2-Clause "Simplified" License
783 stars 184 forks source link

[Coral-Trino] Migrate cast transformation from RelNode layer to SqlNode layer #491

Closed KevinGe00 closed 5 months ago

KevinGe00 commented 7 months ago

What changes are proposed in this pull request, and why are they necessary?

The goal of this PR is to migrate the logic of visitCast from RelNode layer (in Calcite2TrinoUDFConverter.convertRel which takes a calcite rel and transforms it to trino compatible rel) to the SqlNode layer as part of Coral IR v2. Some important changes to make this happen:

  1. New CastOperatorTransformer class which is a sqlNode transformer that handles Calcite to Trino cast operator transformation.
  2. Since CastOperatorTransformer requires RelType derivation, which requires Calcite validation, which does not preserve laterals (known and chosen behavior), we had to override this behavior as some queries which had casts and lateral joins were having their laterals incorrectly dropped as an unwanted side effect of CastOperatorTransformer.
  3. Calcite2TrinoUDFConverter deleted since there are no more transformations in the class left

How was this patch tested?

gradle clean build regression tests for trino, avro and spark all passing

KevinGe00 commented 5 months ago

Looks like there is no direct relationship between this PR and LATERAL removal. Does it makes sense to add test cases for LATERAL in this PR or a separate PR?

The goal here is to have no test regressions caused by dropped LATERALs since we know that would be semantically incorrect. There's no need to add any additional tests in this PR because if the current tests pass, we know that the LATERAL was added back correctly by https://github.com/linkedin/linkedin-calcite/pull/98. Therefore we are already testing LATERAL in Coral.

Also, I recall we had Coral-side code that reintroduces the LATERA before https://github.com/linkedin/linkedin-calcite/pull/98. Do you know why it does not show in the LHS version of this PR?

We introduce LATERAL in RelToTrinoConverter.convertToSqlNode which happens before CastOperatorTransformer, where LATERAL is then dropped again due to type derivation.

wmoustafa commented 5 months ago

we know that the LATERAL was added back correctly by https://github.com/linkedin/linkedin-calcite/pull/98

I was suggesting to add Coral-side tests for the same issue. Could be inspired by the failing regression integration tests.

Also, I recall we had Coral-side code that reintroduces the LATERAL before https://github.com/linkedin/linkedin-calcite/pull/98

I was referring to code that @aastha25 introduced to manually reinstate LATERAL after validation. I think it was in TypeValidationUtil if I recall correctly. The reason I am asking is that now that we appropriately handle LATERAL in calcite, this reinstating code is no longer needed and should appear as deleted in the LHS of the PR diff.

KevinGe00 commented 5 months ago

I was suggesting to add Coral-side tests for the same issue. Could be inspired by the failing regression integration tests.

We already have Coral-side unit tests that capture this issue. As long as the test input query as a CAST operator and a LATERAL, the missing LATERAL post-validation issue will be tested.

to manually reinstate LATERAL after validation. I think it was in TypeValidationUtil if I recall correctly

I believe you're talking about the post processor @aastha25 wrote a long time ago to manually add the LATERAL back. This code was never commited/merged in the first place since the patch was a WIP so it wouldn't need to be deleted (deleted in the LHS of the PR diff).

KevinGe00 commented 5 months ago

@wmoustafa Addressed the last 2 javadoc comments. Thanks for the thorough review :)

aastha25 commented 5 months ago

Could you also add in the PR description that this patch has been tested against the li-trino-hotfix branch?

rest LGTM.