apache / datafusion-comet

Apache DataFusion Comet Spark Accelerator
https://datafusion.apache.org/comet
Apache License 2.0

feat: Only allow incompatible cast expressions to run in comet if a config is enabled #362

Closed andygrove closed 2 weeks ago

andygrove commented 2 weeks ago

Which issue does this PR close?

Part of https://github.com/apache/datafusion-comet/issues/286

Rationale for this change

We have discovered numerous compatibility issues with the CAST expression, so we should fall back to Spark for any cast operations that we do not fully support to avoid any data corruption or incorrect results.

This PR adds a new spark.comet.cast.allowIncompatible config and implements the mechanism to only allow incompatible casts when this is enabled.
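The gating described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not Comet's actual code: only the config key `spark.comet.cast.allowIncompatible` comes from this PR, while the function name `run_cast_in_comet`, the `is_compatible` flag, and the assumption that a missing key means "disabled" are hypothetical.

```python
# Config key taken from the PR description.
ALLOW_INCOMPATIBLE_KEY = "spark.comet.cast.allowIncompatible"


def run_cast_in_comet(is_compatible: bool, conf: dict) -> bool:
    """Return True to run the cast natively, False to fall back to Spark.

    Hypothetical helper: `is_compatible` stands in for whatever check
    decides that a cast matches Spark's behavior exactly.
    """
    if is_compatible:
        # Fully compatible casts can always run natively.
        return True
    # Incompatible casts run natively only when the user opts in.
    # Assumption: a missing key is treated as disabled.
    value = str(conf.get(ALLOW_INCOMPATIBLE_KEY, "false"))
    return value.strip().lower() == "true"


if __name__ == "__main__":
    conf = {ALLOW_INCOMPATIBLE_KEY: "true"}
    print(run_cast_in_comet(False, conf))
    print(run_cast_in_comet(False, {}))
```

With the flag left unset in this sketch, an incompatible cast falls back to Spark, which is the safe behavior the rationale asks for; opting in is an explicit choice by the user.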

What changes are included in this PR?

[Screenshot, 2024-05-02 at 1:48:42 PM: image not preserved]

How are these changes tested?

Existing tests in CometCastSuite

andygrove commented 2 weeks ago

@vaibhawvipul Please take a look since this makes some changes to your recent code

andygrove commented 2 weeks ago

I plan on waiting until https://github.com/apache/datafusion-comet/pull/346 and https://github.com/apache/datafusion-comet/pull/340 are merged before merging this one, to avoid causing them to rebase again

andygrove commented 2 weeks ago

It looks like #340 will take a little longer, so I will merge this now