Closed (closed by andygrove 1 week ago)
We currently delegate to DataFusion when casting from floating point to integer types and there are some differences in behavior compared to Spark.
Here is an example test from CometCastSuite:
    test("cast float to int") {
      castTest(generateFloats, DataTypes.IntegerType)
    }

    private def generateFloats(): DataFrame = {
      val r = new Random(0)
      val values = Range(0, dataSize).map(_ => r.nextFloat()) ++ Seq(
        Float.MaxValue,
        Float.MinPositiveValue,
        Float.MinValue,
        Float.NaN,
        Float.PositiveInfinity,
        Float.NegativeInfinity,
        0.0f,
        -0.0f)
      values.toDF("a")
    }
Here are the differences between the Spark and Comet output:

    == Results ==
    !== Spark Answer - 1008 ==           == Comet Answer - 1008 ==
     struct<a:float,converted:int>       struct<a:float,converted:int>
    ![-3.4028235E38,-2147483648]         [-3.4028235E38,null]
    ![-Infinity,-2147483648]             [-Infinity,null]
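To make the mismatch concrete, here is a minimal sketch of the two behaviors visible in the results. It assumes that the Spark column reflects the JVM narrowing conversion (`Float.toInt`, which saturates at the `Int` bounds) and models the Comet column as a cast that returns null for values outside the `Int` range. The object `FloatToIntCastSketch` and the helpers `jvmStyleCast` and `nullOnOverflowCast` are hypothetical names for illustration only; they are not Spark, Comet, or DataFusion APIs.

```scala
object FloatToIntCastSketch {

  // Hypothetical helper: JVM narrowing conversion. Out-of-range values and
  // infinities saturate to Int.MinValue / Int.MaxValue, and NaN becomes 0.
  // Assumption: this is what the "Spark Answer" column reflects.
  def jvmStyleCast(f: Float): Int = f.toInt

  // Hypothetical helper: a cast that yields None (null) when the value cannot
  // be represented as an Int. Assumption: this models the "Comet Answer" column.
  def nullOnOverflowCast(f: Float): Option[Int] = {
    // 2147483648.0f is the smallest float above Int.MaxValue; -2147483648.0f
    // is exactly Int.MinValue and therefore still in range.
    val outOfRange = f < -2147483648.0f || f >= 2147483648.0f
    if (f.isNaN || f.isInfinity || outOfRange) None else Some(f.toInt)
  }

  def main(args: Array[String]): Unit = {
    val inputs = Seq(Float.MinValue, Float.NegativeInfinity, 1.9f)
    inputs.foreach { f =>
      println(s"$f -> jvm=${jvmStyleCast(f)}, nullOnOverflow=${nullOnOverflowCast(f)}")
    }
  }
}
```

Under these assumptions, the two rows in the diff are the inputs where saturation and null-on-overflow disagree, so a Spark-compatible cast in Comet would need to saturate rather than return null for them.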
Can I give this one a try?
Describe the potential solution
No response
Additional context
No response