What is the problem the feature request solves?
https://github.com/apache/datafusion-comet/pull/987 introduces native `BloomFilterAgg` with support for `LongType`, as in Spark 3.4. Spark 3.5+ added support for other integer types, and strings.
Describe the potential solution
For the other integer types, this should be easy to handle by casting the value to a long and using the existing `put_long` bloom filter method to match Spark behavior. For strings, the underlying bloom filter implementation needs a `put_bytes` method to match Spark's bloom filter behavior.
Additional context
No response
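As a rough illustration of the potential solution described above, here is a minimal, self-contained Rust sketch. Everything in it is hypothetical: `ToyBloomFilter`, `Value`, and `put_value` are made-up names, not Comet's actual types, and the hashing uses the standard library's `DefaultHasher` as a stand-in, so it does not reproduce Spark's bit layout. It only shows the dispatch idea: narrower integers are widened to `i64` and routed through `put_long`, while strings go through a separate `put_bytes` on their UTF-8 bytes.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical input value; the variants are illustrative only.
pub enum Value<'a> {
    Byte(i8),
    Short(i16),
    Int(i32),
    Long(i64),
    Str(&'a str),
}

/// Toy Bloom filter. The hashing is a placeholder and is NOT
/// compatible with Spark's bloom filter implementation.
pub struct ToyBloomFilter {
    bits: Vec<bool>,
    num_hashes: u64,
}

impl ToyBloomFilter {
    pub fn new(num_bits: usize, num_hashes: u64) -> Self {
        assert!(num_bits > 0 && num_hashes > 0);
        Self { bits: vec![false; num_bits], num_hashes }
    }

    /// Bit positions for an item, one per hash function (seeded by index).
    fn positions<T: Hash + ?Sized>(&self, item: &T) -> Vec<usize> {
        (0..self.num_hashes)
            .map(|seed| {
                let mut hasher = DefaultHasher::new();
                seed.hash(&mut hasher);
                item.hash(&mut hasher);
                (hasher.finish() % self.bits.len() as u64) as usize
            })
            .collect()
    }

    /// Existing entry point for 64-bit integers.
    pub fn put_long(&mut self, value: i64) {
        for p in self.positions(&value) {
            self.bits[p] = true;
        }
    }

    /// Hypothetical new entry point for raw bytes (e.g. UTF-8 strings).
    pub fn put_bytes(&mut self, value: &[u8]) {
        for p in self.positions(value) {
            self.bits[p] = true;
        }
    }

    pub fn might_contain_long(&self, value: i64) -> bool {
        self.positions(&value).into_iter().all(|p| self.bits[p])
    }

    pub fn might_contain_bytes(&self, value: &[u8]) -> bool {
        self.positions(value).into_iter().all(|p| self.bits[p])
    }
}

/// Widen narrower integers to i64 and reuse put_long; send strings to put_bytes.
/// The sign-extending widening here is an assumption about the intended cast.
pub fn put_value(filter: &mut ToyBloomFilter, value: &Value) {
    match value {
        Value::Byte(v) => filter.put_long(i64::from(*v)),
        Value::Short(v) => filter.put_long(i64::from(*v)),
        Value::Int(v) => filter.put_long(i64::from(*v)),
        Value::Long(v) => filter.put_long(*v),
        Value::Str(s) => filter.put_bytes(s.as_bytes()),
    }
}
```

With this shape, inserting `Value::Int(42)` sets the same bits as calling `put_long(42)` directly, which is the property the cast-based approach relies on. A real implementation would also need the byte hashing to match Spark's so that filters built natively and filters built by Spark agree bit for bit.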