Closed: munendrasn closed this 4 months ago
@lzlfred @harperjiang @LukasRupprecht Could you please take a look at this?
Hi @munendrasn, thanks for your question. In order to parse the stats for UniForm and Iceberg, we needed to make a change to Apache Spark (https://github.com/apache/spark/pull/42083). This was merged into master back in August, but we can only use it once it is actually released, in Apache Spark 3.6 or 4.0.
Closing the issue as the stats support has been added in this commit https://github.com/delta-io/delta/commit/3b3d729e931772339d58d200ef130d05cd39466d
Question
While trying out UniForm, I found that the manifest file created for Iceberg doesn't contain column stats such as lower_bound, upper_bound, and null_counts. This would impact query latency, since column stats are used for pruning. Is converting the stats from the Delta files and adding them to the Iceberg manifest file in the works?
Which Delta project/connector is this regarding?
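To illustrate why missing column stats matter for pruning, here is a minimal sketch in plain Python. It is not Delta or Iceberg code: the `FileStats` class, the `may_contain` and `files_to_scan` helpers, and the file names are all hypothetical, and the field names simply follow the ones used in the question. The idea is that a reader can skip a file only when its bounds prove the value cannot be there, so a file with no stats always has to be scanned.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FileStats:
    """Hypothetical per-file stats for a single integer column."""
    path: str
    lower_bound: Optional[int]  # None means the stat is missing
    upper_bound: Optional[int]  # None means the stat is missing


def may_contain(stats: FileStats, value: int) -> bool:
    """Return False only when the bounds prove the file cannot hold `value`."""
    if stats.lower_bound is None or stats.upper_bound is None:
        # Without stats nothing can be proven, so the file must be read.
        return True
    return stats.lower_bound <= value <= stats.upper_bound


def files_to_scan(files: List[FileStats], value: int) -> List[str]:
    """Keep only the files that an equality filter on `value` still has to read."""
    return [f.path for f in files if may_contain(f, value)]


files = [
    FileStats("a.parquet", 1, 100),
    FileStats("b.parquet", 101, 200),
    FileStats("c.parquet", None, None),  # stats missing, as in the question
]

# a.parquet is skipped; c.parquet cannot be skipped because it has no stats.
print(files_to_scan(files, 150))  # → ['b.parquet', 'c.parquet']
```

If every file looked like `c.parquet`, an equality filter would end up reading all of them, which is the latency concern described above.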
Environment information
Willingness to contribute
The Delta Lake Community encourages bug fix contributions. Would you or another member of your organization be willing to contribute a fix for this bug to the Delta Lake code base?