When debugging Parquet files, I have found it useful to see the column statistics (if available), e.g. when trying to figure out whether row group filtering is happening in Spark.
It would be great if the command could print statistics (min, max, null count, and maybe column order) in addition to the metadata it already displays.