Closed osopardo1 closed 3 months ago
Fixes #331
New Feature, no breaking change.
Easy API for computing the histogram for a column. Usage:
import io.qbeast.spark.utils.QbeastUtils

val brandStats = QbeastUtils.computeHistogramForColumn(df, "brand", 50)
val statsStr = s"""{"brand_histogram":$brandStats}"""

(df
  .write
  .mode("overwrite")
  .format("qbeast")
  .option("columnsToIndex", "brand:histogram")
  .option("columnStats", statsStr)
  .save(targetPath))
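To make the idea of a column histogram more concrete, here is a minimal plain-Scala sketch of one way bin boundaries for a string column could be derived. It is not the code behind `QbeastUtils.computeHistogramForColumn`. The object `HistogramSketch` and its methods `binStarts` and `toJsonArray` are hypothetical names. The sketch also assumes that the histogram is a sorted list of bin-start strings serialized as a JSON array, which this PR does not state.

```scala
// Hypothetical helper, NOT part of qbeast-spark: a plain-Scala sketch of
// equi-depth bin boundaries over the distinct values of a string column.
object HistogramSketch {

  /** Returns at most numBins sorted bin-start values taken from the distinct inputs. */
  def binStarts(values: Seq[String], numBins: Int): Seq[String] = {
    require(numBins > 0, "numBins must be positive")
    // Drop nulls, deduplicate, and sort lexicographically.
    val distinctSorted = values.filter(_ != null).distinct.sorted
    if (distinctSorted.isEmpty) Seq.empty
    else {
      // Ceiling division, so that no more than numBins groups are produced.
      val groupSize = math.max(1, (distinctSorted.size + numBins - 1) / numBins)
      // The first element of each group is the start of that bin.
      distinctSorted.grouped(groupSize).map(_.head).toVector
    }
  }

  /** Renders the bin starts as a JSON array of strings (assumed format). */
  def toJsonArray(starts: Seq[String]): String =
    starts
      .map(s => "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"")
      .mkString("[", ",", "]")
}
```

For example, `HistogramSketch.toJsonArray(HistogramSketch.binStarts(Seq("nike", "adidas", "puma", "adidas"), 2))` yields `["adidas","puma"]`. A Spark-based version would need to compute the distinct values and boundaries in a distributed way rather than on a local `Seq`.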
Tests can be found in io.qbeast.spark.utils.QbeastUtilsTest.
Test Configuration: