swoop-inc / spark-alchemy

Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
https://swoop-inc.github.io/spark-alchemy/
Apache License 2.0

Specifying custom precision #29

Open deepuak opened 1 year ago

deepuak commented 1 year ago

Hi,

I am not able to specify a custom precision with the code below; it errors out. Could someone please let me know the right way to pass a custom precision? Please note I am using Databricks as the Spark runtime.

import com.swoop.alchemy.spark.expressions.hll.functions._

val df1 = spark.table("hive_metastore.rwd_databricks.table_test")

df1
  .select("PATIENT_ID", "CLAIM_ID", "CODE")
  .withColumn("patient_id_hll", hll_init("PATIENT_ID", 0.02))
  .select(hll_merge("patient_id_hll", 0.02).as("patient_id_hll_m"))
  .write
  .mode("overwrite")
  .format("delta")
  .saveAsTable("patient_id_hll_merge")
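For comparison, the spark-alchemy README passes a custom precision through hll_init_agg, which initializes and merges the sketches in a single aggregate step, so the relativeSD is supplied exactly once. A minimal sketch of that pattern (the table and column names follow the snippet above; whether the separate hll_merge overload also accepts a relativeSD is an assumption on my part):

```scala
import org.apache.spark.sql.SparkSession
import com.swoop.alchemy.spark.expressions.hll.functions._

val spark = SparkSession.builder().getOrCreate()

// hll_init_agg("col", relativeSD) builds per-row sketches and merges them
// in one aggregation, avoiding a separate hll_init + hll_merge pair where
// the two precision arguments could disagree.
spark.table("hive_metastore.rwd_databricks.table_test")
  .select(hll_init_agg("PATIENT_ID", 0.02).as("patient_id_hll_m"))
  .write
  .mode("overwrite")
  .format("delta")
  .saveAsTable("patient_id_hll_merge")
```

Note that sketches created with different precisions generally cannot be merged, so whatever relativeSD is chosen has to be used consistently across every HLL call that touches the same column.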