h2oai / sparkling-water

Sparkling Water provides H2O functionality inside a Spark cluster
https://docs.h2o.ai/sparkling-water/3.3/latest-stable/doc/index.html
Apache License 2.0

Save H2o Sparkling water models to Disk #2851

Closed by webcluster4u 1 year ago

webcluster4u commented 1 year ago

I have PySpark code that trains an H2O DRF model. I need to save this model to disk and then load it back.

from pysparkling.ml import H2ODRF
# predictors: list of feature column names; response: name of the label column
drf = H2ODRF(featuresCols=predictors,
             labelCol=response,
             columnsToCategorical=[response])

I cannot find any documentation on this, so I am asking here.

mn-mikke commented 1 year ago

Here is one example: https://docs.h2o.ai/sparkling-water/2.4/latest-stable/doc/deployment/pysparkling_pipeline.html

In general, SW implements the Spark pipeline API, so SW models have the same methods as a Model from Spark MLlib: https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.ml.Pipeline.html
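
To make the pipeline approach concrete, here is a minimal sketch that wraps the H2ODRF estimator from the question in a Spark Pipeline, saves the fitted pipeline with the standard Spark ML writer, and reads it back with PipelineModel.load. The function name train_save_load and its parameters (train_df, predictors, response, model_dir) are hypothetical names chosen for illustration. The sketch also assumes that an H2O context has to be started before training and that H2OContext.getOrCreate() takes no arguments, which may differ between Sparkling Water versions.

```python
def train_save_load(train_df, predictors, response, model_dir):
    """Fit an H2ODRF stage inside a Spark Pipeline, save it, and load it back.

    All parameter names are illustrative: train_df is a Spark DataFrame,
    predictors a list of feature column names, response the label column
    name, and model_dir a path the cluster can write to.
    """
    # Imports are local so the function can be defined without a running cluster.
    from pyspark.ml import Pipeline, PipelineModel
    from pysparkling import H2OContext
    from pysparkling.ml import H2ODRF

    # Assumption: an H2O context must exist before training; the exact
    # getOrCreate() signature depends on the Sparkling Water version.
    H2OContext.getOrCreate()

    drf = H2ODRF(featuresCols=predictors,
                 labelCol=response,
                 columnsToCategorical=[response])

    # Standard Spark ML persistence: fit, write to disk, then read back.
    fitted = Pipeline(stages=[drf]).fit(train_df)
    fitted.write().overwrite().save(model_dir)
    return PipelineModel.load(model_dir)
```

Calling train_save_load(df, predictors, response, "/tmp/drf_model") should return a PipelineModel whose transform(df) produces predictions, provided the path is reachable from the driver and executors. Saving the whole pipeline rather than the bare model keeps any preprocessing stages together with the model on disk.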