microsoft / SynapseML

Simple and Distributed Machine Learning
http://aka.ms/spark
MIT License

How to export PMML for FindBestModel? #741

Open rudyMcgee opened 4 years ago

rudyMcgee commented 4 years ago
// The label column is taken to be the last column of the dataset.
val data: Dataset[Row] = .....
val lr = TrainRegressionUtils.createLR.setLabelCol(data.schema.last.name)
val dt = TrainRegressionUtils.createDT.setLabelCol(data.schema.last.name)
val rf = TrainRegressionUtils.createRF.setLabelCol(data.schema.last.name)
val gbt = TrainRegressionUtils.createGBT.setLabelCol(data.schema.last.name)

val model_lr = lr.fit(data)
val model_dt = dt.fit(data)
val model_gbt = gbt.fit(data)
val model_rf = rf.fit(data)

val findBestModel = new FindBestModel()
  .setModels(Array(model_lr, model_dt, model_gbt, model_rf))
  .setEvaluationMetric(MetricConstants.RmseSparkMetric)

val bestModel = findBestModel.fit(data)

How can I export the best model to PMML?

imatiach-msft commented 4 years ago

Hi @rudyMcgee, can the underlying models (the ones created by TrainRegressionUtils.createLR/DT/RF/GBT) be exported to PMML on their own? If so, you can retrieve the winning model from the fitted FindBestModel with bestModel.getBestModel (see https://github.com/Azure/mmlspark/blob/master/src/main/scala/com/microsoft/ml/spark/automl/FindBestModel.scala#L172) and export that model directly.
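
A minimal sketch of that suggestion, under stated assumptions: it relies on Spark ML's built-in PMML writer, which Spark 2.4+ provides for LinearRegressionModel via write.format("pmml"). The object name PmmlExportSketch, its two helpers, and the output path are hypothetical, not part of the library. The sketch only looks inside plain PipelineModel stages; if getBestModel returns a library wrapper around the fitted model, unwrapping it is not covered here. Tree-based models (DT/RF/GBT) have no built-in PMML writer in Spark ML as far as I know, so they would likely need a third-party converter.

```scala
import org.apache.spark.ml.{PipelineModel, Transformer}
import org.apache.spark.ml.regression.LinearRegressionModel

// Hypothetical helper object, for illustration only.
object PmmlExportSketch {

  // Search a transformer (and, recursively, the stages of any PipelineModel)
  // for the first LinearRegressionModel.
  def findLinearRegression(t: Transformer): Option[LinearRegressionModel] = t match {
    case lr: LinearRegressionModel => Some(lr)
    case pm: PipelineModel =>
      pm.stages.iterator.map(findLinearRegression).collectFirst { case Some(m) => m }
    case _ => None
  }

  // Write the model as PMML if a LinearRegressionModel was found.
  // Returns true when something was written, false otherwise.
  def exportPmml(best: Transformer, path: String): Boolean =
    findLinearRegression(best) match {
      case Some(lr) =>
        lr.write.format("pmml").save(path) // Spark writes into a directory at `path`
        true
      case None =>
        false
    }
}
```

With the code from the question, usage might look like PmmlExportSketch.exportPmml(bestModel.getBestModel, "/tmp/best-model-pmml"), where the path is only an example. A false return value means no exportable linear regression model was found, for instance because a tree-based model won the comparison.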