salesforce / TransmogrifAI

TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
https://transmogrif.ai
BSD 3-Clause "New" or "Revised" License

version: 0.6.1 Error in Saving Fitted Workflows #446

Closed bderoy closed 4 years ago

bderoy commented 4 years ago

Describe the bug: Following the documentation at https://docs.transmogrif.ai/en/stable/developer-guide/#saving-fitted-workflows, I get a runtime exception when I try to save the model:

  at com.salesforce.op.stages.OpPipelineStageWriter.writeToJson(OpPipelineStageWriter.scala:81)
  at com.salesforce.op.OpWorkflowModelWriter$$anonfun$3.apply(OpWorkflowModelWriter.scala:131)
  at com.salesforce.op.OpWorkflowModelWriter$$anonfun$3.apply(OpWorkflowModelWriter.scala:131)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
  at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
  at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
  at com.salesforce.op.OpWorkflowModelWriter.stagesJArray(OpWorkflowModelWriter.scala:131)
  at com.salesforce.op.OpWorkflowModelWriter.stagesJArray(OpWorkflowModelWriter.scala:108)
  at com.salesforce.op.OpWorkflowModelWriter.toJson(OpWorkflowModelWriter.scala:83)
  at com.salesforce.op.OpWorkflowModelWriter.toJsonString(OpWorkflowModelWriter.scala:68)
  at com.salesforce.op.OpWorkflowModelWriter.saveImpl(OpWorkflowModelWriter.scala:58)
  at org.apache.spark.ml.util.MLWriter.save(ReadWrite.scala:103)
  at com.salesforce.op.OpWorkflowModelWriter$.save(OpWorkflowModelWriter.scala:193)
  at com.salesforce.op.OpWorkflowModel.save(OpWorkflowModel.scala:221)
  ... 60 elided
Caused by: java.lang.RuntimeException: Argument 'extractFn' [$anonfun$10] cannot be serialized. Make sure $anonfun$10 has either no-args ctor or is an object, and does not have any external dependencies, e.g. use any out of scope variables.
  at com.salesforce.op.stages.OpPipelineStageSerializationFuns$class.serializeArgument(OpPipelineStageReaderWriter.scala:234)
  at com.salesforce.op.stages.DefaultValueReaderWriter.serializeArgument(DefaultValueReaderWriter.scala:48)
  at com.salesforce.op.stages.DefaultValueReaderWriter$$anonfun$write$1.apply(DefaultValueReaderWriter.scala:70)
  at com.salesforce.op.stages.DefaultValueReaderWriter$$anonfun$write$1.apply(DefaultValueReaderWriter.scala:69)
  at scala.util.Try$.apply(Try.scala:192)
  at com.salesforce.op.stages.DefaultValueReaderWriter.write(DefaultValueReaderWriter.scala:69)
  at com.salesforce.op.stages.FeatureGeneratorStageReaderWriter.write(FeatureGeneratorStage.scala:189)
  at com.salesforce.op.stages.FeatureGeneratorStageReaderWriter.write(FeatureGeneratorStage.scala:129)
  at com.salesforce.op.stages.OpPipelineStageWriter.writeToJson(OpPipelineStageWriter.scala:80)
  ... 76 more
Caused by: java.lang.RuntimeException: Failed to create an instance of class '$anonfun$10'. Class has to either have a no-args ctor or be an object.
  at com.salesforce.op.utils.reflection.ReflectionUtils$.newInstance(ReflectionUtils.scala:106)
  at com.salesforce.op.utils.reflection.ReflectionUtils$.newInstance(ReflectionUtils.scala:87)
  at com.salesforce.op.stages.OpPipelineStageSerializationFuns$class.serializeArgument(OpPipelineStageReaderWriter.scala:231)
  ... 84 more
Caused by: java.lang.NoSuchFieldException: MODULE$
  at java.lang.Class.getField(Class.java:1703)
  at com.salesforce.op.utils.reflection.ReflectionUtils$.newInstance(ReflectionUtils.scala:102)
  ... 86 more

To Reproduce: run the lines of code below.

val workflow = new OpWorkflow().setResultFeatures(prediction, labels).setReader(trainDataReader)
val fittedWorkflow = workflow.train()
println("Summary:\n" + fittedWorkflow.summaryPretty())
fittedWorkflow.save(path = "/my/model/path", overwrite = true)

Expected behavior: The model should be saved to the desired location without error.

tovbinm commented 4 years ago

Can you please share the section of the code where you define FeatureBuilder instances?

bderoy commented 4 years ago

@tovbinm Thank you for responding. Please refer to the code snippet below.

val stage = FeatureBuilder.Text[opportunity].extract(_.stage.toText).asResponse
val entity = FeatureBuilder.PickList[opportunity].extract(_.entity.map(_.toString).toPickList).asPredictor
val company = FeatureBuilder.Text[opportunity].extract(_.company.toText).asPredictor
val industry = FeatureBuilder.PickList[opportunity].extract(_.industry.map(_.toString).toPickList).asPredictor
val region = FeatureBuilder.PickList[opportunity].extract(_.region.map(_.toString).toPickList).asPredictor
val msp_mme = FeatureBuilder.Text[opportunity].extract(_.msp_mme.toText).asPredictor
val account_type = FeatureBuilder.Text[opportunity].extract(_.account_type.toText).asPredictor
val billing_state = FeatureBuilder.PickList[opportunity].extract(_.billing_state.map(_.toString).toPickList).asPredictor
val billing_country = FeatureBuilder.PickList[opportunity].extract(_.billing_country.map(_.toString).toPickList).asPredictor
val account_customer_status = FeatureBuilder.PickList[opportunity].extract(_.account_customer_status.map(_.toString).toPickList).asPredictor
val opportunity_type = FeatureBuilder.PickList[opportunity].extract(_.opportunity_type.map(_.toString).toPickList).asPredictor
val forecast_category = FeatureBuilder.PickList[opportunity].extract(_.forecast_category.map(_.toString).toPickList).asPredictor
val currency_code = FeatureBuilder.PickList[opportunity].extract(_.currency_code.map(_.toString).toPickList).asPredictor
val close_qtr = FeatureBuilder.Text[opportunity].extract(_.close_qtr.toText).asPredictor
val created_qtr = FeatureBuilder.Text[opportunity].extract(_.created_qtr.toText).asPredictor
val type1 = FeatureBuilder.Text[opportunity].extract(_.type1.toText).asPredictor
val owner_role = FeatureBuilder.Text[opportunity].extract(_.owner_role.toText).asPredictor

tovbinm commented 4 years ago

Aha, thank you. Now it's clear. The anonymous functions passed to extract cannot be serialized when the workflow is saved, so you need to create a concrete class for each of the extract functions. For example:

object Extractors {
   class StageExtractor extends Function[opportunity, Text] with Serializable {
      def apply(o: opportunity): Text = o.stage.toText
   }
   // define classes for all the other feature extractors 
}

import Extractors._
val stage = FeatureBuilder.Text[opportunity].extract(new StageExtractor).asResponse
// and the same for all other feature builders

Here is a full working example for Titanic features - https://github.com/salesforce/TransmogrifAI/blob/master/helloworld/src/main/scala/com/salesforce/hw/titanic/TitanicFeatures.scala#L39
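
For anyone wondering why a named class helps here: the messages in the trace above ("Class has to either have a no-args ctor or be an object", followed by `NoSuchFieldException: MODULE$`) suggest that each extract function is re-created reflectively from its class. The following is only a rough, library-free sketch of that idea, not TransmogrifAI's actual implementation. `UpperCaseExtractor`, `TrimExtractor`, and `newInstanceByName` are hypothetical names made up for illustration, and the file is assumed to be compiled as a regular source file rather than pasted into a REPL.

```scala
import scala.util.Try

// Hypothetical stand-in for an extract function written as a concrete class.
// It has a public no-args constructor, so it can be rebuilt from its name alone.
class UpperCaseExtractor extends Function1[String, String] with Serializable {
  def apply(s: String): String = s.toUpperCase
}

// Hypothetical stand-in for an extract function written as a singleton object.
// The compiler exposes the single instance through a static MODULE$ field.
object TrimExtractor extends Function1[String, String] with Serializable {
  def apply(s: String): String = s.trim
}

object ReflectionSketch {
  // Illustrative helper (an assumption, not the library's ReflectionUtils):
  // try a public no-args constructor first, then fall back to MODULE$.
  def newInstanceByName(className: String): Any = {
    val cls = Class.forName(className)
    try cls.getConstructor().newInstance()
    catch {
      case _: NoSuchMethodException => cls.getField("MODULE$").get(null)
    }
  }

  def main(args: Array[String]): Unit = {
    // A concrete class is rebuilt through its no-args constructor.
    val fromClass = newInstanceByName(classOf[UpperCaseExtractor].getName)
      .asInstanceOf[String => String]
    println(fromClass("won"))

    // An object is recovered through its MODULE$ field.
    val fromObject = newInstanceByName(TrimExtractor.getClass.getName)
      .asInstanceOf[String => String]
    println(fromObject("  lost  "))

    // A lambda that captures a local variable offers neither route,
    // so the same lookup is expected to fail for its runtime class.
    val suffix = args.headOption.getOrElse("!")
    val lambda: String => String = s => s + suffix
    println(Try(newInstanceByName(lambda.getClass.getName)).isFailure)
  }
}
```

The takeaway is the same as in the example above: an extractor that is a named class (or an object) can be found and re-instantiated later, while a function literal defined inline generally cannot.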