salesforce / TransmogrifAI

TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
https://transmogrif.ai
BSD 3-Clause "New" or "Revised" License

All models failed model selector or failed to finsih within 1 day!!! #450

Closed shenzgang closed 4 years ago

shenzgang commented 4 years ago

I get the following error when I train a binary classification model. What causes it? Full error details:

2019-12-18 15:04:35 [INFO] Unregistering ApplicationMaster with FAILED (diag message: User class threw exception: java.lang.RuntimeException: All models failed model selector or failed to finsih within 1 day!!! Models tried were: 
OpLogisticRegression -> {
    OpLogisticRegression_000000000015-elasticNetParam: 0.1,
    OpLogisticRegression_000000000015-fitIntercept: true,
    OpLogisticRegression_000000000015-maxIter: 50,
    OpLogisticRegression_000000000015-regParam: 0.001,
    OpLogisticRegression_000000000015-standardization: true,
    OpLogisticRegression_000000000015-tol: 1.0E-6
}, {
    OpLogisticRegression_000000000015-elasticNetParam: 0.1,
    OpLogisticRegression_000000000015-fitIntercept: true,
    OpLogisticRegression_000000000015-maxIter: 50,
    OpLogisticRegression_000000000015-regParam: 0.01,
    OpLogisticRegression_000000000015-standardization: true,
    OpLogisticRegression_000000000015-tol: 1.0E-6
}, {
    OpLogisticRegression_000000000015-elasticNetParam: 0.1,
    OpLogisticRegression_000000000015-fitIntercept: true,
    OpLogisticRegression_000000000015-maxIter: 50,
    OpLogisticRegression_000000000015-regParam: 0.1,
    OpLogisticRegression_000000000015-standardization: true,
    OpLogisticRegression_000000000015-tol: 1.0E-6
}, {
    OpLogisticRegression_000000000015-elasticNetParam: 0.1,
    OpLogisticRegression_000000000015-fitIntercept: true,
    OpLogisticRegression_000000000015-maxIter: 50,
    OpLogisticRegression_000000000015-regParam: 0.2,
    OpLogisticRegression_000000000015-standardization: true,
    OpLogisticRegression_000000000015-tol: 1.0E-6
}, {
    OpLogisticRegression_000000000015-elasticNetParam: 0.5,
    OpLogisticRegression_000000000015-fitIntercept: true,
    OpLogisticRegression_000000000015-maxIter: 50,
    OpLogisticRegression_000000000015-regParam: 0.001,
    OpLogisticRegression_000000000015-standardization: true,
    OpLogisticRegression_000000000015-tol: 1.0E-6
}, {
    OpLogisticRegression_000000000015-elasticNetParam: 0.5,
    OpLogisticRegression_000000000015-fitIntercept: true,
    OpLogisticRegression_000000000015-maxIter: 50,
    OpLogisticRegression_000000000015-regParam: 0.01,
    OpLogisticRegression_000000000015-standardization: true,
    OpLogisticRegression_000000000015-tol: 1.0E-6
}, {
    OpLogisticRegression_000000000015-elasticNetParam: 0.5,
    OpLogisticRegression_000000000015-fitIntercept: true,
    OpLogisticRegression_000000000015-maxIter: 50,
    OpLogisticRegression_000000000015-regParam: 0.1,
    OpLogisticRegression_000000000015-standardization: true,
    OpLogisticRegression_000000000015-tol: 1.0E-6
}, {
    OpLogisticRegression_000000000015-elasticNetParam: 0.5,
    OpLogisticRegression_000000000015-fitIntercept: true,
    OpLogisticRegression_000000000015-maxIter: 50,
    OpLogisticRegression_000000000015-regParam: 0.2,
    OpLogisticRegression_000000000015-standardization: true,
    OpLogisticRegression_000000000015-tol: 1.0E-6
}
OpRandomForestClassifier -> {
    OpRandomForestClassifier_000000000016-impurity: gini,
    OpRandomForestClassifier_000000000016-maxBins: 32,
    OpRandomForestClassifier_000000000016-maxDepth: 3,
    OpRandomForestClassifier_000000000016-minInfoGain: 0.001,
    OpRandomForestClassifier_000000000016-minInstancesPerNode: 10,
    OpRandomForestClassifier_000000000016-numTrees: 50,
    OpRandomForestClassifier_000000000016-subsamplingRate: 1.0
}, {
    OpRandomForestClassifier_000000000016-impurity: gini,
    OpRandomForestClassifier_000000000016-maxBins: 32,
    OpRandomForestClassifier_000000000016-maxDepth: 3,
    OpRandomForestClassifier_000000000016-minInfoGain: 0.01,
    OpRandomForestClassifier_000000000016-minInstancesPerNode: 10,
    OpRandomForestClassifier_000000000016-numTrees: 50,
    OpRandomForestClassifier_000000000016-subsamplingRate: 1.0
}, {
    OpRandomForestClassifier_000000000016-impurity: gini,
    OpRandomForestClassifier_000000000016-maxBins: 32,
    OpRandomForestClassifier_000000000016-maxDepth: 3,
    OpRandomForestClassifier_000000000016-minInfoGain: 0.1,
    OpRandomForestClassifier_000000000016-minInstancesPerNode: 10,
    OpRandomForestClassifier_000000000016-numTrees: 50,
    OpRandomForestClassifier_000000000016-subsamplingRate: 1.0
}, {
    OpRandomForestClassifier_000000000016-impurity: gini,
    OpRandomForestClassifier_000000000016-maxBins: 32,
    OpRandomForestClassifier_000000000016-maxDepth: 6,
    OpRandomForestClassifier_000000000016-minInfoGain: 0.001,
    OpRandomForestClassifier_000000000016-minInstancesPerNode: 10,
    OpRandomForestClassifier_000000000016-numTrees: 50,
    OpRandomForestClassifier_000000000016-subsamplingRate: 1.0
}, {
    OpRandomForestClassifier_000000000016-impurity: gini,
    OpRandomForestClassifier_000000000016-maxBins: 32,
    OpRandomForestClassifier_000000000016-maxDepth: 6,
    OpRandomForestClassifier_000000000016-minInfoGain: 0.01,
    OpRandomForestClassifier_000000000016-minInstancesPerNode: 10,
    OpRandomForestClassifier_000000000016-numTrees: 50,
    OpRandomForestClassifier_000000000016-subsamplingRate: 1.0
}, {
    OpRandomForestClassifier_000000000016-impurity: gini,
    OpRandomForestClassifier_000000000016-maxBins: 32,
    OpRandomForestClassifier_000000000016-maxDepth: 6,
    OpRandomForestClassifier_000000000016-minInfoGain: 0.1,
    OpRandomForestClassifier_000000000016-minInstancesPerNode: 10,
    OpRandomForestClassifier_000000000016-numTrees: 50,
    OpRandomForestClassifier_000000000016-subsamplingRate: 1.0
}, {
    OpRandomForestClassifier_000000000016-impurity: gini,
    OpRandomForestClassifier_000000000016-maxBins: 32,
    OpRandomForestClassifier_000000000016-maxDepth: 12,
    OpRandomForestClassifier_000000000016-minInfoGain: 0.001,
    OpRandomForestClassifier_000000000016-minInstancesPerNode: 10,
    OpRandomForestClassifier_000000000016-numTrees: 50,
    OpRandomForestClassifier_000000000016-subsamplingRate: 1.0
}, {
    OpRandomForestClassifier_000000000016-impurity: gini,
    OpRandomForestClassifier_000000000016-maxBins: 32,
    OpRandomForestClassifier_000000000016-maxDepth: 12,
    OpRandomForestClassifier_000000000016-minInfoGain: 0.01,
    OpRandomForestClassifier_000000000016-minInstancesPerNode: 10,
    OpRandomForestClassifier_000000000016-numTrees: 50,
    OpRandomForestClassifier_000000000016-subsamplingRate: 1.0
}, {
    OpRandomForestClassifier_000000000016-impurity: gini,
    OpRandomForestClassifier_000000000016-maxBins: 32,
    OpRandomForestClassifier_000000000016-maxDepth: 12,
    OpRandomForestClassifier_000000000016-minInfoGain: 0.1,
    OpRandomForestClassifier_000000000016-minInstancesPerNode: 10,
    OpRandomForestClassifier_000000000016-numTrees: 50,
    OpRandomForestClassifier_000000000016-subsamplingRate: 1.0
}, {
    OpRandomForestClassifier_000000000016-impurity: gini,
    OpRandomForestClassifier_000000000016-maxBins: 32,
    OpRandomForestClassifier_000000000016-maxDepth: 3,
    OpRandomForestClassifier_000000000016-minInfoGain: 0.001,
    OpRandomForestClassifier_000000000016-minInstancesPerNode: 100,
    OpRandomForestClassifier_000000000016-numTrees: 50,
    OpRandomForestClassifier_000000000016-subsamplingRate: 1.0
}, {
    OpRandomForestClassifier_000000000016-impurity: gini,
    OpRandomForestClassifier_000000000016-maxBins: 32,
    OpRandomForestClassifier_000000000016-maxDepth: 3,
    OpRandomForestClassifier_000000000016-minInfoGain: 0.01,
    OpRandomForestClassifier_000000000016-minInstancesPerNode: 100,
    OpRandomForestClassifier_000000000016-numTrees: 50,
    OpRandomForestClassifier_000000000016-subsamplingRate: 1.0
}, {
    OpRandomForestClassifier_000000000016-impurity: gini,
    OpRandomForestClassifier_000000000016-maxBins: 32,
    OpRandomForestClassifier_000000000016-maxDepth: 3,
    OpRandomForestClassifier_000000000016-minInfoGain: 0.1,
    OpRandomForestClassifier_000000000016-minInstancesPerNode: 100,
    OpRandomForestClassifier_000000000016-numTrees: 50,
    OpRandomForestClassifier_000000000016-subsamplingRate: 1.0
}, {
    OpRandomForestClassifier_000000000016-impurity: gini,
    OpRandomForestClassifier_000000000016-maxBins: 32,
    OpRandomForestClassifier_000000000016-maxDepth: 6,
    OpRandomForestClassifier_000000000016-minInfoGain: 0.001,
    OpRandomForestClassifier_000000000016-minInstancesPerNode: 100,
    OpRandomForestClassifier_000000000016-numTrees: 50,
    OpRandomForestClassifier_000000000016-subsamplingRate: 1.0
}, {
    OpRandomForestClassifier_000000000016-impurity: gini,
    OpRandomForestClassifier_000000000016-maxBins: 32,
    OpRandomForestClassifier_000000000016-maxDepth: 6,
    OpRandomForestClassifier_000000000016-minInfoGain: 0.01,
    OpRandomForestClassifier_000000000016-minInstancesPerNode: 100,
    OpRandomForestClassifier_000000000016-numTrees: 50,
    OpRandomForestClassifier_000000000016-subsamplingRate: 1.0
}, {
    OpRandomForestClassifier_000000000016-impurity: gini,
    OpRandomForestClassifier_000000000016-maxBins: 32,
    OpRandomForestClassifier_000000000016-maxDepth: 6,
    OpRandomForestClassifier_000000000016-minInfoGain: 0.1,
    OpRandomForestClassifier_000000000016-minInstancesPerNode: 100,
    OpRandomForestClassifier_000000000016-numTrees: 50,
    OpRandomForestClassifier_000000000016-subsamplingRate: 1.0
}, {
    OpRandomForestClassifier_000000000016-impurity: gini,
    OpRandomForestClassifier_000000000016-maxBins: 32,
    OpRandomForestClassifier_000000000016-maxDepth: 12,
    OpRandomForestClassifier_000000000016-minInfoGain: 0.001,
    OpRandomForestClassifier_000000000016-minInstancesPerNode: 100,
    OpRandomForestClassifier_000000000016-numTrees: 50,
    OpRandomForestClassifier_000000000016-subsamplingRate: 1.0
}, {
    OpRandomForestClassifier_000000000016-impurity: gini,
    OpRandomForestClassifier_000000000016-maxBins: 32,
    OpRandomForestClassifier_000000000016-maxDepth: 12,
    OpRandomForestClassifier_000000000016-minInfoGain: 0.01,
    OpRandomForestClassifier_000000000016-minInstancesPerNode: 100,
    OpRandomForestClassifier_000000000016-numTrees: 50,
    OpRandomForestClassifier_000000000016-subsamplingRate: 1.0
}, {
    OpRandomForestClassifier_000000000016-impurity: gini,
    OpRandomForestClassifier_000000000016-maxBins: 32,
    OpRandomForestClassifier_000000000016-maxDepth: 12,
    OpRandomForestClassifier_000000000016-minInfoGain: 0.1,
    OpRandomForestClassifier_000000000016-minInstancesPerNode: 100,
    OpRandomForestClassifier_000000000016-numTrees: 50,
    OpRandomForestClassifier_000000000016-subsamplingRate: 1.0
}
OpNaiveBayes -> {
    OpNaiveBayes_000000000017-smoothing: 1.0
}
OpDecisionTreeClassifier -> {
    OpDecisionTreeClassifier_000000000018-impurity: gini,
    OpDecisionTreeClassifier_000000000018-maxBins: 32,
    OpDecisionTreeClassifier_000000000018-maxDepth: 3,
    OpDecisionTreeClassifier_000000000018-minInfoGain: 0.001,
    OpDecisionTreeClassifier_000000000018-minInstancesPerNode: 10
}, {
    OpDecisionTreeClassifier_000000000018-impurity: gini,
    OpDecisionTreeClassifier_000000000018-maxBins: 32,
    OpDecisionTreeClassifier_000000000018-maxDepth: 3,
    OpDecisionTreeClassifier_000000000018-minInfoGain: 0.01,
    OpDecisionTreeClassifier_000000000018-minInstancesPerNode: 10
}, {
    OpDecisionTreeClassifier_000000000018-impurity: gini,
    OpDecisionTreeClassifier_000000000018-maxBins: 32,
    OpDecisionTreeClassifier_000000000018-maxDepth: 3,
    OpDecisionTreeClassifier_000000000018-minInfoGain: 0.1,
    OpDecisionTreeClassifier_000000000018-minInstancesPerNode: 10
}, {
    OpDecisionTreeClassifier_000000000018-impurity: gini,
    OpDecisionTreeClassifier_000000000018-maxBins: 32,
    OpDecisionTreeClassifier_000000000018-maxDepth: 3,
    OpDecisionTreeClassifier_000000000018-minInfoGain: 0.001,
    OpDecisionTreeClassifier_000000000018-minInstancesPerNode: 100
}, {
    OpDecisionTreeClassifier_000000000018-impurity: gini,
    OpDecisionTreeClassifier_000000000018-maxBins: 32,
    OpDecisionTreeClassifier_000000000018-maxDepth: 3,
    OpDecisionTreeClassifier_000000000018-minInfoGain: 0.01,
    OpDecisionTreeClassifier_000000000018-minInstancesPerNode: 100
}, {
    OpDecisionTreeClassifier_000000000018-impurity: gini,
    OpDecisionTreeClassifier_000000000018-maxBins: 32,
    OpDecisionTreeClassifier_000000000018-maxDepth: 3,
    OpDecisionTreeClassifier_000000000018-minInfoGain: 0.1,
    OpDecisionTreeClassifier_000000000018-minInstancesPerNode: 100
}, {
    OpDecisionTreeClassifier_000000000018-impurity: gini,
    OpDecisionTreeClassifier_000000000018-maxBins: 32,
    OpDecisionTreeClassifier_000000000018-maxDepth: 6,
    OpDecisionTreeClassifier_000000000018-minInfoGain: 0.001,
    OpDecisionTreeClassifier_000000000018-minInstancesPerNode: 10
}, {
    OpDecisionTreeClassifier_000000000018-impurity: gini,
    OpDecisionTreeClassifier_000000000018-maxBins: 32,
    OpDecisionTreeClassifier_000000000018-maxDepth: 6,
    OpDecisionTreeClassifier_000000000018-minInfoGain: 0.01,
    OpDecisionTreeClassifier_000000000018-minInstancesPerNode: 10
}, {
    OpDecisionTreeClassifier_000000000018-impurity: gini,
    OpDecisionTreeClassifier_000000000018-maxBins: 32,
    OpDecisionTreeClassifier_000000000018-maxDepth: 6,
    OpDecisionTreeClassifier_000000000018-minInfoGain: 0.1,
    OpDecisionTreeClassifier_000000000018-minInstancesPerNode: 10
}, {
    OpDecisionTreeClassifier_000000000018-impurity: gini,
    OpDecisionTreeClassifier_000000000018-maxBins: 32,
    OpDecisionTreeClassifier_000000000018-maxDepth: 6,
    OpDecisionTreeClassifier_000000000018-minInfoGain: 0.001,
    OpDecisionTreeClassifier_000000000018-minInstancesPerNode: 100
}, {
    OpDecisionTreeClassifier_000000000018-impurity: gini,
    OpDecisionTreeClassifier_000000000018-maxBins: 32,
    OpDecisionTreeClassifier_000000000018-maxDepth: 6,
    OpDecisionTreeClassifier_000000000018-minInfoGain: 0.01,
    OpDecisionTreeClassifier_000000000018-minInstancesPerNode: 100
}, {
    OpDecisionTreeClassifier_000000000018-impurity: gini,
    OpDecisionTreeClassifier_000000000018-maxBins: 32,
    OpDecisionTreeClassifier_000000000018-maxDepth: 6,
    OpDecisionTreeClassifier_000000000018-minInfoGain: 0.1,
    OpDecisionTreeClassifier_000000000018-minInstancesPerNode: 100
}, {
    OpDecisionTreeClassifier_000000000018-impurity: gini,
    OpDecisionTreeClassifier_000000000018-maxBins: 32,
    OpDecisionTreeClassifier_000000000018-maxDepth: 12,
    OpDecisionTreeClassifier_000000000018-minInfoGain: 0.001,
    OpDecisionTreeClassifier_000000000018-minInstancesPerNode: 10
}, {
    OpDecisionTreeClassifier_000000000018-impurity: gini,
    OpDecisionTreeClassifier_000000000018-maxBins: 32,
    OpDecisionTreeClassifier_000000000018-maxDepth: 12,
    OpDecisionTreeClassifier_000000000018-minInfoGain: 0.01,
    OpDecisionTreeClassifier_000000000018-minInstancesPerNode: 10
}, {
    OpDecisionTreeClassifier_000000000018-impurity: gini,
    OpDecisionTreeClassifier_000000000018-maxBins: 32,
    OpDecisionTreeClassifier_000000000018-maxDepth: 12,
    OpDecisionTreeClassifier_000000000018-minInfoGain: 0.1,
    OpDecisionTreeClassifier_000000000018-minInstancesPerNode: 10
}, {
    OpDecisionTreeClassifier_000000000018-impurity: gini,
    OpDecisionTreeClassifier_000000000018-maxBins: 32,
    OpDecisionTreeClassifier_000000000018-maxDepth: 12,
    OpDecisionTreeClassifier_000000000018-minInfoGain: 0.001,
    OpDecisionTreeClassifier_000000000018-minInstancesPerNode: 100
}, {
    OpDecisionTreeClassifier_000000000018-impurity: gini,
    OpDecisionTreeClassifier_000000000018-maxBins: 32,
    OpDecisionTreeClassifier_000000000018-maxDepth: 12,
    OpDecisionTreeClassifier_000000000018-minInfoGain: 0.01,
    OpDecisionTreeClassifier_000000000018-minInstancesPerNode: 100
}, {
    OpDecisionTreeClassifier_000000000018-impurity: gini,
    OpDecisionTreeClassifier_000000000018-maxBins: 32,
    OpDecisionTreeClassifier_000000000018-maxDepth: 12,
    OpDecisionTreeClassifier_000000000018-minInfoGain: 0.1,
    OpDecisionTreeClassifier_000000000018-minInstancesPerNode: 100
}
    at com.salesforce.op.stages.impl.tuning.OpValidator$class.getSummary(OpValidator.scala:351)
    at com.salesforce.op.stages.impl.tuning.OpTrainValidationSplit.getSummary(OpTrainValidationSplit.scala:35)
    at com.salesforce.op.stages.impl.tuning.OpTrainValidationSplit.validate(OpTrainValidationSplit.scala:85)
    at com.salesforce.op.stages.impl.selector.ModelSelector.findBestEstimator(ModelSelector.scala:121)
    at com.salesforce.op.stages.impl.selector.ModelSelector.fit(ModelSelector.scala:153)
    at com.salesforce.op.stages.impl.selector.ModelSelector.fit(ModelSelector.scala:71)
    at com.salesforce.op.utils.stages.FitStagesUtil$$anonfun$20.apply(FitStagesUtil.scala:265)
    at com.salesforce.op.utils.stages.FitStagesUtil$$anonfun$20.apply(FitStagesUtil.scala:264)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
    at com.salesforce.op.utils.stages.FitStagesUtil$.com$salesforce$op$utils$stages$FitStagesUtil$$fitAndTransformLayer(FitStagesUtil.scala:264)
    at com.salesforce.op.utils.stages.FitStagesUtil$$anonfun$17.apply(FitStagesUtil.scala:227)
    at com.salesforce.op.utils.stages.FitStagesUtil$$anonfun$17.apply(FitStagesUtil.scala:225)
    at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:57)
    at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:66)
    at scala.collection.mutable.ArrayOps$ofRef.foldLeft(ArrayOps.scala:186)
    at com.salesforce.op.utils.stages.FitStagesUtil$.fitAndTransformDAG(FitStagesUtil.scala:225)
    at com.salesforce.op.OpWorkflow.fitStages(OpWorkflow.scala:389)
    at com.salesforce.op.OpWorkflow.train(OpWorkflow.scala:340)
    at workflow.models.training.MultiClassificationModel.train(MultiClassificationModel.scala:30)
    at workflow.utils.ModelTrainUtil$.train(ModelTrainUtil.scala:39)
    at workflow.ModelApp$.main(ModelApp.scala:66)
    at workflow.ModelApp.main(ModelApp.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$4.run(ApplicationMaster.scala:721)
)org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)

tovbinm commented 4 years ago

Depending on the dataset size, model training can take a while. If you expect training to take that long, you can extend the time limit by increasing the maxWait parameter value.

Otherwise, I would advise investigating further. This error can indicate a problem with the training dataset, e.g. it is highly skewed and/or under-partitioned. Another possible cause is too little memory allocated to the executors, which sometimes also results in additional errors in the log.
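A quick way to check the two dataset concerns above (label skew and partitioning) is to inspect the training DataFrame before fitting. This is only a sketch: the object name, the `report` helper and the `labelCol` argument are placeholders introduced for illustration, and it assumes the training data is available as a plain Spark DataFrame.

```scala
import org.apache.spark.sql.DataFrame

object TrainingDataChecks {

  // Prints the partition count and the number of rows per label value.
  // `labelCol` is whatever your response column is called (hypothetical here).
  def report(df: DataFrame, labelCol: String): Unit = {
    // A very small partition count on a large dataset suggests under-partitioning.
    println(s"partitions: ${df.rdd.getNumPartitions}")

    // A heavily lopsided count per label suggests a skewed dataset.
    df.groupBy(labelCol).count().show()
  }
}
```

If the partition count looks too low, `df.repartition(n)` spreads the rows across more partitions. Executor memory is set when the job is submitted, for example with the `--executor-memory` option of `spark-submit`.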

Have you tried using a smaller model grid in the model selector instead of the default one?
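The two suggestions above (a longer maxWait and a smaller grid) could look roughly like the sketch below. It assumes the `withTrainValidationSplit` factory in the TransmogrifAI version in use accepts `modelsAndParameters` and `maxWait` arguments, so check the signature for your release. The object name, the `smallSelector` helper, the grid values and the two-day limit are illustrative choices, not recommendations from the maintainers.

```scala
import com.salesforce.op.features.FeatureLike
import com.salesforce.op.features.types.{OPVector, RealNN}
import com.salesforce.op.stages.impl.classification.{BinaryClassificationModelSelector, OpLogisticRegression}
import org.apache.spark.ml.tuning.ParamGridBuilder
import scala.concurrent.duration._

object SmallGridSketch {

  // `label` and `features` stand for the response and feature vector of your workflow.
  def smallSelector(label: FeatureLike[RealNN], features: FeatureLike[OPVector]) = {
    val lr = new OpLogisticRegression()

    // Two parameter combinations instead of the eight tried by default for LR.
    val lrGrid = new ParamGridBuilder()
      .addGrid(lr.regParam, Array(0.01, 0.1))
      .addGrid(lr.elasticNetParam, Array(0.1))
      .build()

    BinaryClassificationModelSelector
      .withTrainValidationSplit(
        modelsAndParameters = Seq(lr -> lrGrid), // explicit, smaller grid
        maxWait = 2.days                         // assumed parameter; raises the one-day limit
      )
      .setInput(label, features)
      .getOutput()
  }
}
```

Starting with one model type and a tiny grid also makes it easier to see whether the failure comes from the data or from a specific model.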

shenzgang commented 4 years ago

No, but I will try it. Thank you for your reply! I also have another question, about OpXGBoostClassifier. When I train with OpXGBoostClassifier alone, training succeeds, like this:

val prediction = BinaryClassificationModelSelector.withTrainValidationSplit(
  splitter = Option(splitter),
  modelTypesToUse = Seq(OpXGBoostClassifier),
  seed = seed
).setInput(response).getOutput()

But when I train with multiple models, XGBoost fails, like this:

val prediction = BinaryClassificationModelSelector.withTrainValidationSplit(
  splitter = Option(splitter),
  modelTypesToUse = Seq(
    OpLogisticRegression,
    OpRandomForestClassifier,
    OpGBTClassifier,
    OpDecisionTreeClassifier,
    OpXGBoostClassifier
  ),
  seed = seed
).setInput(response).getOutput()

Why does this happen? Also, XGBoost tends to reach good model metrics more easily than the other models, so why not add it to the default model selection? What concerns do you have about it?

tovbinm commented 4 years ago

XGBoost requires a Rabit tracker process running on the Spark driver node, which usually requires some additional configuration (especially for the Python version), and that's why XGBoost is off by default in our model selector.

Based on the information you provided, I think one of the Spark models is not able to converge on the provided dataset. Try figuring out which one by fitting each model type separately (the same way you did with XGBoost).

shenzgang commented 4 years ago

When I train XGBoost alone, training succeeds, so there is no problem with my configuration, and the Rabit tracker process is running on the driver node. So why does training fail when I add other models to modelTypesToUse?

tovbinm commented 4 years ago

Usually the problem is with the training dataset, but it's difficult to pinpoint the exact reason without knowing which model type fails to train, since LR and RF, for example, behave differently. As I said above, try figuring out which model type fails by fitting each model type separately (the same way you did with XGBoost).
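One way to organize the "fit each model type separately" step is to run every attempt inside its own `Try` and list the ones that throw. The sketch below uses only the Scala standard library. `IsolateFailingModel`, `tryEach` and `failedNames` are hypothetical helpers, and the attempt bodies are placeholders standing in for "build a selector with a single entry in modelTypesToUse and train the workflow".

```scala
import scala.util.{Failure, Success, Try}

object IsolateFailingModel {

  // Run every named attempt on its own so one failure cannot hide the others.
  def tryEach[A](attempts: Seq[(String, () => A)]): Seq[(String, Try[A])] =
    attempts.map { case (name, run) => name -> Try(run()) }

  // Names of the attempts that threw an exception.
  def failedNames[A](results: Seq[(String, Try[A])]): Seq[String] =
    results.collect { case (name, Failure(_)) => name }

  def main(args: Array[String]): Unit = {
    // Placeholder bodies: in a real job each thunk would train the workflow
    // with modelTypesToUse = Seq(<one model type>).
    val attempts: Seq[(String, () => String)] = Seq(
      "OpLogisticRegression" -> (() => "trained"),
      "OpRandomForestClassifier" -> (() => throw new RuntimeException("placeholder failure"))
    )

    tryEach(attempts).foreach {
      case (name, Success(_)) => println(s"$name: ok")
      case (name, Failure(e)) => println(s"$name: FAILED (${e.getMessage})")
    }
  }
}
```

Keeping the exception for each model type, rather than only the final "all models failed" message, shows which estimator breaks and why.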

shenzgang commented 4 years ago

I used the Titanic sample data. When XGBoost training fails, the program quits; it also quits after all model attempts fail. I think there are still some bugs here. Looking forward to the next release! When is it planned?

tovbinm commented 4 years ago

Please try out TransmogrifAI 0.7.0 with Spark 2.4.