Closed mcharles closed 6 years ago
Very difficult to answer your question without seeing the R code. I assume that your input data.frame contains missing values, and that they are passed to the gbm::gbm function (via some sort of dismo wrapper) as-is.
The PMML document that is generated for the gbm::gbm model type contains three-way splits. The first split is a SimplePredicate element, which checks if the value is missing using the isMissing operator. Hence, the PMML file is ready to handle missing values.
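To make the three-way split concrete, here is a minimal Python sketch of how a scoring engine could walk such a node. The XML fragment is hypothetical: the field name x1, the threshold, the scores and the lessThan/greaterOrEqual pair are assumptions for illustration, not taken from your file. Only the leading isMissing check is what the paragraph above describes.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: field name, threshold and scores are made up.
NODE_XML = """
<Node id="1">
  <True/>
  <Node id="2" score="0.25">
    <SimplePredicate field="x1" operator="isMissing"/>
  </Node>
  <Node id="3" score="-0.10">
    <SimplePredicate field="x1" operator="lessThan" value="5.0"/>
  </Node>
  <Node id="4" score="0.40">
    <SimplePredicate field="x1" operator="greaterOrEqual" value="5.0"/>
  </Node>
</Node>
"""


def predicate_matches(pred, row):
    """Evaluate one SimplePredicate; None means "unknown" (missing input)."""
    value = row.get(pred.get("field"))
    op = pred.get("operator")
    if op == "isMissing":
        return value is None
    if value is None:
        return None  # a comparison against a missing value is unknown
    threshold = float(pred.get("value"))
    if op == "lessThan":
        return value < threshold
    if op == "greaterOrEqual":
        return value >= threshold
    raise ValueError(f"operator not covered by this sketch: {op}")


def score_split(node, row):
    """Return the score of the first child whose predicate is true."""
    for child in node.findall("Node"):
        pred = child.find("SimplePredicate")
        if predicate_matches(pred, row) is True:
            return float(child.get("score"))
    return None  # no child matched


root = ET.fromstring(NODE_XML)
missing_score = score_split(root, {"x1": None})  # takes the isMissing branch
low_score = score_split(root, {"x1": 2.0})       # takes the lessThan branch
high_score = score_split(root, {"x1": 7.5})      # takes the greaterOrEqual branch
```

Because the isMissing child is tested first, a null input is routed to its own branch before any numeric comparison is attempted. An engine that instead fails on the null before evaluating predicates would produce the kind of error described.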
When we try to run the PMML (v 4.3) file in Syncfusion, we are getting errors when it encounters a missing/null value for one of our continuous predictors.
Can you score the PMML model with sample data using the org.jpmml.evaluator.EvaluationExample command-line application from the JPMML-Evaluator project? See https://github.com/jpmml/jpmml-evaluator#example-applications
I have reason to think that JPMML-Evaluator will score your PMML file just fine, and that your problem is related to Syncfusion. However, if JPMML-Evaluator also fails (or gives bad predictions), then please let me know about it.
Apologies for not having any reproducible examples on hand - but wondering if there is any quick answer or point in the right direction for this particular issue:
I have constructed a GBDT model in R using the dismo package, and then used the latest r2pmml package to generate a PMML file. When we try to run the PMML (v 4.3) file in Syncfusion, we are getting errors when it encounters a missing/null value for one of our continuous predictors. We've tried several different ways to pass this missing value to no avail.
Our working assumption is that this is a PMML / Syncfusion issue that we need to solve, given that the GBDT algorithm handles missing values in continuous variables just fine. Does anyone know if we are off track here?