Open liuhuanshuo opened 1 year ago
With a PMML file converted by sklearn2pmml, the default output is [y, probability(1), probability(0)].
These three values are all calculated in one pass. Therefore, there is no "performance benefit" to getting rid of the probability output fields, only a "visual effect" (e.g., keeping things extremely focused on the screen).
In Scikit-Learn, it would take two passes (first predict(X), then predict_proba(X)) to create such a results data matrix.
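For comparison, the two-pass Scikit-Learn approach might look roughly like the sketch below. The toy data, the choice of LogisticRegression, and the column names are assumptions made for illustration. The probability columns follow the order of classifier.classes_, which may differ from the order shown in the PMML output.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Toy binary classification data (illustrative only)
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
classifier = LogisticRegression().fit(X, y)

# First pass: predicted class labels
labels = classifier.predict(X)
# Second pass: per-class probabilities, columns ordered as classifier.classes_
probabilities = classifier.predict_proba(X)

# Assemble a results matrix similar in shape to the PMML output
results = pd.DataFrame(
    probabilities,
    columns=[f"probability({c})" for c in classifier.classes_],
)
results.insert(0, "y", labels)
```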
Is there a way to change the default column name, such as changing probability(1) to proba?
Column renaming is covered in these recently opened issues: https://github.com/jpmml/sklearn2pmml/issues/359 and https://github.com/jpmml/sklearn2pmml/issues/361
There is a special API for renaming transformer fields, but not for renaming model fields.
Or can I select only the columns that I want? For example, I only need to print the y column; I don't need the default probability(1) and probability(0) outputs.
yt = evaluator.evaluateAll(X)
# THIS!
yt = yt["y"]
You may consider wrapping the Evaluator.evaluate(X) function call into a separate helper function, which adds/removes result columns as you wish.
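A minimal sketch of such a helper is shown below. The names evaluate_selected, keep, and rename are hypothetical and not part of any library. The sketch assumes the evaluator's evaluateAll(X) returns a pandas DataFrame, as in the snippet above. StubEvaluator is only a stand-in so that the example runs without a real PMML evaluator.

```python
import pandas as pd


def evaluate_selected(evaluator, X, keep=("y",), rename=None):
    """Evaluate X, then keep (and optionally rename) selected result columns.

    Hypothetical helper: not part of sklearn2pmml or jpmml_evaluator.
    """
    results = evaluator.evaluateAll(X)
    # Keep only the requested columns, in the requested order
    results = results[list(keep)]
    if rename:
        # Map default column names to custom ones
        results = results.rename(columns=rename)
    return results


class StubEvaluator:
    """Stand-in for a real evaluator (assumption, for demonstration only)."""

    def evaluateAll(self, X):
        return pd.DataFrame({
            "y": [1, 0],
            "probability(1)": [0.8, 0.3],
            "probability(0)": [0.2, 0.7],
        })


X = pd.DataFrame({"x1": [0.5, 1.5]})

# Only the predicted label
labels_only = evaluate_selected(StubEvaluator(), X)

# Label plus one probability column, renamed to "proba"
renamed = evaluate_selected(
    StubEvaluator(), X,
    keep=("y", "probability(1)"),
    rename={"probability(1)": "proba"},
)
```

With a real evaluator, the same helper would be called with that evaluator object in place of StubEvaluator().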
I can see the benefit of adding a special-purpose API for disabling the generation of default Output elements.
The easiest way would be one where the end user signals their intent by setting a pmml_output = False attribute on the (fitted) model object:
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline

classifier = ...
pipeline = PMMLPipeline([
    ("classifier", classifier)
])
pipeline.fit(X, y)

# Default config - the Output element is created
sklearn2pmml(pipeline, "classifier.pmml")

# Proposed attribute (not an existing API at the time of writing)
classifier.pmml_output = False
# Custom config - the Output element is not created
sklearn2pmml(pipeline, "classifier-no_proba.pmml")