Closed liuhuanshuo closed 1 year ago
Whether sklearn2pmml supports such functionality
The Scikit-Learn framework doesn't support the idea of "post-process the prediction of the final estimator step".
However, the SkLearn2PMML package allows you to do so, as explained here: https://openscoring.io/blog/2022/05/06/sklearn_prediction_postprocessing/
Is there any way to change the probability(0) and probability(1) to fractions with custom rules, such as
100 * probability(0) + 500 * probability(1)
?
from sklearn2pmml.pipeline import PMMLPipeline
from sklearn2pmml.preprocessing import ExpressionTransformer

# X is the predicted probabilities matrix, as returned by `pipeline.predict_proba(X)`
# The first column is probability(0), the second column is probability(1)
custom_score_transformer = ExpressionTransformer("(100 * X[0]) + (500 * X[1])")
pipeline = PMMLPipeline(..., predict_proba_transformer = custom_score_transformer)
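To make the arithmetic concrete, here is a minimal plain-Python sketch of what the expression evaluates to for a single row of probabilities. It does not use sklearn2pmml at all; the helper name `custom_score` is hypothetical, and the sample rows are copied from the `predict_proba` output quoted elsewhere in this thread.

```python
def custom_score(proba_row):
    """Hypothetical helper: mirrors "(100 * X[0]) + (500 * X[1])" for one row."""
    p0, p1 = proba_row  # probability(0), probability(1)
    return (100 * p0) + (500 * p1)


# Two rows taken from the predicted probabilities shown in this thread
rows = [
    [0.75036584, 0.24963416],
    [0.96084351, 0.03915649],
]

scores = [custom_score(row) for row in rows]
for row, score in zip(rows, scores):
    print(row, "->", round(score, 4))
```

Because probability(0) + probability(1) = 1 for a binary classifier, this particular rule is equivalent to 100 + 400 * probability(1), so the score always falls between 100 and 500.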
I tried to do what you said
custom_score_transformer = ExpressionTransformer("(100 * X[0]) + (500 * X[1])")
mapper = DataFrameMapper(mapper_encode, input_df=True)
pipeline_final = PMMLPipeline(
    steps=[("mapper", mapper),
           ("classifier", clf)],
    predict_proba_transformer=custom_score_transformer)
pipeline_final.predict_proba_transformer(x_oot_1)
Unfortunately, an error was raised at the pipeline stage:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-781-8b73dad55740> in <module>
----> 1 pipeline_final.predict_proba_transformer(x_oot_1)
TypeError: 'NoneType' object is not callable
If I use pipeline_final.predict_proba_transform(x_oot_1), I get the same result as with pipeline_final.predict_proba(x_oot_1):
>>> pipeline_final.predict_proba_transform(x_oot_1)
array([[0.75036584, 0.24963416],
[0.6775218 , 0.3224782 ],
[0.64144063, 0.35855937],
...,
[0.86458361, 0.13541639],
[0.92818167, 0.07181833],
[0.96084351, 0.03915649]])
>>> pipeline_final.predict_proba(x_oot_1)
array([[0.75036584, 0.24963416],
[0.6775218 , 0.3224782 ],
[0.64144063, 0.35855937],
...,
[0.86458361, 0.13541639],
[0.92818167, 0.07181833],
[0.96084351, 0.03915649]])
Doesn't seem to be working?
pipeline_final.predict_proba_transformer(x_oot_1)
TypeError: 'NoneType' object is not callable
I have no idea how the PMMLPipeline.predict_proba_transformer attribute can be None in this location. Did you re-assign the pipeline_final object in some other line of code?
The value of this attribute should be ExpressionTransformer there. This is a TransformerMixin instance, which is not callable.
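Calling the attribute fails with a TypeError either way, but the message tells the two cases apart. The plain-Python sketch below reproduces both without sklearn2pmml: `ToyTransformer` is a hypothetical stand-in for any object that defines `transform()` but not `__call__()`, and `try_call` is a helper invented for this illustration.

```python
class ToyTransformer:
    """Hypothetical stand-in for a transformer: has transform(), but no __call__()."""

    def transform(self, X):
        return [(100 * row[0]) + (500 * row[1]) for row in X]


def try_call(obj, X):
    """Call obj(X); return the TypeError message, or None if the call succeeded."""
    try:
        obj(X)
    except TypeError as exc:
        return str(exc)
    return None


X = [[0.75, 0.25]]

msg_instance = try_call(ToyTransformer(), X)  # attribute is set, but not callable
msg_none = try_call(None, X)                  # attribute was never initialized

print(msg_instance)  # 'ToyTransformer' object is not callable
print(msg_none)      # 'NoneType' object is not callable
print(ToyTransformer().transform(X))  # applying the transformer via transform()
```

A message that names `NoneType`, as in the traceback above, therefore indicates that the attribute holds None rather than a transformer instance.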
pipeline_final.predict_proba_transform(x_oot_1)
WTF is x_oot_1? Is it already some pipeline prediction?
You are supposed to invoke the PMMLPipeline.predict_proba_transform(...) method with the input data matrix X. The PMMLPipeline will do data transformations and probability extraction automatically in that case:
https://github.com/jpmml/sklearn2pmml/blob/0.87.0/sklearn2pmml/pipeline/__init__.py#L109-L114
If I use pipeline_final.predict_proba_transform(x_oot_1), I get the same result as with pipeline_final.predict_proba(x_oot_1)
This can only happen if the PMMLPipeline.predict_proba_transformer attribute is not initialized (i.e. is None).
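The fallback can be illustrated with a toy class. `ToyPipeline` and `ScoreTransformer` below are hypothetical, simplified stand-ins and not the sklearn2pmml implementation; they only mimic the behaviour described in this thread, where `predict_proba_transform(X)` equals `predict_proba(X)` when no transformer is set. The real method may shape its return value differently.

```python
class ScoreTransformer:
    """Hypothetical transformer mirroring "(100 * X[0]) + (500 * X[1])"."""

    def transform(self, proba):
        return [(100 * p0) + (500 * p1) for p0, p1 in proba]


class ToyPipeline:
    """Simplified stand-in for a pipeline with an optional post-processing step."""

    def __init__(self, predict_proba_transformer=None):
        self.predict_proba_transformer = predict_proba_transformer

    def predict_proba(self, X):
        # Fake classifier: treat each input value as probability(1)
        return [[1.0 - x, x] for x in X]

    def predict_proba_transform(self, X):
        proba = self.predict_proba(X)
        if self.predict_proba_transformer is None:
            # No transformer configured: plain probabilities come back unchanged
            return proba
        return self.predict_proba_transformer.transform(proba)


X = [0.25, 0.5]  # raw input data, not precomputed predictions

plain = ToyPipeline()
scored = ToyPipeline(predict_proba_transformer=ScoreTransformer())

print(plain.predict_proba_transform(X) == plain.predict_proba(X))  # True
print(scored.predict_proba_transform(X))  # [200.0, 300.0]
```

In the toy version, identical output from both methods is the signature of a transformer that was never attached to the pipeline object being called.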
Doesn't seem to be working?
Your code is broken, not mine.
I can now use sklearn2pmml to generate PMML files to work with.
But I have a new question: is there a way to make the PMML file output a fraction instead of the probability of 0 or 1?
To be clear, I am going to use the PMML file with the following code, and it produces a probability.
Is there any way to change the probability(0) and probability(1) to fractions with custom rules, such as 100 * probability(0) + 500 * probability(1)?
I wonder if this should add a step after the clf step of the pipeline. Does sklearn2pmml support such functionality?