Closed dibus2 closed 6 years ago
.. in which there are several values that could be considered as missing_values. What would be the best way to approach this?
It never occurred to me that Domain.missing_values should be an array-like attribute instead of a scalar attribute.
Can't think of a good workaround at the moment. This needs to be done "properly", by actually implementing it on both the Python and Java sides of the codebase.
I would like to create a feature which tracks whether there are missing values in a column
There is a non-standard transformer class sklearn2pmml.preprocessing.ExpressionTransformer, which lets you check "missingness" using the pandas.isnull(X) and pandas.notnull(X) functions:
https://github.com/jpmml/sklearn2pmml/blob/master/sklearn2pmml/preprocessing/__init__.py#L44-L54
Something like this should do:
from sklearn_pandas import DataFrameMapper
from sklearn2pmml.preprocessing import ExpressionTransformer

# Flag the rows in which Var2 is missing
mapper = DataFrameMapper([
    (['Var2'], ExpressionTransformer("pandas.isnull(X['Var2'])"))
])
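To see what these two functions compute on the Python side before wiring them into a transformer, here is a minimal pandas-only sketch. The sample values are made up for illustration, and nothing here touches the PMML export itself.

```python
import numpy as np
import pandas as pd

# Made-up sample column; both NaN and None count as missing in pandas
values = pd.Series([1.0, np.nan, 3.0, None], name="Var2")

missing = pd.isnull(values)   # True where the value is missing
present = pd.notnull(values)  # element-wise negation of the above

print(missing.tolist())
print(present.tolist())
```

The two results are always exact opposites, so either one is enough to build a "missingness" feature.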
Hi,
I am trying to achieve something along the following lines:
in which there are several values that could be considered as missing_values. What would be the best way to approach this? Similarly, I would like to create a feature which tracks whether there are missing values in a column:
and then use a FunctionTransformer for instance:
This works in Python but I cannot export it to PMML.
Thanks for your suggestions.
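The snippets from the original post are not preserved here, so the following is only a hypothetical sketch of what a Python-side indicator of this kind might look like. The column name `Var2`, the sentinel values in `SENTINELS`, and the helper `is_missing` are all assumptions made for the example, not the poster's actual code.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import FunctionTransformer

# Hypothetical sentinel values that should all count as "missing"
SENTINELS = [-999, -1]

def is_missing(X):
    """Return 1 where a cell is NaN or one of the sentinel values, else 0."""
    X = pd.DataFrame(X)
    return (X.isnull() | X.isin(SENTINELS)).astype(int).to_numpy()

# validate=False passes the input through to is_missing unchanged
indicator = FunctionTransformer(is_missing, validate=False)

df = pd.DataFrame({"Var2": [1.0, -999, np.nan, 3.0, -1]})
flags = indicator.fit_transform(df[["Var2"]])
print(flags.ravel().tolist())
```

Because `is_missing` is arbitrary Python code, a converter presumably has nothing to translate it into, which would be consistent with the export problem described here.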
Cheers,
F.