Closed testlambda693 closed 3 years ago
FFS, the top line of the Java stack trace indicates that this error is thrown by the JPMML-SkLearn library (`org.jpmml.sklearn.*`). Why report it against the JPMML-Python library (`org.jpmml.python.*`) then?
Corrected the title of the issue - this error is thrown to indicate that the direct input column(s) to the pipeline cannot be renamed.
In PMML speak, you can rename `DerivedField` elements, but you cannot rename `DataField` elements. It's a small technical restriction in the current state of JPMML conversion libraries, where not all field references have been made properly rename/relocation-proof. Perhaps it will be lifted in future versions.
Right now, closing as "won't fix". If you're unhappy with direct input column names, then why don't you adjust your `pandas.DataFrame` column names accordingly? For example, if you have a column called "A" in `pandas.DataFrame`, then the first operation in your pipeline should not be to rename it to something else.
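A minimal sketch of that suggestion: rename the columns on the `pandas.DataFrame` itself, before it ever reaches the pipeline. The frame contents and the target names `category` and `amount` are made up for illustration and are not taken from the reporter's code.

```python
import pandas as pd

# Hypothetical input frame; "A" and "B" stand in for the raw column names.
df = pd.DataFrame({"A": ["x", "y", None], "B": [1.0, 2.0, 3.0]})

# Rename up front, so that the direct input columns of the pipeline
# already carry the names that should show up as DataField elements.
df = df.rename(columns={"A": "category", "B": "amount"})

print(list(df.columns))
```

The pipeline can then refer to `category` and `amount` directly, with no rename step on its input columns.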
The workaround here - the `[CategoricalDomain(), Alias(SimpleImputer(strategy = "constant", fill_value = $value))]` Python code fragment is not very effective. Better rewrite it as `[CategoricalDomain(missing_value_replacement = $value)]`.
Also, if you're dealing with string input columns, then `SimpleImputer(missing_values=np.nan)` is not effective at all. A string column cannot contain NumPy NaN values, ever.
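One way to see this, assuming the column ends up as a NumPy string (Unicode) array rather than an object-dtype column: NumPy coerces a NaN placed among strings into the text `'nan'`, so there is no float NaN left for the imputer to match. The sample values are made up.

```python
import numpy as np

# Mixing strings with np.nan yields a fixed-width Unicode array;
# the NaN is stored as the three-character string 'nan'.
values = np.array(["a", np.nan, "b"])

print(values.dtype.kind)   # 'U' means Unicode string
print(repr(values[1]))
```

Missing entries in such a column are therefore better declared by their actual marker (for example `None` or an empty string) than by `np.nan`.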
Hi Villu,
I am trying to convert a feature to PMML by calling `sklearn2pmml(pipeline, out_file)`.
I get the following error:
This is my code:
But when I run the following, I get no errors:
Thanks