oaksharks closed this issue 6 years ago
I found that DataFrameMapper is treated as an initializer transformer, so there must be no features before it.
This is verified in sklearn.Initializer.encodeFeatures:
@Override
public List<Feature> encodeFeatures(List<Feature> features, SkLearnEncoder encoder){
    ClassDictUtil.checkSize(0, features); // ensure no features before
    return initializeFeatures(encoder);
}
So I tried overriding the method in sklearn_pandas.DataFrameMapper and removing the check:
@Override
public List<Feature> encodeFeatures(List<Feature> features, SkLearnEncoder encoder){
    return initializeFeatures(encoder);
}
It works now, but is it possible? @vruusmann, looking forward to your help.
> It works now, but is it possible
You propose removing a "sanity check" - the code will execute, but it will most likely produce nonsensical results.
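To make the risk concrete, here is a minimal Python sketch of the logic in the two Java snippets quoted above. The function names and feature names are hypothetical stand-ins, not JPMML-SkLearn API, and raising `ValueError` is an assumption about how a failed size check would surface. The sketch relies only on what the quoted code shows: the overridden method never reads its `features` argument.

```python
def initialize_features():
    # Stand-in for initializeFeatures(encoder): builds features from scratch.
    return ["second_a", "second_b"]


def encode_features_checked(features):
    # Mirrors ClassDictUtil.checkSize(0, features): refuse any upstream features.
    if len(features) != 0:
        raise ValueError(f"Expected 0 features, got {len(features)}")
    return initialize_features()


def encode_features_unchecked(features):
    # Mirrors the override with the check removed: 'features' is silently ignored.
    return initialize_features()


# Hypothetical features produced by an earlier mapper in the same pipeline.
upstream = ["first_x", "first_y"]

result = encode_features_unchecked(upstream)
lost = [f for f in upstream if f not in result]
```

With the check removed, everything in `upstream` ends up in `lost`: the call succeeds, but the earlier mapper's features never reach the next step.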
The solution is to use a FeatureUnion step to combine the two DataFrameMapper steps:
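from sklearn.pipeline import FeatureUnion
from sklearn2pmml.pipeline import PMMLPipeline

# dfm and dfm1 are the two DataFrameMapper objects; lr_estimator is the model
mapper_union = FeatureUnion([
    ("first", dfm),
    ("second", dfm1)
])
pipeline = PMMLPipeline([
    ("preprocessing", mapper_union),
    ("model", lr_estimator)
])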
I haven't executed the above code (just typing based on my memory), but this is the pattern that you should be following.
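As a rough illustration of why this pattern fits, the sketch below models in plain Python what a feature union does conceptually: every branch receives the same original input, and the branch outputs are concatenated side by side. The record, the column names, and both mapper functions are hypothetical stand-ins for `dfm` and `dfm1`; this is not the scikit-learn implementation.

```python
# Hypothetical input record; the column names are illustrative only.
row = {"age": 30, "income": 1000.0, "city": "Paris"}


def first_mapper(record):
    # Stand-in for dfm: passes two numeric columns through.
    return [record["age"], record["income"]]


def second_mapper(record):
    # Stand-in for dfm1: derives one feature from a text column.
    return [len(record["city"])]


def feature_union(branches, record):
    # Each branch starts from the original record, so no branch ever
    # sees features produced by another branch.
    combined = []
    for _name, branch in branches:
        combined.extend(branch(record))
    return combined


combined = feature_union(
    [("first", first_mapper), ("second", second_mapper)],
    row,
)
```

Because each mapper starts from the raw input, neither one has features "before" it, which is the condition the initializer check enforces.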
Here is my code:
If I use only one DataFrameMapper, it works well, but with two it does not. Is there any advice?