Closed adrinjalali closed 4 years ago
What's the interaction with
n_features_in_
in pipelines? Let's say afterfit
, step 2 of a pipeline has inferredn_features_in_
to be 3 butn_features_out_
of step 1 claims it's 4. One of them is wrong, do we error?
I think we do either try to enforce consistency of that between fit
and transform
in estimators, or some operation raises an error. Checking the consistency could be added, not sure if we should put that in this SLEP. To me it's kind of a next step.
Checking the consistency could be added, not sure if we should put that in this SLEP. To me it's kind of a next step.
OK Agreed. That can get tricky because the discrepancy might come from a user-input issue, or from a wrong computation on our side in which case we don't want to error
Seems like this one is going to be an easy one? I almost can't believe it :P should we merge and take a vote then?
Thanks Adrin. Wanna call for a vote? We can ask for votes to be made on a PR / issue to avoid flooding the mailing list, as suggested
n_features_out_
is closely related ton_features_in_
, which is accepted and close to being merged.https://github.com/scikit-learn/scikit-learn/pull/14241 by @amueller is the implementation of this SLEP.
This is short, but it makes me feel I'm missing some obvious important issue...