scikit-learn / enhancement_proposals

Enhancement proposals for scikit-learn: structured discussions and rational for large additions and modifications
https://scikit-learn-enhancement-proposals.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License
47 stars 34 forks source link

SLEP013: n_features_out_ #29

Closed adrinjalali closed 4 years ago

adrinjalali commented 4 years ago

n_features_out_ is closely related to n_features_in_, which is accepted and close to being merged.

https://github.com/scikit-learn/scikit-learn/pull/14241 by @amueller is the implementation of this SLEP.

This is short, but it makes me feel I'm missing some obvious important issue...

adrinjalali commented 4 years ago

What's the interaction with n_features_in_ in pipelines? Let's say after fit, step 2 of a pipeline has inferred n_features_in_ to be 3 but n_features_out_ of step 1 claims it's 4. One of them is wrong, do we error?

I think we do either try to enforce consistency of that between fit and transform in estimators, or some operation raises an error. Checking the consistency could be added, not sure if we should put that in this SLEP. To me it's kind of a next step.

NicolasHug commented 4 years ago

Checking the consistency could be added, not sure if we should put that in this SLEP. To me it's kind of a next step.

OK Agreed. That can get tricky because the discrepancy might come from a user-input issue, or from a wrong computation on our side in which case we don't want to error

adrinjalali commented 4 years ago

Seems like this one is going to be an easy one? I almost can't believe it :P should we merge and take a vote then?

NicolasHug commented 4 years ago

Thanks Adrin. Wanna call for a vote? We can ask for votes to be made on a PR / issue to avoid flooding the mailing list, as suggested