GuiMarthe closed 3 months ago
Ah, I think I've found an ok implementation.
from sklearn.exceptions import NotFittedError

def formulaic_get_feat_names_out(self, names):
    if not hasattr(self, 'model_spec_'):
        raise NotFittedError('Model not fitted yet. Unable to get feature names.')
    return sum([term.columns for term in self.model_spec_.structure], [])

FormulaicTransformer.get_feature_names_out = formulaic_get_feat_names_out
I'm not sure whether I need to handle names when they are given and the model is not fitted, which I think is the case in sklearn's expected implementation. But in this case it works.
Hi @GuiMarthe ,
Thanks for reaching out! I don't use sklearn much in my day-to-day work. Is this just a method I should add to the example that makes it also work with the ColumnTransformer interface?
The easiest way to use the ModelSpec to get column names is just: model_spec.column_names. See https://matthewwardrop.github.io/formulaic/guides/model_specs/#anatomy-of-a-modelspec-instance for more details.
Ah... I see it documented here: https://scikit-learn.org/stable/glossary.html#term-get_feature_names_out . I'll add it to the example.
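For anyone landing here later, the two suggestions in this thread can be combined roughly as follows. This is only a sketch under assumptions, not the code that ended up in the documentation: `FeatureNamesMixin` and `DemoTransformer` are hypothetical names, the `SimpleNamespace` merely stands in for a fitted ModelSpec exposing `column_names`, and `RuntimeError` replaces sklearn's `NotFittedError` so the snippet runs without sklearn installed. The `input_features=None` argument follows the signature described in the scikit-learn glossary.

```python
from types import SimpleNamespace


class FeatureNamesMixin:
    """Hypothetical mixin; expects fit() to store a ModelSpec-like object on `model_spec_`."""

    def get_feature_names_out(self, input_features=None):
        # `input_features` is accepted to match scikit-learn's signature but is
        # ignored here: the output names come from the fitted spec itself.
        if not hasattr(self, "model_spec_"):
            # With sklearn available, sklearn.exceptions.NotFittedError (as in
            # the earlier snippet) would be the conventional exception.
            raise RuntimeError("Transformer is not fitted yet; no feature names available.")
        return list(self.model_spec_.column_names)


class DemoTransformer(FeatureNamesMixin):
    """Stand-in for the documentation's FormulaicTransformer (not the real class)."""

    def fit(self, X=None, y=None):
        # A real implementation would keep the ModelSpec produced by formulaic;
        # this placeholder mimics only its `column_names` attribute.
        self.model_spec_ = SimpleNamespace(column_names=("Intercept", "a", "b", "a:b"))
        return self


names = DemoTransformer().fit().get_feature_names_out()
```

Reading `column_names` avoids flattening `term.columns` by hand. Note that sklearn's own transformers return an ndarray of strings rather than a list, so wrapping the result in `numpy.asarray(..., dtype=object)` may be preferable in practice.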
Hey folks! Awesome library being built here! So, I'm trying to set up compatibility with scikit-learn's ColumnTransformer, which by default returns numpy arrays and relies on a get_feature_names_out method if you want to inspect the transformation in a Pandas DataFrame. How could that be built? I see in the documentation there is a suggestion for the Transformer implementation, so there could be a get_feature_names_out method right there.