skrub-data / skrub

Prepping tables for machine learning
https://skrub-data.org/
BSD 3-Clause "New" or "Revised" License

ENH allow set_output to fail #973

Closed jeromedockes closed 5 days ago

jeromedockes commented 6 days ago

oneachcolumn and onsubframe want dataframe output from their transformers. if the transformer has a set_output method, they call it so that scikit-learn transformers produce dataframe output. however, set_output may fail in cases where the output would be correct anyway, e.g. a Pipeline that contains transformers which produce dataframes by default but do not expose a set_output method.

this pr allows set_output to fail and still attempts the transformation -- the type of the output is checked in any case
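
a rough sketch of the behavior described above, not the actual skrub code. `FrameLike`, `BrokenSetOutput`, `ReturnsList` and `fit_transform_expecting_frame` are hypothetical names made up for illustration, and `FrameLike` stands in for a real dataframe so the example needs no third-party imports. the only assumption borrowed from scikit-learn is the `set_output(transform="pandas")` call convention.

```python
class FrameLike:
    """Stand-in for a dataframe (hypothetical, for illustration only)."""

    def __init__(self, columns):
        self.columns = list(columns)


class BrokenSetOutput:
    """Transformer whose set_output fails but whose output is already a frame."""

    def set_output(self, transform=None):
        raise ValueError("an inner step does not expose set_output")

    def fit_transform(self, X, y=None):
        return FrameLike(["a", "b"])


class ReturnsList:
    """Transformer without set_output that does not return a frame."""

    def fit_transform(self, X, y=None):
        return [[1, 2]]


def fit_transform_expecting_frame(transformer, X, y=None):
    # Ask for dataframe output only if the method exists.
    set_output = getattr(transformer, "set_output", None)
    if set_output is not None:
        try:
            set_output(transform="pandas")
        except Exception:
            # Allowed to fail: the output type is checked below in any case.
            pass
    result = transformer.fit_transform(X, y)
    # The type check is the real safeguard, whether or not set_output worked.
    if not isinstance(result, FrameLike):
        raise TypeError(
            f"{type(transformer).__name__} did not return a dataframe"
        )
    return result


out = fit_transform_expecting_frame(BrokenSetOutput(), [[1, 2]])
print(out.columns)
```

with this shape, a transformer whose set_output raises is still accepted as long as its output passes the type check, while one that returns something else (like `ReturnsList` here) is rejected with a `TypeError`.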