scikit-learn-contrib / category_encoders

A library of sklearn compatible categorical variable encoders
http://contrib.scikit-learn.org/category_encoders/
BSD 3-Clause "New" or "Revised" License

Equivalent method to sklearn's partial_fit? #411

Closed ImSo3K closed 1 year ago

ImSo3K commented 1 year ago

Is it possible to do incremental / online learning with the category encoders? I have a CSV file that I read in chunks because of its size, and I will probably encounter new features at inference time. Is there a potential workaround?

PaulWestenthanner commented 1 year ago

The encoders do not implement a partial fit. However, most encoders implement strategies for handling new data at predict time, controlled by the handle_unknown parameter. There are several options (raising an error, returning NaN, or returning some sensible value). Please check the docs for the encoder you want to use.
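
As a rough illustration of one possible workaround, here is a minimal pure-Python sketch of an ordinal encoder that is fitted chunk by chunk. The class name IncrementalOrdinalEncoder and its partial_fit method are hypothetical and are not part of category_encoders. Its handle_unknown argument only imitates the three strategies mentioned above, with -1 assumed as the "sensible value" for unseen categories.

```python
import math


class IncrementalOrdinalEncoder:
    """Hypothetical helper for illustration; NOT part of category_encoders."""

    def __init__(self, handle_unknown="value"):
        if handle_unknown not in {"error", "return_nan", "value"}:
            raise ValueError(f"unsupported handle_unknown: {handle_unknown!r}")
        self.handle_unknown = handle_unknown
        self.mapping = {}  # category -> integer code, in first-seen order

    def partial_fit(self, values):
        # Extend the mapping with categories not seen in earlier chunks.
        for value in values:
            if value not in self.mapping:
                self.mapping[value] = len(self.mapping) + 1
        return self

    def transform(self, values):
        encoded = []
        for value in values:
            if value in self.mapping:
                encoded.append(self.mapping[value])
            elif self.handle_unknown == "error":
                raise ValueError(f"unknown category: {value!r}")
            elif self.handle_unknown == "return_nan":
                encoded.append(math.nan)
            else:
                # "value": assumed placeholder code for unseen categories
                encoded.append(-1)
        return encoded


# Simulate reading one categorical column of a large CSV in two chunks.
chunks = [["red", "green"], ["green", "blue"]]

encoder = IncrementalOrdinalEncoder(handle_unknown="value")
for chunk in chunks:
    encoder.partial_fit(chunk)

# "purple" never appeared in any chunk, so it falls back to -1.
encoded = encoder.transform(["blue", "red", "purple"])
print(encoded)
```

This only works because an ordinal mapping can be extended without revisiting earlier chunks. Encoders that rely on statistics over the whole dataset (for example target-based encoders) would need running aggregates instead, which this sketch does not attempt.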