willsthompson closed this 1 year ago
Unfortunately the tests are failing. I think this is because your approach might change the order of the categories. I've followed your idea and implemented a very basic solution: https://github.com/scikit-learn-contrib/category_encoders/blob/37fcf54613b0a23d52021862d8861600b51dc222/category_encoders/ordinal.py#L232-L233
This should keep the order and, since set membership checks are O(1) on average, also solve the problem, although it might not be as elegant as yours. If you're happy with it, you can close the issue and PR. I hope I didn't make any mistakes here. Thanks for your effort!
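For readers following along, the general idea can be sketched roughly as below. This is only an illustration of an order-preserving filter backed by a set, not the code at the linked lines; the function name `ordered_intersection` and its parameters are hypothetical.

```python
def ordered_intersection(ordered_values, candidates):
    """Return the items of ordered_values that also occur in candidates,
    keeping the order in which they appear in ordered_values.

    Hypothetical helper for illustration; assumes all values are hashable.
    """
    # Build the set once so each membership check is O(1) on average,
    # instead of scanning a list for every element.
    candidate_set = set(candidates)
    # Iterate over the ordered sequence (not the set) so the original
    # category order is preserved in the result.
    return [value for value in ordered_values if value in candidate_set]


categories = ["low", "medium", "high", "unknown"]
seen = ["high", "low", "extra"]
kept = ordered_intersection(categories, seen)
```

Here `kept` follows the order of `categories`, whereas a plain `set(categories) & set(seen)` gives no ordering guarantee, which is the kind of reordering that could make order-sensitive tests fail.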
@PaulWestenthanner I just tested on our sample data and this will work great, thanks for the quick response. Yours is actually slightly faster on the biggest intersections in our sample. Closing this PR and issue.
Fixes #407
Proposed Changes