Closed enrir closed 1 year ago
Hey @enrir - thanks for bringing this up. This isn't the desired behavior. I'll need to dive deeper into this, as I'm not sure I understand what actually happens here and what the best way to handle it is.
Hi @shakedzy, I forked the repo and started to write a possible fix. If it's ok, I will open a PR when I have something ready. 😊
That's perfect :) thanks!
When the default strategy 'replace' is selected, `associations` raises the following error if the input dataset is a Pandas DataFrame with some columns with dtype="category": `TypeError: Cannot setitem on a Categorical with a new category (0.0), set the categories first`. See code below.
The problem is related to pandas' `fillna` behaviour; see this Stack Overflow question.
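For reference, the pandas side of this can be reproduced without the library at all. The snippet below is a minimal sketch, not the code from the original report: the series `s` and its values are made up for illustration, and the exact exception type may differ between pandas versions, so both `TypeError` and `ValueError` are caught.

```python
import pandas as pd

# Hypothetical example data: a categorical column with one missing value.
s = pd.Series(["a", None, "b"], dtype="category")

# Filling with a value that is not among the categories is rejected by pandas.
# Recent versions raise TypeError; the exact type may vary by version.
try:
    s.fillna(0.0)
    error = None
except (TypeError, ValueError) as exc:
    error = exc

print(type(error).__name__, error)
```

The categories of `s` are only "a" and "b", so `0.0` is not a legal value for the column until it is registered as a category.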
Given that the default strategy is 'replace' with value `0.0`, I'm wondering if this case can be handled internally by the `associations` method or if this is a corner case. Often, the `category` dtype is used when memory efficiency is important and switching dtype is expensive.
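One possible way to handle this without switching dtype is to register the fill value as a category before filling. The helper below is only a sketch under that assumption: `fill_missing_keep_categories` is a hypothetical name, not part of the library, and it is not the fix that was eventually proposed in the PR.

```python
import pandas as pd


def fill_missing_keep_categories(df, value=0.0):
    """Fill missing values while keeping categorical columns categorical.

    Hypothetical helper for illustration only.
    """
    out = df.copy()
    for col in out.columns:
        s = out[col]
        if isinstance(s.dtype, pd.CategoricalDtype):
            if not s.isna().any():
                # Nothing to fill; leave the column untouched.
                continue
            if value not in s.cat.categories:
                # Register the fill value so that fillna accepts it.
                s = s.cat.add_categories([value])
            out[col] = s.fillna(value)
        else:
            out[col] = s.fillna(value)
    return out


# Made-up example frame with one categorical and one numeric column.
df = pd.DataFrame(
    {
        "c": pd.Series(["a", None, "b"], dtype="category"),
        "n": [1.0, None, 3.0],
    }
)
filled = fill_missing_keep_categories(df, value=0.0)
print(filled.dtypes)
```

This keeps the column as `category`, at the cost of adding one extra category per affected column. Whether a mixed string/float category set is acceptable downstream is a separate question.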