Open adrinjalali opened 7 months ago
I think making `rename_categories` accept this might make the most sense. The solution you arrived at is probably the best option at the moment, but it's obviously not great.
cc @jbrockmendel
FWIW I ended up with this (not great either), which I find a bit more readable (but this may depend on the reader :wink:):

```python
import pandas as pd

df = pd.DataFrame({'col': ["a", "b", "c"]}, dtype="category")
# Go through object dtype so that replace() never sees a categorical column
df['col'] = df['col'].astype(object).replace(to_replace="a", value="b").astype("category")
```
I think eventually we want users to do `obj.replace('a', 'b').cat.remove_unused_categories()`. That works now, but the `.replace` issues a warning. I guess we could update the warning message to suggest this pattern for that particular use case.
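For reference, here is a minimal sketch of that pattern on a small Series (the toy data and the variable names are my own, not from the thread). It records warnings rather than assuming one is raised, since whether `replace` warns here depends on the pandas version:

```python
import warnings

import pandas as pd

ser = pd.Series(["a", "b", "c"], dtype="category")

# Record warnings instead of printing them, so we can inspect what was raised.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    merged = ser.replace("a", "b").cat.remove_unused_categories()

# True or False depending on the installed pandas version.
saw_future_warning = any(issubclass(w.category, FutureWarning) for w in caught)

print(merged.tolist())
print(list(merged.cat.categories))
print("FutureWarning emitted:", saw_future_warning)
```

The trailing `remove_unused_categories()` is what makes the final set of categories the same whether or not `replace` itself dropped `"a"`.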
@jbrockmendel your code gives this warning now:
FutureWarning: The behavior of Series.replace (and DataFrame.replace) with CategoricalDtype is deprecated. In a future version, replace will only be used for cases that preserve the categories. To change the categories, use ser.cat.rename_categories instead.
I'm not sure if you want to remove the warning in this case, or to suggest a different solution?
Thanks for this thread.
> That works now, but the .replace issues a warning. i guess we could update the warning message to suggest this pattern for that particular use case.

The warning says:

> In a future version, replace will only be used for cases that preserve the categories.
I would have expected a warning only if I were introducing NEW categories. If I'm just consolidating existing categories, there is no need for the dtype to change (thus, the categories can be preserved, even if some are now unused). Why is a warning necessary at all?
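To make the distinction concrete, here is a small sketch (toy data and names are mine) showing that folding `"a"` into the existing category `"b"` can leave the dtype unchanged, with `"a"` simply becoming unused. It deliberately takes the detour through object dtype so that it does not depend on how `replace` treats categoricals in a given pandas version:

```python
import pandas as pd

ser = pd.Series(["a", "b", "c"], dtype="category")
original_dtype = ser.dtype  # categories: a, b, c

# Replace on plain objects, then cast back to the *same* categorical dtype.
consolidated = ser.astype(object).replace("a", "b").astype(original_dtype)

print(consolidated.dtype == original_dtype)       # dtype is preserved
print(list(consolidated.cat.categories))          # "a" is still a category, just unused

# Dropping the unused category is a separate, explicit step.
trimmed = consolidated.cat.remove_unused_categories()
print(list(trimmed.cat.categories))
```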
Working on making scikit-learn's code pandas=2.2.0 compatible, here's a minimal reproducer for where I started:
which results in:
The first pattern doesn't apply here, so from this message, I understand I should do:
But this also fails with:
With a bit of reading docs, it seems I need to do:
which fails with
So `rename_categories` is not the one I want apparently, but reading through the "see also": none of them seem to do what I need to do.
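As an illustration of why `rename_categories` does not fit this use case, here is a sketch with toy data of my own (not necessarily the exact call or error from the reproducer above): renaming `"a"` to the already existing `"b"` would produce duplicate categories, which pandas rejects.

```python
import pandas as pd

ser = pd.Series(["a", "b", "c"], dtype="category")

try:
    # Mapping "a" onto the existing category "b" would yield categories b, b, c.
    ser.cat.rename_categories({"a": "b"})
except ValueError as exc:
    error_message = str(exc)
else:
    error_message = None

print(error_message)
```

So `rename_categories` relabels categories one-to-one; it cannot merge two of them into one.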
So it seems the way to go would be:
Which is far from what the warning message suggests.
So at the end:
`Series.cat` to do this easier?