Closed by eangius 2 months ago
For what it's worth, these local changes fixed the issue for me and kept the tests passing. If anyone is willing to make this official, it would be much appreciated.
diff --git a/category_encoders/ordinal.py b/category_encoders/ordinal.py
index 45d333e..94804c0 100644
--- a/category_encoders/ordinal.py
+++ b/category_encoders/ordinal.py
@@ -195,7 +195,7 @@ class OrdinalEncoder(util.BaseEncoder, util.UnsupervisedTransformerMixin):
 # Convert to object to accept np.nan (dtype string doesn't)
 # fillna changes None and pd.NA to np.nan
-X[column] = X[column].astype("object").fillna(np.nan).map(col_mapping)
+X[column] = X[column].astype("object").infer_objects(copy=False).fillna(np.nan).map(col_mapping)
 if util.is_category(X[column].dtype):
     nan_identity = col_mapping.loc[col_mapping.index.isna()].array[0]
     X[column] = X[column].cat.add_categories(nan_identity)
Thanks for reporting!
Your proposed fix seems fine, but I wonder whether something else might be better. The cast to object is just there (according to the comment) to accommodate np.nan as the fill, and we're about to map to numeric, so the dtype itself isn't critical information, and downcasting in particular isn't needed. Should we just opt in to the future behavior?
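For illustration, here is a rough sketch of what opting in could look like. It is not the actual patch. It assumes the "future behavior" refers to the pandas option `future.no_silent_downcasting`; the helpers `future_downcasting` and `encode_column` and the sample mapping are hypothetical and exist only for this example.

```python
import contextlib

import numpy as np
import pandas as pd


def future_downcasting():
    """Context manager opting in to pandas' future no-downcast behavior.

    Hypothetical helper: falls back to a no-op when the installed pandas
    does not know the option.
    """
    try:
        pd.get_option("future.no_silent_downcasting")
    except KeyError:  # pandas' OptionError subclasses KeyError
        return contextlib.nullcontext()
    return pd.option_context("future.no_silent_downcasting", True)


def encode_column(col: pd.Series, col_mapping: pd.Series) -> pd.Series:
    """Hypothetical stand-in for the line touched in ordinal.py."""
    with future_downcasting():
        # Same chain as the original line; the option (when available)
        # disables the silent downcast after fillna.
        return col.astype("object").fillna(np.nan).map(col_mapping)


# Made-up mapping: category -> ordinal code, with NaN mapped to -2.
col_mapping = pd.Series(
    [1, 2, -2], index=pd.Index(["a", "b", np.nan], dtype="object")
)
encoded = encode_column(pd.Series(["a", None, "b", "a"]), col_mapping)
print(encoded.tolist())
```

Scoping the option with a context manager keeps the change local to the encoder, so callers' global pandas settings are left alone.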
Expected Behavior
No FutureWarning is thrown.
Actual Behavior
Currently the following warning is thrown.
Neither suppressing warnings, setting the pandas option, nor changing the types on the caller side is sufficient for correctness.
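As a pandas-only approximation of the pattern involved (this is not the encoder itself, and `fillna_warnings` is a hypothetical helper), the sketch below runs the unpatched `astype("object").fillna(np.nan)` step on an object column and collects any FutureWarning messages. Whether anything is recorded depends on the installed pandas version, so no particular output is assumed.

```python
import warnings

import numpy as np
import pandas as pd


def fillna_warnings(col: pd.Series) -> list[str]:
    """Run the unpatched fill step and return any FutureWarning messages."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is filtered out
        col.astype("object").fillna(np.nan)
    return [
        str(w.message) for w in caught if issubclass(w.category, FutureWarning)
    ]


# An object column holding numbers and None, as a caller might pass in.
messages = fillna_warnings(pd.Series([1, None, 2], dtype="object"))
print(messages)  # may be empty, depending on the pandas version
```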
Steps to Reproduce the Problem
CountEncoder (or similar)
Specifications