sdv-dev / RDT

A library of Reversible Data Transforms

Fix pandas FutureWarning in UniformEncoder #819

Closed R-Palazzo closed 4 months ago

R-Palazzo commented 4 months ago

Environment Details

Error Description

In the UniformEncoder, replacing NaN values sometimes raises the following pandas FutureWarning:

FutureWarning: The behavior of Series.replace (and DataFrame.replace) with CategoricalDtype is deprecated. 
In a future version, replace will only be used for cases that preserve the categories. 
To change the categories, use ser.cat.rename_categories instead.

It gets raised here: https://github.com/sdv-dev/RDT/blob/ecf749959276dacd54efc474edfdc5f7804e133e/rdt/transformers/categorical.py#L200

A possible fix is to use ser.cat.remove_categories() instead of replace.

Steps to reproduce

import pandas as pd
from rdt.transformers import UniformEncoder

# Intervals assigned to each category; None stands for missing values
intervals = {
    ' United-States': [0.0, 0.8],
    None: [0.8, 0.9],
    ' Jamaica': [0.9, 0.99],
}
data = pd.Series([0.107995, 0.148025, 0.632702], name='native-country', dtype=float)

# Set the fitted state by hand instead of calling fit
transformer = UniformEncoder()
transformer.intervals = intervals
transformer.dtype = 'O'
transformer._reverse_transform(data)