Open: phobson opened this issue 1 year ago
@phobson Hello, can I work on this issue, "LabelEncoder doesn't handle missing values in dask series of strings" (#954)?
@DuanBoomer I'd be happy to review a PR. Thanks for volunteering. Note that I'll be largely away from my computer this week through the New Year. So if my response time is slow, I haven't forgotten about you.
@phobson The PR will be submitted by Sunday if that's okay with you. Today is Monday.
Describe the issue:
When using a LabelEncoder on a dask series with missing values (as `np.nan`), a TypeError is raised because "<" is not defined between floats and strings. scikit-learn's encoder seems to handle this well for pandas and dask series. We seem to handle it well with a pandas series.
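For readers unfamiliar with this class of error, here is a minimal pure-Python sketch of how it can arise. It does not use dask-ml and is not the reproducer from this issue. It only assumes that, somewhere during fitting, the labels are sorted while `np.nan` (a float) sits among the strings. The helper names `sort_labels` and `sort_labels_skipping_nan` are hypothetical.

```python
import math


def sort_labels(values):
    """Sort labels naively, as a hypothetical encoder fit step might."""
    return sorted(values)


def sort_labels_skipping_nan(values):
    """Sort only the non-missing labels (one possible workaround, assumed here)."""
    return sorted(v for v in values if not (isinstance(v, float) and math.isnan(v)))


labels = ["b", float("nan"), "a"]  # float("nan") stands in for np.nan

try:
    sort_labels(labels)
    error_message = None
except TypeError as exc:
    # Strings and floats have no ordering between them, so sorting fails.
    error_message = str(exc)

clean_labels = sort_labels_skipping_nan(labels)
```

Whether the real code path in dask-ml fails at a sort or at some other comparison would need to be confirmed against the full traceback.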
Minimal Complete Verifiable Example:
Full Traceback:
Environment: