dlab-berkeley / Python-Machine-Learning

D-Lab's 6 hour introduction to machine learning in Python. Learn how to perform classification, regression, clustering, and do model selection using scikit-learn in Python.

03_preprocessing -- `sparse` is `sparse_output` for newer versions of sklearn's OneHotEncoder #69

Open jellomoat opened 4 days ago

jellomoat commented 4 days ago

ISSUE: The sparse parameter of sklearn's OneHotEncoder is named sparse_output in newer versions, so passing sparse raises an error there.

LOCATION:

PROPOSED SOLUTION: Replace sparse with sparse_output on this line =>

dummy_e = OneHotEncoder(categories='auto', drop='first', sparse=False)

Alternatively, add a comment noting the renamed parameter for newer sklearn versions. This error came up in a consulting session while debugging issues a consultee hit when running the preprocessing notebook (https://github.com/dlab-consulting/requests/issues/2876).

RELATED REFERENCE: https://scikit-learn.org/dev/modules/generated/sklearn.preprocessing.OneHotEncoder.html
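
For illustration only, here is a rough sketch of a third option that keeps the notebook working on both old and new sklearn installs by picking the keyword name from the installed version. The helper names `dense_output_kwarg` and `make_dummy_encoder` are hypothetical, and the 1.2 cutoff is my reading of the sklearn changelog, so it is worth checking against the reference above before relying on it.

```python
import re


def dense_output_kwarg(sklearn_version: str) -> dict:
    """Return the OneHotEncoder keyword that requests a dense array.

    Assumption: `sparse` was renamed to `sparse_output` in sklearn 1.2.
    """
    match = re.match(r"(\d+)\.(\d+)", sklearn_version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {sklearn_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    name = "sparse_output" if (major, minor) >= (1, 2) else "sparse"
    return {name: False}


def make_dummy_encoder():
    """Build the notebook's encoder using whichever keyword this install accepts."""
    # Imported lazily so the helper above can be used without sklearn installed.
    import sklearn
    from sklearn.preprocessing import OneHotEncoder

    return OneHotEncoder(
        categories="auto",
        drop="first",
        **dense_output_kwarg(sklearn.__version__),
    )
```

In the notebook, `dummy_e = make_dummy_encoder()` would stand in for the line quoted above. If the workshop pins a recent sklearn version, the plain rename to sparse_output is simpler and probably preferable.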

jellomoat commented 4 days ago

@tomvannuenen / @pssachdeva -- LMK if you'd like me to submit a PR for this! And if so, how you'd prefer it handled (either of the two proposed solutions, or something else entirely)