rapidsai-community / notebooks-contrib

RAPIDS Community Notebooks
Apache License 2.0
512 stars 266 forks

[REVIEW] Update mortgage_e2e to fix dtype errors during ETL #292

Closed ayushdg closed 4 years ago

ayushdg commented 4 years ago

The existing notebook reads string columns from CSV files using dtype="category", which breaks with the latest 0.14 cuDF nightly. This PR fixes the error by reading those columns as their original dtype, str, and hashing the values into an equivalent int column.
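To illustrate the idea, here is a minimal CPU-only sketch of the "read as str, then hash to int" approach. It is not the code from this PR: it uses the Python standard library instead of cuDF. The column names, sample rows, and the `hash_string_column` helper are hypothetical, and CRC32 is only a stand-in for whatever hash the notebook actually uses.

```python
import csv
import io
import zlib


def hash_string_column(values):
    """Map each string to a stable unsigned 32-bit int using CRC32.

    Hypothetical helper: a stand-in for a column-level hash, not the
    hash function used in the notebook.
    """
    return [zlib.crc32(v.encode("utf-8")) for v in values]


# Hypothetical sample data standing in for one of the mortgage CSV files.
raw = io.StringIO(
    "loan_id,seller_name\n"
    "1,BANK A\n"
    "2,BANK B\n"
    "3,BANK A\n"
)

# csv.DictReader yields every field as str, analogous to reading the
# column with a string dtype instead of dtype="category".
rows = list(csv.DictReader(raw))
seller_names = [row["seller_name"] for row in rows]

# Replace the string column with an equivalent integer column.
seller_ids = hash_string_column(seller_names)
```

Equal strings always map to the same integer, so the hashed column can still serve as a categorical-style key in joins and group-bys. Unlike category codes, hashed values can in principle collide for distinct strings.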