sfu-db / dataprep

Open-source low-code data preparation library in Python. Collect, clean, and visualize your data in Python with a few lines of code.
http://dataprep.ai
MIT License
1.98k stars, 203 forks

Ensure npartitions is passed to dask as int #903

Closed SultanOrazbayev closed 2 years ago

SultanOrazbayev commented 2 years ago

np.ceil returns a float rather than an int, which triggers errors downstream when the value is passed to dask as npartitions.

In [5]: import numpy as np

In [6]: np.ceil(1.0001)
Out[6]: 2.0

For a reproducible error example, see this question: https://stackoverflow.com/questions/72453608/dataprep-eda-typeerror-please-provide-npartitions-as-an-int-or-possibly-as-non.
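
To illustrate the idea behind the fix, here is a minimal sketch that keeps the partition count an int. The helper name `partitions_needed` and its parameters are hypothetical and are not taken from the dataprep code base; the sketch assumes the count is derived by dividing a total size by a per-partition size.

```python
import math


def partitions_needed(total_size: float, partition_size: float) -> int:
    """Hypothetical helper: partition count as a plain Python int."""
    # math.ceil returns an int in Python 3, unlike np.ceil, which returns a float.
    # With NumPy, int(np.ceil(total_size / partition_size)) gives the same result.
    # max(1, ...) guards against a zero partition count for empty inputs.
    return max(1, math.ceil(total_size / partition_size))


npartitions = partitions_needed(1.0001, 1.0)
print(type(npartitions).__name__, npartitions)  # int 2
```

Either approach (`math.ceil`, or wrapping `np.ceil` in `int()`) yields a value that can be passed as `npartitions` without tripping an int type check.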

yixuy commented 2 years ago

Thank you for your correction!

SultanOrazbayev commented 2 years ago

Glad I could help! Unrelated, but I'm an SFU alum (econ). :)

qidanrui commented 2 years ago

Very glad to hear that! Thank you for using our library! Looking forward to hearing about more bugs from you.