snowflakedb / snowflake-ml-python

Apache License 2.0

Support for Nullable Data Types in Pandas and Snowpark DataFrames During Input Validation #122

Closed: kenkoooo closed this issue 2 weeks ago

kenkoooo commented 4 weeks ago

It seems that nullable strings, integers, and booleans are not supported as input types. When a pandas DataFrame is passed, each column is converted to a NumPy array and validated with np.dtype, which doesn't support pandas' nullable extension types. Similarly, when a Snowpark DataFrame is passed, it is converted using the signature type, which also relies on np.dtype.

Can we improve validation to use pandas dtypes instead of numpy dtypes, given that some ML models like LightGBM can work with nullable columns?
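To illustrate the difference, here is a minimal sketch using only pandas and NumPy, not the library's actual validation code. It shows that the nullable column dtypes are not np.dtype instances, while the checks in pandas.api.types can still classify them. The column names and the logical_kind helper are hypothetical and exist only for this example.

```python
import numpy as np
import pandas as pd
from pandas.api import types as pdt

# Example frame with nullable integer, boolean, and string columns
# (column names are made up for illustration).
df = pd.DataFrame(
    {
        "age": pd.array([31, None, 47], dtype="Int64"),
        "active": pd.array([True, None, False], dtype="boolean"),
        "name": pd.array(["a", None, "c"], dtype="string"),
    }
)


def logical_kind(dtype) -> str:
    """Hypothetical helper: classify a column dtype with pandas' own checks.

    These checks accept both NumPy dtypes and pandas extension dtypes,
    so nullable columns are classified instead of rejected.
    """
    if pdt.is_bool_dtype(dtype):
        return "bool"
    if pdt.is_integer_dtype(dtype):
        return "integer"
    if pdt.is_float_dtype(dtype):
        return "float"
    if pdt.is_string_dtype(dtype):
        return "string"
    raise TypeError(f"Unsupported column dtype: {dtype!r}")


# Nullable dtypes are pandas extension dtypes, not np.dtype instances,
# which is why validation built only on np.dtype cannot describe them.
is_numpy = {col: isinstance(dt, np.dtype) for col, dt in df.dtypes.items()}

# The pandas-level checks still recover the logical type of each column.
kinds = {col: logical_kind(dt) for col, dt in df.dtypes.items()}

print(is_numpy)
print(kinds)
```

A check along these lines would let nullable columns pass validation while still rejecting dtypes that are genuinely unsupported.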

sfc-gh-afero commented 2 weeks ago

Hi @kenkoooo - this should be addressed in version 1.7.1. Please let us know if you run into any issues.