alteryx / woodwork

Woodwork is a Python library that provides robust methods for managing and communicating data typing information.
https://woodwork.alteryx.com
BSD 3-Clause "New" or "Revised" License

Keep floats ending in 0 as `Double` instead of `Integer` or `IntegerNullable` #1496

Open ParthivNaresh opened 2 years ago

ParthivNaresh commented 2 years ago

Currently, a column of floats that all end in .0, such as `[1.0, 3.0, 12.0]`, is inferred as the `Integer` logical type, or as `IntegerNullable` if the column contains null values.

Such a column should be kept as `Double` to support downstream uses in imputation, feature engineering, and machine learning.
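
To make the reported behavior concrete, here is a minimal pure-Python sketch of the rule as described in this issue. The function `infer_numeric_ltype` is hypothetical and written only for illustration; it is not Woodwork's actual inference code, and it only assumes that nulls appear as `None` or `NaN`.

```python
import math
from typing import Iterable, Optional


def infer_numeric_ltype(values: Iterable[Optional[float]]) -> str:
    """Hypothetical stand-in for the inference rule described in the issue."""
    values = list(values)
    # Treat None and NaN as nulls (an assumption made for this sketch).
    has_null = any(v is None or math.isnan(v) for v in values)
    present = [v for v in values if v is not None and not math.isnan(v)]
    # Floats with no fractional part are treated as whole numbers.
    if present and all(float(v).is_integer() for v in present):
        return "IntegerNullable" if has_null else "Integer"
    return "Double"


print(infer_numeric_ltype([1.0, 3.0, 12.0]))   # Integer
print(infer_numeric_ltype([1.0, None, 12.0]))  # IntegerNullable
print(infer_numeric_ltype([1.0, 3.5, 12.0]))   # Double
```

Under the change proposed here, the first two calls would return `Double` instead.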

thehomebrewnerd commented 2 years ago

@ParthivNaresh I don't think I agree with this. If there is no information after the decimal point, I believe those values should be inferred and stored as integers. If users need these numeric columns as floats, they should specify the `Double` logical type explicitly rather than rely on inference.

chukarsten commented 2 years ago

I think this is a particularly weird one. I agree that the most natural thing is to look at a column and realize that the numbers there are actually integers. But I also feel, from an N=1 user perspective, that when I'm working in Python and handjamming some math or something, if I put in the decimal point, I expect my data to be treated as a float from then on. I can see it going both ways.