thinkingmachines / geowrangler

🌏 A python package for wrangling geospatial datasets
https://geowrangler.thinkingmachin.es/
MIT License

generate_ookla_features: imputed_mean produces object type values instead of numerical #144

Closed tm-nicco closed 2 years ago

tm-nicco commented 2 years ago

[Description] I tried visualizing Ookla internet speeds with folium, and using the mean values produced this error:

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

[Sample data type output]

quadkey          object
geometry       geometry
devices_sum     float64
d_mbps_mean      object
d_mbps_max      float64
d_mbps_min      float64
d_mbps_std      float64
u_mbps_mean      object
u_mbps_max      float64
u_mbps_min      float64
u_mbps_std      float64
dtype: object
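
The error message is consistent with the dtypes above: NumPy's isnan ufunc is defined for numeric arrays and rejects object arrays, even when every element is a float. A minimal sketch with plain NumPy and pandas follows. The values are made up, and whether folium or viz_choropleth calls np.isnan in exactly this way is an assumption:

```python
import numpy as np
import pandas as pd

# Made-up speeds; the object dtype mimics d_mbps_mean in the output above.
as_object = pd.Series([12.5, 30.1, 8.0], dtype="object")
as_float = as_object.astype("float64")

# isnan works on a float64 array.
float_mask = np.isnan(as_float.to_numpy())

# The same values held as Python objects are rejected by the ufunc.
try:
    np.isnan(as_object.to_numpy())
    object_error = None
except TypeError as exc:
    object_error = str(exc)

print(float_mask)
print(object_error)
```

Checking the dtypes before plotting, as done above, is therefore a quick way to catch this class of error early.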

[Solution] Cast the mean columns to float64, because imputed_mean leaves them with object dtype:

phl_with_features['d_mbps_mean'] = phl_with_features['d_mbps_mean'].astype("float64")
phl_with_features['u_mbps_mean'] = phl_with_features['u_mbps_mean'].astype("float64")

[New data type output]

quadkey          object
geometry       geometry
devices_sum     float64
d_mbps_mean     float64
d_mbps_max      float64
d_mbps_min      float64
d_mbps_std      float64
u_mbps_mean     float64
u_mbps_max      float64
u_mbps_min      float64
u_mbps_std      float64
dtype: object
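
Instead of casting each column by name, the same workaround could be applied to every column ending in _mean. This is only a sketch on a stand-in DataFrame with made-up values, not the notebook's phl_with_features. The helper name cast_mean_columns is hypothetical, and pd.to_numeric is used so that a value that cannot be parsed raises instead of passing silently:

```python
import pandas as pd


def cast_mean_columns(df: pd.DataFrame, suffix: str = "_mean") -> pd.DataFrame:
    """Return a copy of df with every column ending in `suffix` cast to float64."""
    out = df.copy()
    for col in out.columns:
        if col.endswith(suffix):
            # errors="raise" (the default) surfaces non-numeric values.
            out[col] = pd.to_numeric(out[col], errors="raise").astype("float64")
    return out


# Stand-in for phl_with_features with made-up values.
features = pd.DataFrame(
    {
        "quadkey": ["a", "b"],
        "d_mbps_mean": pd.Series([10.0, 20.5], dtype="object"),
        "u_mbps_mean": pd.Series([3.0, 4.25], dtype="object"),
        "d_mbps_max": [15.0, 25.0],
    }
)
fixed = cast_mean_columns(features)
print(fixed.dtypes)
```

Working on a copy leaves the original frame untouched, so the before and after dtypes can be compared side by side.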

Link to Colab Notebook: https://colab.research.google.com/drive/1CDqqb5mTNMS3A0MaGkLOcCxqWlzumGDk

cc: @alronlam

butchtm commented 2 years ago

@tm-nicco can you share a gist or a colab notebook to replicate the issue? tia!

tm-nicco commented 2 years ago


The link to my Colab notebook is attached in the description. Context: I was visualizing Ookla internet speeds on PH tiles. When I used generate_ookla_features for feature engineering, imputed_mean produced object-type values instead of numeric ones, which triggered the error whenever I ran viz_choropleth.

butchtm commented 2 years ago

Hi @tm-nicco, I was able to reproduce the problem and found its source in the Colab notebook, in the following cell:

image

Please change the line outlined in RED to the following:

aoi = aoi.fillna(value=0)

to fix the problem.
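
The original line is only visible in the screenshot, so the following is an assumption about the mechanism rather than a reconstruction of the notebook: one common way a fill step yields object columns is a non-numeric fill value, such as an empty string. The sketch below uses a stand-in DataFrame with a made-up column and values, and compares that assumed case with the numeric fill suggested above:

```python
import numpy as np
import pandas as pd

# Stand-in for the aoi frame; the column name and values are made up.
aoi = pd.DataFrame({"d_mbps_mean": [10.0, np.nan, 20.5]})

# Assumed culprit: a non-numeric fill value turns the column into object dtype.
filled_with_text = aoi.fillna(value="")

# The suggested fix: a numeric fill value keeps the column float64.
filled_with_zero = aoi.fillna(value=0)

print(filled_with_text.dtypes)
print(filled_with_zero.dtypes)
```

If the notebook's original fill value was something else, the dtype comparison still shows what to look for: the column should stay float64 after the fill.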

Best regards, Butch

cc @alronlam - please update the demo integration notebook to fix the issue

cc @tm-kah-alforja for issue status update

tm-nicco commented 2 years ago

Got it. Thank you, @butchtm!