elixir-explorer / explorer

Series (one-dimensional) and dataframes (two-dimensional) for fast and elegant data exploration in Elixir
https://hexdocs.pm/explorer
MIT License

Explorer.DataFrame.from_query/4 with :snowflake adapter returns dtype error on any numeric field #952

Closed adworse closed 3 months ago

adworse commented 3 months ago

The error is "Generic Error: cannot cast to dtype: decimal[38,0]":

{:error, "Generic Error: cannot cast to dtype: decimal[38,0]"}
    (explorer 0.8.3) lib/explorer/polars_backend/shared.ex:111: Explorer.PolarsBackend.Shared.df_dtypes/1
    (explorer 0.8.3) lib/explorer/polars_backend/shared.ex:97: Explorer.PolarsBackend.Shared.create_dataframe/1
    (explorer 0.8.3) lib/explorer/polars_backend/data_frame.ex:35: Explorer.PolarsBackend.DataFrame.from_query/3
    (explorer 0.8.3) lib/explorer/data_frame.ex:498: Explorer.DataFrame.from_query!/4

The error is reproducible in Livebook with a trial Snowflake AWS account. Unfortunately, my Rust knowledge is practically nonexistent. I would be happy to share my trial creds with whoever feels generous enough to dive into fixing this one!

josevalim commented 3 months ago

We don't support decimals yet: #867. We will try to prioritize it sooner rather than later. :) Closing in favor of the other issue.

adworse commented 3 months ago

@josevalim Thank you for pointing this out! It seems like Snowflake reports its columns as decimal. Is there a chance one could force the type to float or int64 (since decimal[38,0] is, in fact, an integer)?

josevalim commented 3 months ago

You could try converting them to strings in your queries (explicitly select them as strings) and then casting them back to integers in Explorer. The change you propose would itself need to be requested from Snowflake's ADBC driver: https://github.com/apache/arrow-adbc
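
A minimal sketch of that workaround, under a few assumptions: the table `orders` and the columns `id` and `quantity` are hypothetical, `CAST(... AS VARCHAR)` is assumed to be the way to select the column as a string in Snowflake SQL, and the live `from_query!/4` call is left as a comment because it needs a running ADBC connection. A small in-memory dataframe stands in for the query result so the cast step can be tried on its own.

```elixir
Mix.install([{:explorer, "~> 0.8"}])

require Explorer.DataFrame, as: DF

# Hypothetical table and column names. Selecting the numeric columns as
# strings keeps decimal[38,0] out of the result set entirely.
query = """
SELECT CAST(id AS VARCHAR) AS id,
       CAST(quantity AS VARCHAR) AS quantity
FROM orders
"""

IO.puts(query)

# With a live connection `conn` (an Adbc.Connection, not created here):
#   df = DF.from_query!(conn, query, [])
# Stand-in for the result of that call, so the cast runs without Snowflake:
df = DF.new(id: ["1", "2", "3"], quantity: ["10", "20", "30"])

# Cast the string columns back to integers on the Explorer side.
df = DF.mutate(df, id: cast(id, :integer), quantity: cast(quantity, :integer))

IO.inspect(DF.dtypes(df))
```

This only makes sense for columns whose values actually fit in a 64-bit integer; columns with a fractional scale would need a float cast instead.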