elixir-explorer / explorer

Series (one-dimensional) and dataframes (two-dimensional) for fast and elegant data exploration in Elixir
https://hexdocs.pm/explorer
MIT License

Series.from_list/2 throws argument error when opts != [] in some cases #1000

Closed: kylewhite21 closed this 1 month ago

kylewhite21 commented 1 month ago

In many cases, building a series and converting the data type in one shot (by passing the dtype option to from_list) fails, but building the series first and then casting works. I'm not rusty enough to know why.

iex(2)> S.from_list([1, 2], dtype: :string)
** (ArgumentError) argument error
    (explorer 0.9.2) Explorer.PolarsBackend.Native.s_from_list_str("", [1, 2])
    (explorer 0.9.2) lib/explorer/polars_backend/series.ex:24: Explorer.PolarsBackend.Series.from_list/2
    iex:2: (file)
iex(2)> S.from_list([1, 2]) |> S.cast(:string)
#Explorer.Series<
  Polars[2]
  string ["1", "2"]
>

Other examples:

alias Explorer.Series, as: S

# passing dtype: raises; from_list followed by cast works
S.from_list(["1", "2"], dtype: {:u, 16})
S.from_list(["1", "2"]) |> S.cast({:u, 16})

# passing dtype: raises; from_list followed by cast works
S.from_list([1.5, 2.5], dtype: {:u, 16})
S.from_list([1.5, 2.5]) |> S.cast({:u, 16})

# both work
S.from_list([1, 2], dtype: {:f, 32})
S.from_list([1, 2]) |> S.cast({:f, 32})

josevalim commented 1 month ago

Yes, this is expected. from_list mostly does not perform casting, except in a few cases (such as ints/floats), so you must pass the exact data type. That's because it is very hard and expensive to do the casting at that level. Once you have the data structure in memory and you know exactly what it looks like, it is much easier.
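
A minimal sketch of the two patterns described above, assuming Explorer 0.9.x with the default Polars backend. The module `SeriesHelper` and its function `from_list_as/2` are hypothetical names used only for illustration and are not part of Explorer; the Explorer calls themselves are the ones already shown in this thread.

```elixir
# Assumption: run as a script or in IEx with Hex available.
Mix.install([{:explorer, "~> 0.9"}])

alias Explorer.Series, as: S

# Pass a dtype that exactly matches the values in the list.
strings = S.from_list(["1", "2"], dtype: :string)

# Ints to floats is one of the few conversions from_list handles itself
# (reported as working earlier in this thread).
floats = S.from_list([1, 2], dtype: {:f, 32})

# For any other conversion, build the series first and cast afterwards.
casted = [1, 2] |> S.from_list() |> S.cast(:string)

defmodule SeriesHelper do
  # Hypothetical helper, not part of Explorer: infer the dtype from the
  # list, then cast the in-memory series to the requested dtype.
  def from_list_as(list, dtype) do
    list
    |> Explorer.Series.from_list()
    |> Explorer.Series.cast(dtype)
  end
end

unsigned = SeriesHelper.from_list_as(["1", "2"], {:u, 16})
```

The helper trades one extra pass over the data for not having to know the exact dtype of the input list up front, which matches the build-then-cast workaround from the original report.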

kylewhite21 commented 1 month ago

Got it, thanks for the fast reply!

josevalim commented 1 month ago

Fantastic, and I pushed some docs to avoid future confusion!