elixir-explorer / explorer

Series (one-dimensional) and dataframes (two-dimensional) for fast and elegant data exploration in Elixir
https://hexdocs.pm/explorer
MIT License

Do not infer if type has been given #923

Closed josevalim closed 5 months ago

josevalim commented 5 months ago

This is a WIP. We need to decide if we want to go ahead with this code or not. This assumes that, if the user specifies the dtype, the dtype will be taken precisely as is. We won't try to merge and fix things, which is arguably what we should do, but we need to fix the build.

To fix the build, we need to remember that dtypes like {:duration, :millisecond} must first be read as integers and then cast to the duration dtype (perhaps this could be done directly on the Rust side).

billylanchantin commented 5 months ago

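I think going ahead with this approach makes sense. There is a notion in Polars of the "physical" dtype: the underlying storage type of a logical dtype (for example, a duration is stored as a 64-bit integer).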

If we mirror that on the Elixir side, that may help us pass things to Rust in a safer way. Like, from_list can tell Rust that it's a list of either {:duration, _} or to_physical({:duration, _}).

IDK if that truly makes it easier or not though. I'd have to dig into our code to say more.
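
To make the "physical dtype" idea concrete, here is a minimal sketch of what mirroring it on the Elixir side could look like. The module name `PhysicalDtypeSketch` and its `to_physical/1` function are hypothetical and are not part of Explorer's API. The mapping assumes Explorer's `{:s, 64}` / `{:s, 32}` integer dtypes and follows how Polars stores durations, times, and dates.

```elixir
defmodule PhysicalDtypeSketch do
  # Hypothetical helper for illustration only; not part of Explorer.
  # Maps a logical dtype to the integer dtype Polars uses for storage.

  # Durations are stored as signed 64-bit integers in the given time unit.
  def to_physical({:duration, _unit}), do: {:s, 64}

  # Times are stored as signed 64-bit integers.
  def to_physical(:time), do: {:s, 64}

  # Dates are stored as signed 32-bit integers (days since the Unix epoch).
  def to_physical(:date), do: {:s, 32}

  # Any other dtype is treated as already physical in this sketch.
  def to_physical(dtype), do: dtype
end
```

With something like this, `from_list` could decide whether the values it hands to Rust are already in the logical form or in the form given by `to_physical/1`.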

josevalim commented 5 months ago

The challenge here is that if you pass from_list([1, 2, 3], dtype: {:duration, ...}), we need to either implement this logic in the backend OR tell the user to pass the dtype as integer and then explicitly cast it to duration. Both are doable for sure.
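
For reference, the second option (the user passes integers and casts explicitly) might look roughly like the sketch below. It assumes a recent Explorer release installed via `Mix.install/1`, that `:integer` is accepted as a dtype alias, and that `Explorer.Series.cast/2` supports integer-to-duration casts. The `:millisecond` unit is chosen only for illustration.

```elixir
# Assumption: a recent Explorer release is fetched from Hex.
Mix.install([:explorer])

alias Explorer.Series

# Step 1: build the series with an integer dtype, so nothing has to be inferred.
ints = Series.from_list([1, 2, 3], dtype: :integer)

# Step 2: explicitly cast the integers to a duration dtype.
# The unit here is an illustrative choice, not taken from the discussion.
durations = Series.cast(ints, {:duration, :millisecond})

IO.inspect(Series.dtype(durations))
```

The first option would perform the same two steps inside the backend, so that the user can pass the duration dtype to `from_list` directly.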

josevalim commented 5 months ago

Continuing in favor of #928.