Closed lkarthee closed 9 months ago
Could we implement this without adding a `struct` function? Could we automatically convert maps to structs instead?
@josevalim currently `%{}` works in `mutate` as a top-level expression, but fails if it is input to any series function.
DF.mutate(df, c: %{a: a, b: b}) # works
DF.mutate(df, c: %{a: is_nil(a), b: is_nil(b)}) # works
DF.mutate(df, c: is_nil(%{a: a, b: b})) # fails
** (ArgumentError) expected a series as argument for is_nil, got: %{a: #Explorer.Series<
LazySeries[???]
s64 (column("a"))
>, b: #Explorer.Series<
LazySeries[???]
s64 (column("b"))
>}
(explorer 0.9.0-dev) lib/explorer/series.ex:6127: Explorer.Series.apply_series/3
How should we tackle this?
I am on my phone, but somewhere in lazy series we handle all literals, so we should probably add map handling there. The code will probably be very similar to what you added to data frame, so we should find a way of sharing it as well.
I took a quick look and I was wrong. We only allow casting in specific operations in `series.ex`. For example, we could begin supporting maps in the comparison operators, if comparison is supported between structs. Outside of that, we most likely won't support passing maps. There may be an argument that we should allow literals (such as integers and maps) in `is_nil`, but that's probably not the case today.
There are some convenient use cases for structs: https://docs.pola.rs/user-guide/expressions/structs/#practical-use-cases-of-struct-columns
Should we support passing a struct to a series function? That would add value when mutating, when filtering without mutating, etc.
The question is: which operations should we support them on? For example, it doesn't make sense to support them on add or multiply. So I'd go operation by operation, at least initially.
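As a rough illustration of the operation-by-operation idea, here is a minimal sketch of an allowlist check. It is not Explorer code: the module name `MapLiteralPolicy`, the operation atoms, and the `{:struct_literal, map}` return shape are all hypothetical, chosen only to show how map arguments could be accepted for comparisons while still being rejected elsewhere.

```elixir
defmodule MapLiteralPolicy do
  @moduledoc false

  # Hypothetical allowlist: only these operations accept a plain map argument.
  # Which operations belong here is exactly the open question in this thread.
  @ops_accepting_maps [:equal, :not_equal]

  # Returns true when `op` is on the allowlist.
  def accepts_map?(op) when is_atom(op), do: op in @ops_accepting_maps

  # Plain maps (not structs such as a series) are gated by the allowlist.
  def check_arg!(op, arg) when is_map(arg) and not is_struct(arg) do
    if accepts_map?(op) do
      # Hypothetical marker meaning "convert this map to a struct literal".
      {:struct_literal, arg}
    else
      raise ArgumentError,
            "expected a series as argument for #{op}, got: #{inspect(arg)}"
    end
  end

  # Any other argument passes through untouched.
  def check_arg!(_op, arg), do: arg
end
```

Under these assumptions, `MapLiteralPolicy.check_arg!(:equal, %{a: 1})` returns the tagged map, while `MapLiteralPolicy.check_arg!(:add, %{a: 1})` raises an `ArgumentError` similar to the one shown earlier for `is_nil`.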
Ok, let me explore more on this question and come back later.
I think this PR is complete for now.
:green_heart: :blue_heart: :purple_heart: :yellow_heart: :heart:
Add `struct` equivalent

Note: Hiding original text as it is stale as per https://github.com/elixir-explorer/explorer/pull/855#issuecomment-1937523811
Original text
Add `struct` expression.

```elixir
df = DF.new(%{a: [1, 2, 3], b: ["a", "b", "c"]})
#Explorer.DataFrame<
  Polars[3 x 2]
  a s64 [1, 2, 3]
  b string ["a", "b", "c"]
>

DF.mutate(df, c: struct([a: a, b: b]))
#Explorer.DataFrame<
  Polars[3 x 3]
  a s64 [1, 2, 3]
  b string ["a", "b", "c"]
  c struct[2] [
    %{"a" => 1, "b" => "a"},
    %{"a" => 2, "b" => "b"},
    %{"a" => 3, "b" => "c"}
  ]
>

Explorer.Series.struct(a: df["a"], b: df["b"])
#Explorer.Series<
  Polars[3]
  struct[2] [
    %{"a" => 1, "b" => "a"},
    %{"a" => 2, "b" => "b"},
    %{"a" => 3, "b" => "c"}
  ]
>
```