elixir-explorer / explorer

Series (one-dimensional) and dataframes (two-dimensional) for fast and elegant data exploration in Elixir
https://hexdocs.pm/explorer
MIT License

Regression in `DataFrame.concat_rows/2` in v0.8.2 #902

Closed: maennchen closed this issue 5 months ago

maennchen commented 7 months ago

Script

Mix.install([{:explorer, "0.8.2"}])

df_a = Explorer.DataFrame.new([%{a: 1, b: 2}])

df_b = Explorer.DataFrame.new([%{b: 3}])
df_b = Explorer.DataFrame.put(df_b, :a, Explorer.Series.from_list([4]))

dbg Explorer.DataFrame.concat_rows(df_a, df_b)

Expected

(This is the output produced by v0.8.1.)

[script.exs:8: (file)]
Explorer.DataFrame.concat_rows(df_a, df_b) #=> #Explorer.DataFrame<
  Polars[2 x 2]
  a s64 [1, 4]
  b s64 [2, 3]
>

Actual

** (RuntimeError) Polars Error: lengths don't match: unable to vstack, column names don't match: "a" and "b"
    (explorer 0.8.2) lib/explorer/polars_backend/shared.ex:79: Explorer.PolarsBackend.Shared.apply_dataframe/4
    script.exs:8: (file)

philss commented 7 months ago

@maennchen I can confirm this is a regression. Thank you!

Until we release a new version, you can work around this with Explorer.DataFrame.relocate/3, moving the columns of df_b into the same order as df_a before concatenating:

df_c = Explorer.DataFrame.relocate(df_b, "b", after: "a")

Explorer.DataFrame.concat_rows(df_a, df_c)
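
For reference, here is the workaround combined with the original reproduction script. This is only a sketch: it assumes the regression is triggered by the differing column order of the two dataframes (which is what the error message and the relocate call suggest), and the `result` variable and the `IO.inspect/2` labels are added here purely for illustration.

```elixir
Mix.install([{:explorer, "0.8.2"}])

# Column order here is a, b.
df_a = Explorer.DataFrame.new([%{a: 1, b: 2}])

# Column order here is b, a, because :a is added after :b.
df_b = Explorer.DataFrame.new([%{b: 3}])
df_b = Explorer.DataFrame.put(df_b, :a, Explorer.Series.from_list([4]))

# Workaround from the comment above: move "b" after "a" so that
# both dataframes list their columns in the same order.
df_c = Explorer.DataFrame.relocate(df_b, "b", after: "a")

# Both calls should now report the same column order.
IO.inspect(Explorer.DataFrame.names(df_a), label: "df_a columns")
IO.inspect(Explorer.DataFrame.names(df_c), label: "df_c columns")

# With matching column order, concatenation is expected to succeed
# (assumption: this is the only cause of the error on 0.8.2).
result = Explorer.DataFrame.concat_rows(df_a, df_c)
IO.inspect(result, label: "result")
```

If the workaround applies, `result` should match the dataframe shown under "Expected" above.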