elixir-explorer / explorer

Series (one-dimensional) and dataframes (two-dimensional) for fast and elegant data exploration in Elixir
https://hexdocs.pm/explorer
MIT License

Join on columns of type `:list` #956

Closed: raulrpearson closed this issue 1 month ago

raulrpearson commented 1 month ago

When I try this:

alias Explorer.DataFrame, as: DF

a = DF.new(%{
  numbers: [1, 2, 3, 4],
  lists: [[1, 2, 3], [1, 2, 3], [4, 5, 6], [4, 5, 6]]
})

b = DF.new(%{
  letters: ["a", "b", "c", "d"],
  lists: [[1, 2, 3], [1, 2, 3], [4, 5, 6], [4, 5, 6]]
})

# With no :on option, the join uses the shared column name ("lists")
DF.join(a, b)

I get this error message:

** (RuntimeError) Polars Error: not yet implemented: Hash Inner Join between list[i64] and list[i64]
    (explorer 0.9.0) lib/explorer/polars_backend/shared.ex:80: Explorer.PolarsBackend.Shared.apply_dataframe/4

Should joining on columns of list type be possible? Is it on the roadmap? If not, maybe we could mention it in the docs.

josevalim commented 1 month ago

Whether this feature is available depends on the backend, which is why we don't mention it explicitly in the docs. It has to be requested or implemented in Polars itself :)

raulrpearson commented 1 month ago

Oh, okay, got it. I thought this was pending implementation in Explorer.PolarsBackend somehow, but I now see that the message is actually coming from the Polars Rust code (this line, I assume). Thanks, and sorry for the confusion!
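
For anyone landing here with the same error: since the limitation sits in the backend, one possible workaround is to join on a scalar key derived from the list column instead of the list column itself. The following is only a sketch under a few assumptions: Explorer ~> 0.9 installed via Mix.install/1, list elements that are integers and never nil, and a temporary column name "list_key" that does not clash with existing columns. The add_list_key function is a hypothetical helper written for this example, not part of Explorer.

```elixir
# Assumption: running as a script or in Livebook, so Mix.install/1 is available.
Mix.install([{:explorer, "~> 0.9"}])

alias Explorer.DataFrame, as: DF
alias Explorer.Series

a = DF.new(%{
  numbers: [1, 2, 3, 4],
  lists: [[1, 2, 3], [1, 2, 3], [4, 5, 6], [4, 5, 6]]
})

b = DF.new(%{
  letters: ["a", "b", "c", "d"],
  lists: [[1, 2, 3], [1, 2, 3], [4, 5, 6], [4, 5, 6]]
})

# Hypothetical helper (not an Explorer function): encodes each list as a
# comma-separated string and stores it in a temporary "list_key" column.
# It goes through plain Elixir lists, so it is slow on large dataframes.
add_list_key = fn df ->
  keys =
    df["lists"]
    |> Series.to_list()
    |> Enum.map(&Enum.join(&1, ","))

  DF.put(df, "list_key", Series.from_list(keys))
end

# Drop the list column from the right side so only the string key overlaps.
right = b |> add_list_key.() |> DF.discard(["lists"])

joined =
  a
  |> add_list_key.()
  |> DF.join(right, on: ["list_key"])
  |> DF.discard(["list_key"])
```

Because each list value appears twice in both dataframes, the inner join pairs every matching row on the left with every matching row on the right, so `joined` has 8 rows here. The string encoding is only safe when distinct lists always produce distinct strings, which holds for flat lists of integers but would need more care for strings or nested lists.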