elixir-explorer / explorer

Series (one-dimensional) and dataframes (two-dimensional) for fast and elegant data exploration in Elixir
https://hexdocs.pm/explorer
MIT License
1.13k stars 123 forks source link

NIF Panic with duplicated key in rename #848

Closed pcapel closed 9 months ago

pcapel commented 9 months ago

Overview

It's a nit pick, but a NIF panic is a surprising error when a key is duplicated in the DataFrame.rename function. A more descriptive error would make a lot of sense.

It looks like there already some validation where this could fit in, happy to add support if this is deemed worthwhile.

Minimal reproduction via the livebook below.

Reproduce Panic

Mix.install([
  {:explorer, "~> 0.8.0"}
])

Section

alias Explorer.DataFrame, as: DF
require Explorer.DataFrame
Explorer.DataFrame
DF.new(%{key1: [1, 2, 3, 4], key2: [5, 6, 7, 8]})
|> DF.rename(
  key1: "first_key",
  key1: "second_key"
)
pcapel commented 9 months ago

I see that the linked function is for the rename which is acting against all columns. The reproduction would use the other clause, which is where the check would need to happen. Thinking something similar to maybe_raise_column_not_found?

billylanchantin commented 9 months ago

I agree we should validate before we pass off to Polars.

Old contents Also, I think there's a bug in the validation code? https://github.com/elixir-explorer/explorer/blob/main/lib/explorer/shared.ex#L225-L234 This ```elixir unless Map.has_key?(df.dtypes, name) do # ^^^^^^^^^ ``` should be ```elixir unless Map.has_key?(df.names, name) do # ^^^^^^^^ ``` right?

EDIT: No I was wrong about there maybe being a bug.