elixir-nx / nx

Multi-dimensional arrays (tensors) and numerical definitions for Elixir

Cannot transform dummy columns to Nx Tensors via Nx.stack #1469

Closed: shaqq closed this issue 6 months ago

shaqq commented 6 months ago

A reproduction of the bug is below, exported from Livebook in .livemd format. I believe the issue is that DF.dummies creates Series with dtype :u8, and for whatever reason Nx.stack cannot turn unsigned integers into tensors, even though Series.to_tensor works fine on the same columns.

Bug with Dummy column -> Nx.stack

Mix.install([
  {:nx, "~> 0.7.1"},
  {:explorer, "~> 0.8.1"}
])

Section

alias Explorer.Series
require Explorer.DataFrame, as: DF
Explorer.DataFrame
dummy = DF.new(a: ["one", "two"]) |> DF.dummies(:a)
#Explorer.DataFrame<
  Polars[2 x 2]
  a_one u8 [1, 0]
  a_two u8 [0, 1]
>
dummy |> Nx.stack() # this fails
** (ArgumentError) no tensors were given to concatenate
    (nx 0.7.1) lib/nx.ex:14629: Nx.concatenate/2
    #cell:pmfunzs2uywdpdfc:1: (file)
dummy[:a_one] |> Series.to_tensor() # manually piping to Series.to_tensor works, strangely
#Nx.Tensor<
  u8[2]
  [1, 0]
>
# forced to manually cast all dummy columns to f64...
dummy
|> DF.to_series()
|> Enum.map(fn {k, v} -> {k, Series.cast(v, :f64)} end) # :s64 and :f32 also work, but :s32 and :u8 do not
|> DF.new()
|> Nx.stack()
#Nx.Tensor<
  f64[2][2]
  [
    [1.0, 0.0],
    [0.0, 1.0]
  ]
>
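Since Series.to_tensor works on each dummy column individually, another possible workaround is to convert the columns one by one and stack the resulting tensors, which keeps the :u8 dtype instead of casting to :f64. This is only a sketch, assuming the same Nx 0.7.1 / Explorer 0.8.1 versions as above; the `tensors` and `stacked` names are just for illustration, and it is not necessarily how the underlying bug should be fixed.

```elixir
Mix.install([
  {:nx, "~> 0.7.1"},
  {:explorer, "~> 0.8.1"}
])

alias Explorer.Series
require Explorer.DataFrame, as: DF

dummy = DF.new(a: ["one", "two"]) |> DF.dummies(:a)

# Convert each column on its own (this path works for :u8),
# walking DF.names/1 so the column order is preserved.
tensors =
  for name <- DF.names(dummy) do
    dummy[name] |> Series.to_tensor()
  end

# Stack the per-column tensors along a new leading axis.
stacked = Nx.stack(tensors)
```

Here Nx.stack receives a plain list of tensors rather than the DataFrame itself, so it does not go through the DataFrame-to-tensor conversion that raises the error.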
shaqq commented 6 months ago

Let me know if this is actually an Explorer bug, but at first glance it strikes me as a problem with Nx.stack.

josevalim commented 6 months ago

Fixed in Explorer main, thank you!

shaqq commented 6 months ago

I believe this was the fix: https://github.com/elixir-explorer/explorer/commit/c4b391cc5d50efd1b579c6307cc25f466bb53a97

Thanks!!!