ankane / ruby-polars

Blazingly fast DataFrames for Ruby

BigDecimals are not handled correctly when constructing DataFrames in some circumstances #43

Closed pstalcup closed 10 months ago

pstalcup commented 10 months ago

When constructing DataFrames with BigDecimal inputs, you get inconsistent behavior depending on how you construct the DataFrame:

Polars::DataFrame.new({a: [BigDecimal("0.1e-2")] * 2})
# produces
# shape: (2, 1)
# ┌────────────┐
# │ a          │
# │ ---        │
# │ decimal[3] │
# ╞════════════╡
# │ 0.001      │
# │ 0.001      │
# └────────────┘
Polars::DataFrame.new([{a: BigDecimal("0.1e-2")}] * 2)
# produces
# shape: (2, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 1.0 │
# │ 1.0 │
# └─────┘

Passing an explicit schema produces the same results:

Polars::DataFrame.new({a: [BigDecimal("0.1e-2")] * 2}, schema: { "a" => Polars::Float64 })
Polars::DataFrame.new([{a: BigDecimal("0.1e-2")}] * 2, schema: { "a" => Polars::Decimal.new(3, 1) })

The Decimal type is marked as experimental, but I am surprised to get such incorrect results even when I am not asking for the decimal type.

ankane commented 10 months ago

Hi @pstalcup, on the latest version, the last three examples should return:

shape: (2, 1)
┌───────┐
│ a     │
│ ---   │
│ f64   │
╞═══════╡
│ 0.001 │
│ 0.001 │
└───────┘

The next release should also fix the data type:

shape: (2, 1)
┌──────────────┐
│ a            │
│ ---          │
│ decimal[*,3] │
╞══════════════╡
│ 0.001        │
│ 0.001        │
└──────────────┘

pstalcup commented 10 months ago

Thank you - I was able to update my version and verify that it was working correctly!
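
For anyone who cannot upgrade right away, one possible workaround is to convert BigDecimal values to Float before building the DataFrame. The sketch below is only an illustration: `floatify_decimals` is a hypothetical helper, not part of ruby-polars, and it assumes that losing decimal precision is acceptable for your data. It uses only Ruby's standard `bigdecimal` library.

```ruby
require "bigdecimal"

# Hypothetical helper (not part of ruby-polars): returns a copy of the rows
# with every BigDecimal value converted to a Float. Other values are left as-is.
def floatify_decimals(rows)
  rows.map do |row|
    row.transform_values { |v| v.is_a?(BigDecimal) ? v.to_f : v }
  end
end

rows = [{a: BigDecimal("0.1e-2")}] * 2
converted = floatify_decimals(rows)

# `converted` now holds plain Floats and could be passed to
# Polars::DataFrame.new(converted) once the polars-df gem is loaded.
```

If you need to keep exact decimal values, upgrading as described above is the better route, since converting to Float trades precision for predictability.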