Closed CoderJoshDK closed 1 day ago
Yes, type-coercion is required. Maybe we should make it private.
I have seen it mentioned somewhere else that you are thinking about this. However, I have run into a bug with the location of the decimal point in my columns that is only fixed when I disable type coercion. I have yet to create an example as simple as the one here, but please see #19871 for the bug I am talking about. Perhaps a footnote of some kind could be added to that optimization in the docs? Either way, even if this issue with the fill nulls is a "not going to fix," the other one is more of a real problem.
Then we must fix the bug. Type coercion is needed.
The bug I talked about above doesn't actually seem related to `type_coercion`. Disabling it coincidentally fixed the bug rather than being the cause. See #20013 for more details on the mentioned bug.
@ritchie46 feel free to close this issue if you are not going to make any changes based on the panic. Thank you for your time
Checks
Reproducible example
Log output
Issue description
When disabling `type_coercion` in a lazy frame and also doing a `fill_null`, the system panics. This isn't just for ints; the same happens for decimals. The obvious answer might seem to be "well, just don't disable `type_coercion`" ... but that doesn't work for me right now, since there is another bug that `type_coercion` is causing and I have to disable it to get around that.
Casting observations
If I use a normal collect, the output is:

```py
shape: (4, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 1   │
│ 2   │
│ 0   │
│ 3   │
└─────┘
```

However, if I try to cast `a` before filling it:

```py
>>> pl.LazyFrame({"a": [1, 2, None, 3]}).select(pl.col("a").cast(pl.Int64).fill_null(0)).collect(type_coercion=False)
pyo3_runtime.PanicException: implementation error, cannot get ref Int64 from Int32
```

And finally, if I cast it to (specifically) Int32, it doesn't panic:

```py
>>> pl.LazyFrame({"a": [1, 2, None, 3]}).select(pl.col("a").cast(pl.Int32).fill_null(0)).collect(type_coercion=False)
shape: (4, 1)
┌─────┐
│ a   │
│ --- │
│ i32 │
╞═════╡
│ 1   │
│ 2   │
│ 0   │
│ 3   │
└─────┘
```

But this seems ... wrong. So it got me thinking: "maybe it is the `0` in the `fill_null` that is not aligned." And that led me to doing:

```py
>>> pl.LazyFrame({"a": [1, 2, None, 3]}).select(pl.col("a").cast(pl.Decimal).fill_null(0)).collect(type_coercion=False)
pyo3_runtime.PanicException: implementation error, cannot get ref Decimal(None, Some(0)) from Int32
>>> pl.LazyFrame({"a": [1, 2, None, 3]}).select(pl.col("a").cast(pl.Decimal).fill_null(pl.lit(0).cast(pl.Decimal))).collect(type_coercion=False)
shape: (4, 1)
┌───────────────┐
│ a             │
│ ---           │
│ decimal[38,0] │
╞═══════════════╡
│ 1             │
│ 2             │
│ 0             │
│ 3             │
└───────────────┘
```

And yep, this looks to be roughly what is going on: the literal `0` is an Int32, and without type coercion it is never cast to the column's dtype. Regardless, we shouldn't be panicking.
Expected behavior
The `null` values should be filled. It should not panic no matter what. And if a proper solution is impossible, this should be documented or mentioned somewhere.
Installed versions