pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

Plugins raise when passing `pl.lit(1)` without specifying dtype #16021

Closed MarcoGorelli closed 1 week ago

MarcoGorelli commented 2 weeks ago

Reproducible example

I've made a simple plugin, from the cookiecutter, with:

def add_one(expr: IntoExpr) -> pl.Expr:
    expr = parse_into_expr(expr)
    return register_plugin(
        args=[expr],
        symbol="add_one",
        is_elementwise=True,
        lib=lib,
    )

and

use polars::prelude::*;
use pyo3_polars::derive::polars_expr;

#[polars_expr(output_type=Int64)]
fn add_one(inputs: &[Series]) -> PolarsResult<Series> {
    let out = inputs[0].i64()?.apply_values(|x| x + 1);
    Ok(out.into_series())
}

I then run

import polars as pl
from plugin_mcve import add_one

df = pl.DataFrame({
    'a': [1,2,3],
})
result = df.with_columns(a_plus_one = add_one('a'))
result = df.with_columns(a_plus_one = add_one(pl.lit(1)))
print(result)

Log output

thread '<unnamed>' panicked at crates/polars-arrow/src/ffi/schema.rs:482:35:
not implemented
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "/home/marcogorelli/plugin_mcve/run.py", line 8, in <module>
    result = df.with_columns(a_plus_one = add_one(pl.lit(1)))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/marcogorelli/plugin_mcve/.venv/lib/python3.11/site-packages/polars/dataframe/frame.py", line 8054, in with_columns
    return self.lazy().with_columns(*exprs, **named_exprs).collect(_eager=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/marcogorelli/plugin_mcve/.venv/lib/python3.11/site-packages/polars/lazyframe/frame.py", line 1810, in collect
    return wrap_df(ldf.collect())
                   ^^^^^^^^^^^^^
pyo3_runtime.PanicException: not implemented

Issue description

According to `git bisect`, this was introduced by #15832. I just wanted to check whether it's intentional: are users now required to always specify a dtype when passing `pl.lit` to plugins?

Expected behavior

Either an informative error message, or for it to work as before.

Installed versions

```
--------Version info---------
Polars:               0.20.23
Index type:           UInt32
Platform:             Linux-5.15.146.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Python:               3.11.9 (main, Apr 6 2024, 17:59:24) [GCC 11.4.0]

----Optional dependencies----
adbc_driver_manager:
cloudpickle:
connectorx:
deltalake:
fastexcel:
fsspec:
gevent:
hvplot:
matplotlib:
nest_asyncio:
numpy:
openpyxl:
pandas:
pyarrow:
pydantic:
pyiceberg:
pyxlsb:
sqlalchemy:
xlsx2csv:
xlsxwriter:
```