pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

`clip()` with lower bound > upper bound panics, but only in dev profile #19692

Open etiennebacher opened 5 days ago

etiennebacher commented 5 days ago

Checks

Reproducible example

import polars as pl

pl.DataFrame({"a": [1.0]}).select(pl.col("a").clip(lower_bound=10, upper_bound=0))

Log output

run ProjectionExec
thread '<unnamed>' panicked at /home/etienne/.cargo/registry/src/index.crates.io-6f17d22bba15001f/num-traits-0.2.19/src/lib.rs:403:5:
min must be less than or equal to max
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "/home/etienne/Desktop/Git/polars/foo.py", line 7, in <module>
    print(pl.DataFrame({"a": [1.0]}).select(pl.col("a").clip(10, 1))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/etienne/Desktop/Git/polars/py-polars/polars/dataframe/frame.py", line 9024, in select
    return self.lazy().select(*exprs, **named_exprs).collect(_eager=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/etienne/Desktop/Git/polars/py-polars/polars/lazyframe/frame.py", line 2021, in collect
    return wrap_df(ldf.collect(callback))
                   ^^^^^^^^^^^^^^^^^^^^^
pyo3_runtime.PanicException: min must be less than or equal to max

Issue description

I'm not sure whether this is a bug, but depending on the build profile, clip() either works or panics when lower_bound > upper_bound.

My issue here is that I contribute to r-polars, and when I want to test this behavior (in dev profile) I get a panic instead of a proper error. The issue disappears once I compile the package with the release profile, but then I can't run tests anymore. Therefore, I'd like to know if this is expected.

Expected behavior

Unsure, I'd say the behavior should be the same in dev and release profiles (but should it error or return 10 as in the release profile?).
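
Until the behavior is settled, one possible stopgap is to validate the bounds before they ever reach clip(). This is a minimal sketch that assumes the bounds are plain Python scalars (clip() also accepts expressions, which this does not cover). The helper name `check_clip_bounds` is made up for illustration and is not part of Polars.

```python
import math


def check_clip_bounds(lower=None, upper=None):
    """Hypothetical pre-check: raise ValueError for bounds that clip() may not handle."""
    for name, bound in (("lower_bound", lower), ("upper_bound", upper)):
        # NaN is a float, so only float bounds need this check.
        if isinstance(bound, float) and math.isnan(bound):
            raise ValueError(f"{name} must not be NaN")
    # Only compare when both bounds are given; a missing bound means "unbounded".
    if lower is not None and upper is not None and lower > upper:
        raise ValueError(
            f"lower_bound ({lower}) must be <= upper_bound ({upper})"
        )


# Valid bounds pass silently; invalid ones raise a regular Python error.
check_clip_bounds(lower=0, upper=10)
try:
    check_clip_bounds(lower=10, upper=0)
except ValueError as err:
    print(f"rejected: {err}")
```

Calling this right before `pl.col("a").clip(lower_bound=..., upper_bound=...)` would give the same error in both build profiles, at the cost of duplicating a check that arguably belongs in the library.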

Installed versions

```
--------Version info---------
Polars:              1.12.0
Index type:          UInt32
Platform:            Linux-6.8.0-47-generic-x86_64-with-glibc2.39
Python:              3.12.3 (main, Sep 11 2024, 14:17:37) [GCC 13.2.0]
LTS CPU:             False

----Optional dependencies----
adbc_driver_manager  1.2.0
altair               5.4.1
cloudpickle          3.1.0
connectorx           0.4.0
deltalake            0.21.0
fastexcel            0.12.0
fsspec               2024.10.0
gevent               24.10.3
great_tables         0.13.0
matplotlib           3.9.2
nest_asyncio         1.6.0
numpy                2.0.2
openpyxl             3.1.5
pandas               2.2.3
pyarrow              18.0.0
pydantic             2.9.2
pyiceberg
sqlalchemy           2.0.36
torch
xlsx2csv             0.8.3
xlsxwriter           3.2.0
```
etiennebacher commented 5 days ago

Note that the same thing happens if I pass a NaN as lower bound:

import polars as pl
import numpy as np

pl.DataFrame({"a": [1.0]}).select(pl.col("a").clip(np.nan))
eitsupi commented 5 days ago

Seems related to rust-num/num-traits#134
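
For context, here is a rough pure-Python model of what seems to be going on. It assumes the panic comes from a debug-only assertion inside the num-traits clamp helper, which is where the message and file path in the log point. The `debug` flag stands in for the build profile, and the function is an illustration, not the actual Rust code.

```python
def clamp(value, lo, hi, debug=True):
    """Hypothetical model of a clamp whose bounds check only runs in debug builds."""
    if debug:
        # Written as `not (lo <= hi)` so that a NaN bound also fails:
        # any comparison involving NaN evaluates to False.
        if not (lo <= hi):
            raise AssertionError("min must be less than or equal to max")
    if value < lo:
        return lo
    if value > hi:
        return hi
    return value


# "Release" behavior: no check, so the lower bound wins.
clamp(1.0, 10, 0, debug=False)  # returns 10

# "Dev" behavior: the assertion fires, for reversed bounds and for NaN alike.
for lo, hi in ((10, 0), (float("nan"), 5.0)):
    try:
        clamp(1.0, lo, hi, debug=True)
    except AssertionError as err:
        print(f"lo={lo}, hi={hi}: {err}")
```

Under this model, returning 10 in release and failing in dev are both consequences of the same function, and the NaN case fails for the same reason as the reversed bounds.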