abstractqqq / polars_ds_extension

Polars extension for general data science use cases
MIT License

Incorrect `Lstsq: #Data < #features` error #141

erinov1 closed this issue 2 months ago

erinov1 commented 2 months ago

Currently, the check at https://github.com/abstractqqq/polars_ds_extension/blob/b5b58776740d4627f1285423f7339d9ca2605fd7/src/num/ols.rs#L91-L94 is intended to raise an error if #Data < #features, or equivalently #Data + 1 <= #features. In practice, however, it raises the error when #Data <= #features + 1, because ncols counts the target column as well as the feature columns.

For example:

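import polars as pl
import polars_ds as pds

# One row, one feature, no bias term: #Data == #features, so this should not error
df = pl.DataFrame({"Y": [1], "X": [1]})
df.select(pds.query_lstsq(pl.col("X"), target=pl.col("Y"), add_bias=False))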

ComputeError: the plugin failed with message: Lstsq: #Data < #features. No conclusive result.
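
To make the off-by-one concrete, here is a minimal sketch that restates the two inequalities from this report in plain Python. The function names are hypothetical and this is not the code in ols.rs; it only compares the intended condition with the condition as described above.

```python
def intended_check(n_data: int, n_features: int) -> bool:
    """Return True when the solver should refuse: fewer rows than features."""
    return n_data < n_features


def reported_check(n_data: int, n_features: int) -> bool:
    """Return True under the condition described in this report,
    where the bound is shifted because the target column is also counted."""
    return n_data <= n_features + 1


# The example from this report: one row, one feature, no bias term.
n_data, n_features = 1, 1
print(intended_check(n_data, n_features))  # False: should be allowed
print(reported_check(n_data, n_features))  # True: error is raised anyway

# Small shapes on which the two conditions disagree.
mismatches = [
    (n, p)
    for n in range(1, 5)
    for p in range(1, 4)
    if intended_check(n, p) != reported_check(n, p)
]
print(mismatches)
```

Under these two conditions, the shapes that disagree are those with #features <= #Data <= #features + 1, which includes the single-row example above.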

abstractqqq commented 2 months ago

https://github.com/abstractqqq/polars_ds_extension/pull/143