pola-rs / pyo3-polars

Plugins/extension for Polars
MIT License
232 stars, 38 forks

perf: set compat_level when calling to_arrow #85

Closed ruihe774 closed 3 weeks ago

ruihe774 commented 1 month ago

This depends on https://github.com/pola-rs/polars/pull/17371.

In my naive benchmark with large string columns, it improves speed by about 10x.

ruihe774 commented 1 month ago

I'm still wondering how to ensure forward compatibility with future=True. When an old version of pyo3-polars is used together with a newer version of Python polars that has introduced incompatible changes to its Arrow data structures, things can break. One solution is to check the polars version before setting future=True; another is to introduce versioning, e.g. future=1.

As for backward compatibility, we can simply retry without future=True if the call fails.
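
A minimal Python sketch of that retry idea, for illustration only. `OldSeries` and `NewSeries` are hypothetical stand-ins rather than real polars classes, and the sketch assumes an older `to_arrow` rejects the unknown `future` keyword with a `TypeError`; the actual change in this PR lives on the Rust side and may look quite different.

```python
class OldSeries:
    """Hypothetical stand-in: to_arrow() predates the `future` keyword."""

    def to_arrow(self):
        return "legacy-layout"


class NewSeries:
    """Hypothetical stand-in: to_arrow() accepts the `future` keyword."""

    def to_arrow(self, future=False):
        return "new-layout" if future else "legacy-layout"


def to_arrow_with_fallback(series):
    """Ask for the newer layout first, then retry without the keyword."""
    try:
        return series.to_arrow(future=True)
    except TypeError:
        # Assumption: an older to_arrow() raises TypeError because it does
        # not know the `future` keyword, so retry with the plain call.
        return series.to_arrow()


if __name__ == "__main__":
    print(to_arrow_with_fallback(OldSeries()))
    print(to_arrow_with_fallback(NewSeries()))
```

One caveat with this approach: catching `TypeError` broadly could also hide an unrelated error raised inside `to_arrow`, which is part of why explicit versioning is attractive.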

ruihe774 commented 1 month ago

https://github.com/pola-rs/polars/pull/17421

ruihe774 commented 1 month ago

pola-rs/polars#17421

I've made this PR work with it. In PySeries::into_py and PySeries::extract_bound, the newest compatibility level supported by both sides is now negotiated and selected automatically. This gives both backward and forward compatibility.
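
To illustrate the negotiation idea, here is a rough Python sketch. It assumes compatibility levels are ordered non-negative integers, with 0 as the oldest layout; the function name `negotiate_compat_level` is hypothetical and is not taken from pyo3-polars or polars.

```python
def negotiate_compat_level(ours: int, theirs: int) -> int:
    """Return the newest level that both sides can handle.

    Assumption: each side supports every level up to and including its
    own maximum, so the shared maximum is the smaller of the two.
    """
    if ours < 0 or theirs < 0:
        raise ValueError("compatibility levels must be non-negative")
    return min(ours, theirs)


if __name__ == "__main__":
    # An older extension (max level 0) talking to a newer polars (max level 1)
    print(negotiate_compat_level(0, 1))
    # Both sides up to date
    print(negotiate_compat_level(1, 1))
```

Under these assumptions an old extension paired with a new polars falls back to the older layout, and a new extension paired with an old polars does the same, so neither side receives data it cannot read.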