[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of Polars.
Reproducible example
Unfortunately, I can't provide a minimal example, as I deal with large amounts of data I can't share, and this problem is only visible in such cases.
Log output
No response
Issue description
I noticed that after upgrading to polars 0.20.24 my large query started getting killed by OOM, on a 2TB machine. Previously, the query worked fine, consuming less than 200GB. The runs look like this:
Is there a minimal example that shows a memory increase? I need something with synthetic data to be able to understand what happens. It doesn't have to OOM, it just has to be a similar query.
The query looks like:
If, per https://github.com/pola-rs/polars/issues/15795, I put `collect` before `.filter` in `opps`, I get OOM even for 0.20.16.
Expected behavior
I would expect that the recent versions finish without OOM.
Installed versions