Open dalejung opened 1 week ago
drop_nulls materializes a new column without nulls. This requires allocating memory.
@ritchie46 But each subgroup is only 2 rows. Why would the drop_nulls version cause OOM?
df['price'].drop_nulls() doesn't eat up memory.
The version without drop_nulls uses ~5 GB of memory. The drop_nulls version eats up 64 GB before causing OOM.
Checks
Reproducible example
Log output
No response
Issue description
I noticed that I'm getting OOM when using drop_nulls() with a large amount of data. I get the expected memory usage when not using drop_nulls().
Expected behavior
No memory issues. I can't think of why drop_nulls() should increase memory usage, especially since each interval bar contains only 2 data points.
Installed versions