Closed 2010YOUY01 closed 1 week ago
This PR does indeed fix https://github.com/apache/datafusion/issues/10511 😀. I just tested the branch, and the code that crashes in main works perfectly here
Thanks @2010YOUY01 -- will look at this later today or tomorrow
One thing to mentioned is how fast this method is? as I believe the method will be called frequently
This is a very good point, I think when doing the same fix at arrow side, we should cache the result inside RecordBatch
(if they're immutable)
I will run some benchmarks
Thank you all for the feedbacks! I've updated the followings:
Which issue does this PR close?
First step to fix https://github.com/apache/datafusion/issues/13089
Rationale for this change
Now
record_batch.get_array_memory_size()
will overestimate memory usage: If multiple array are pointing to the same underlying buffer, they will be counted repeatedly. A more detailed explanation can be found in this PR's comment:This function is used for spilled execution to estimate physical memory usage, this overestimation caused many bugs in memory-limited sort/aggregation/join. For example, if there is a
RecordBatch
with 10 columns, all of 10 columns are sharing the sameBuffer
, thenrecord_batch.get_array_memory_size()
will return a 10X estimation, to make memory-limited query fail quite easily.I believe https://github.com/apache/datafusion/issues/13089 is caused by this issue, and likely https://github.com/apache/datafusion/issues/9417 https://github.com/apache/datafusion/issues/10511 https://github.com/apache/datafusion/issues/12136 https://github.com/apache/datafusion/issues/11390
What changes are included in this PR?
Introduced a new
get_record_batch_memory_size()
to avoid double count, by using a internalHashSet
to recognize reused buffers. While @waynexia is working on a comprehensive solution inarrow-rs
https://github.com/apache/arrow-rs/issues/6439, I think it's useful to introduce this temporary fix in DataFusion due to:record_batch.get_array_memory_size()
with memory overcounting, it's non trivial to fix all tests at once (manual memory tracking is tricky, when I was trying to make one external aggregate query to run, it took me a while to figure out why one test case fail after a change)record_batch.get_array_memory_size()
, and add regression tests for memory-limited query bugs. After we have a fix in arrow, the temporary fix function can be deprecated and replace with the origin one more easily.Are these changes tested?
Yes
Are there any user-facing changes?
No