Is your feature request related to a problem or challenge?
Follow on to https://github.com/apache/datafusion/pull/12969 and https://github.com/apache/datafusion/issues/12633
In https://github.com/apache/datafusion/issues/12633 @mhilton noted that joins sometimes generate giant record batches, which causes issues. @alihan-synnada fixed this in https://github.com/apache/datafusion/pull/12969, but internally the joins still sometimes generate giant output batches.
As @mhilton says in https://github.com/apache/datafusion/pull/12969#issuecomment-2418862655:
"Unfortunately this doesn't address the actual problem with creating giant batches, which is they require a lot of memory and that memory isn't accounted for in any MemoryPool. Wiring a MemoryReservation into BatchSplitter would probably be enough to address this though."
Describe the solution you'd like
I would like the memory accounting to take the large output batches that joins build internally into account.
Describe alternatives you've considered
Wiring a MemoryReservation into BatchSplitter would probably be enough to address this.
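To make the idea concrete, here is a rough, std-only sketch of what "wiring a reservation into the splitter" could look like. Everything in it is hypothetical: `ToyPool`, `ToyReservation`, and `ToySplitter` are simplified stand-ins for DataFusion's MemoryPool, MemoryReservation, and BatchSplitter, and a `Vec<u64>` stands in for a record batch. It only illustrates the accounting pattern (reserve the full size of the large batch up front, release it once all chunks are emitted), not the real DataFusion API.

```rust
// Hypothetical stand-ins that model the accounting idea only;
// these are NOT the real DataFusion types or signatures.
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// A fixed-size pool that tracks how many bytes are currently reserved.
pub struct ToyPool {
    limit: usize,
    used: AtomicUsize,
}

impl ToyPool {
    pub fn new(limit: usize) -> Arc<Self> {
        Arc::new(Self { limit, used: AtomicUsize::new(0) })
    }

    pub fn used(&self) -> usize {
        self.used.load(Ordering::SeqCst)
    }
}

/// A reservation against the pool; its bytes are returned on `free` or drop.
pub struct ToyReservation {
    pool: Arc<ToyPool>,
    size: usize,
}

impl ToyReservation {
    pub fn new(pool: &Arc<ToyPool>) -> Self {
        Self { pool: Arc::clone(pool), size: 0 }
    }

    /// Reserve `bytes` more, or fail without changing state if the pool limit would be exceeded.
    pub fn try_grow(&mut self, bytes: usize) -> Result<(), String> {
        let res = self.pool.used.fetch_update(Ordering::SeqCst, Ordering::SeqCst, |used| {
            used.checked_add(bytes).filter(|total| *total <= self.pool.limit)
        });
        match res {
            Ok(_) => {
                self.size += bytes;
                Ok(())
            }
            Err(used) => Err(format!(
                "cannot reserve {bytes} bytes: {used} of {} already used",
                self.pool.limit
            )),
        }
    }

    /// Return everything this reservation holds to the pool.
    pub fn free(&mut self) {
        self.pool.used.fetch_sub(self.size, Ordering::SeqCst);
        self.size = 0;
    }
}

impl Drop for ToyReservation {
    fn drop(&mut self) {
        self.free();
    }
}

/// Holds one large "batch" and hands it out in chunks of at most `batch_size` rows.
pub struct ToySplitter {
    rows: Vec<u64>,
    offset: usize,
    batch_size: usize,
    reservation: ToyReservation,
}

impl ToySplitter {
    pub fn try_new(
        rows: Vec<u64>,
        batch_size: usize,
        mut reservation: ToyReservation,
    ) -> Result<Self, String> {
        if batch_size == 0 {
            return Err("batch_size must be greater than zero".to_string());
        }
        // Account for the large batch before the splitter starts holding it.
        reservation.try_grow(rows.len() * std::mem::size_of::<u64>())?;
        Ok(Self { rows, offset: 0, batch_size, reservation })
    }

    pub fn next_chunk(&mut self) -> Option<Vec<u64>> {
        if self.offset >= self.rows.len() {
            // Fully drained: drop the large batch and release its reservation.
            self.rows = Vec::new();
            self.offset = 0;
            self.reservation.free();
            return None;
        }
        let end = (self.offset + self.batch_size).min(self.rows.len());
        let chunk = self.rows[self.offset..end].to_vec();
        self.offset = end;
        Some(chunk)
    }
}
```

With this shape, constructing the splitter fails with an error when the pool cannot cover the large batch, instead of the memory silently going unaccounted. Whether the real fix should reserve in the constructor, shrink the reservation as chunks are emitted, or do something else entirely is an open design question.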
Additional context
No response