apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
https://arrow.apache.org/
Apache License 2.0

[C++] Potential memory leak at shutdown if an exec plan with a scanner fails or is aborted immediately before shutdown #20338

Open asfimport opened 2 years ago

asfimport commented 2 years ago

I'm primarily creating this so we can remember to make a test for this. This problem should be solved as part of ARROW-16072. When the scanner fails, it simply discards its references to the various scanner AsyncGenerators. However, some I/O tasks may still hold references to these generators, so some parts of the scanner survive after the plan itself is marked complete. If shutdown follows immediately, those parts are never properly disposed of, even though the plan is marked complete, and they show up as a memory leak.
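
The ownership pattern described above can be illustrated with a minimal, self-contained sketch. This is not Arrow code: `GeneratorState`, `TaskQueue`, and `LiveStatesAfterPlanAbort` are hypothetical names invented for illustration, and a plain `std::shared_ptr` stands in for whatever reference counting the real AsyncGenerators use. It only shows how state captured by a pending task can outlive the owner that dropped its own reference.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for scanner generator state (not an Arrow type).
// It tracks how many instances are alive through an external counter.
struct GeneratorState {
  explicit GeneratorState(int* live) : live_count(live) { ++*live_count; }
  ~GeneratorState() { --*live_count; }
  int* live_count;
};

// Hypothetical stand-in for a queue of I/O tasks that have not run yet.
using TaskQueue = std::vector<std::function<void()>>;

// Returns how many GeneratorState objects are still alive after the "plan"
// has dropped its own reference. If drain_io_tasks is true, the pending
// I/O callbacks are destroyed first, which releases their references.
int LiveStatesAfterPlanAbort(bool drain_io_tasks) {
  int live = 0;
  TaskQueue pending_io;
  {
    // The "plan" owns one reference to the generator state.
    auto state = std::make_shared<GeneratorState>(&live);
    assert(live == 1);
    // A pending I/O task captures a second reference.
    pending_io.push_back([state] { (void)state; });
    // The plan fails here: leaving this scope discards only its reference.
  }
  if (drain_io_tasks) {
    pending_io.clear();  // Destroying the callbacks releases the state.
  }
  return live;
}
```

In this sketch, `LiveStatesAfterPlanAbort(false)` reports one surviving object even though the owning scope has already exited, which mirrors the situation where the plan is marked complete while I/O tasks still hold generator references. Calling it with `true` reports zero, since the pending callbacks are destroyed before the count is read.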

Example:

https://pipelines.actions.githubusercontent.com/serviceHosts/8bb0d999-3387-4c48-9fa6-c66c718a46e2/_apis/pipelines/1/runs/359690/signedlogcontent/4?urlExpires=2022-07-25T14%3A43%3A01.2797488Z&urlSigningMethod=HMACV1&urlSignature=GS3lS09Q9sTRweN%2B8UEu2GwUGc%2FbO9eyH27FRKumbrg%3D

Reporter: Weston Pace / @westonpace

Note: This issue was originally created as ARROW-17198. Please see the migration documentation for further details.

asfimport commented 2 years ago

Weston Pace / @westonpace: I'm able to reproduce this by compiling with ASAN, using stress to make the CPU busy, and running the following command:


while taskset -c 0,1 ./debug/arrow-dataset-scanner-test --gtest_filter=TestScannerThreading/TestScanner.FromReader/3Threaded2d16b1024r; do sleep 0.1; done