apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0

GH-41460: [C++] Use ASAN to poison temp vector stack memory #41695

Closed. zanmato1984 closed this 2 weeks ago

zanmato1984 commented 2 weeks ago

Rationale for this change

See #41460. This change also removes the overhead of the current manual poisoning (filling the entire stack space with 0xFF bytes), which happens even in release mode.

What changes are included in this PR?

Use the ASAN API to replace the current manual poisoning of the temp vector stack memory.
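As a rough illustration of the idea (not the code from this PR), the sketch below shows a bump-style temp stack that keeps unused memory poisoned through the manual-poisoning macros in `<sanitizer/asan_interface.h>`. The class `TempStackSketch`, the `SKETCH_*` macros, and the 8-byte size rounding are hypothetical names and simplifications chosen for the example; Arrow's actual temp vector stack differs. Without ASAN the macros expand to no-ops, so a release build pays nothing for them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Detect ASAN: GCC defines __SANITIZE_ADDRESS__, Clang exposes __has_feature.
#if defined(__SANITIZE_ADDRESS__)
#define SKETCH_HAS_ASAN 1
#elif defined(__has_feature)
#if __has_feature(address_sanitizer)
#define SKETCH_HAS_ASAN 1
#endif
#endif

#ifdef SKETCH_HAS_ASAN
#include <sanitizer/asan_interface.h>
#define SKETCH_POISON(addr, size) ASAN_POISON_MEMORY_REGION(addr, size)
#define SKETCH_UNPOISON(addr, size) ASAN_UNPOISON_MEMORY_REGION(addr, size)
#else
// No-ops when ASAN is not enabled.
#define SKETCH_POISON(addr, size) ((void)(addr), (void)(size))
#define SKETCH_UNPOISON(addr, size) ((void)(addr), (void)(size))
#endif

// Hypothetical stand-in for a temp vector stack (assumption, not Arrow's class).
class TempStackSketch {
 public:
  explicit TempStackSketch(std::size_t capacity) : buffer_(capacity), top_(0) {
    // Nothing is allocated yet, so the whole buffer starts out poisoned.
    SKETCH_POISON(buffer_.data(), buffer_.size());
  }
  TempStackSketch(const TempStackSketch&) = delete;
  TempStackSketch& operator=(const TempStackSketch&) = delete;
  ~TempStackSketch() {
    // Unpoison before the vector returns its storage to the allocator.
    SKETCH_UNPOISON(buffer_.data(), buffer_.size());
  }

  // Returns nullptr if the request does not fit.
  std::uint8_t* Alloc(std::size_t num_bytes) {
    const std::size_t padded = Padded(num_bytes);
    if (padded > buffer_.size() - top_) return nullptr;
    std::uint8_t* out = buffer_.data() + top_;
    SKETCH_UNPOISON(out, padded);  // only the live region becomes addressable
    top_ += padded;
    return out;
  }

  // Must be called in reverse order of Alloc (stack discipline).
  void Release(std::uint8_t* ptr, std::size_t num_bytes) {
    const std::size_t padded = Padded(num_bytes);
    assert(ptr + padded == buffer_.data() + top_);
    SKETCH_POISON(ptr, padded);  // any later access is reported by ASAN
    top_ -= padded;
  }

  std::size_t used() const { return top_; }

 private:
  // Round up to 8 bytes, matching ASAN's shadow granularity (a simplification).
  static std::size_t Padded(std::size_t n) {
    return (n + 7) & ~static_cast<std::size_t>(7);
  }

  std::vector<std::uint8_t> buffer_;
  std::size_t top_;
};
```

Under a build with `-fsanitize=address`, reading or writing a region after `Release`, or touching memory above the current top of the stack, should produce a use-after-poison report instead of silently reading filler bytes.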

Are these changes tested?

I wanted to add test cases asserting that ASAN poisoning/unpoisoning works. However, I found it too tricky to catch an ASAN error, because ASAN directly uses signals, which are hard to interoperate with in C++/gtest. So I manually verified locally that poisoning works, e.g. by intentionally not unpoisoning the allocated buffer and seeing ASAN complain.

Beyond that, this relies on existing test cases that use the temp stack, such as the Acero tests, which should cover this change well.
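For reference, one way to inspect poisoning without triggering an ASAN report is to query the shadow state through `__asan_address_is_poisoned` from `<sanitizer/asan_interface.h>`. The sketch below is not part of this PR; the helper names (`PoisonRegion`, `UnpoisonRegion`, `QueryPoison`) and the `PoisonState` enum are hypothetical. Without ASAN the query simply reports that the state is unknown.

```cpp
#include <cstddef>
#include <vector>

// Detect ASAN: GCC defines __SANITIZE_ADDRESS__, Clang exposes __has_feature.
#if defined(__SANITIZE_ADDRESS__)
#define QUERY_SKETCH_HAS_ASAN 1
#elif defined(__has_feature)
#if __has_feature(address_sanitizer)
#define QUERY_SKETCH_HAS_ASAN 1
#endif
#endif

#ifdef QUERY_SKETCH_HAS_ASAN
#include <sanitizer/asan_interface.h>
#endif

// Hypothetical result type for the example.
enum class PoisonState { kUnknown, kPoisoned, kAddressable };

inline void PoisonRegion(void* addr, std::size_t size) {
#ifdef QUERY_SKETCH_HAS_ASAN
  ASAN_POISON_MEMORY_REGION(addr, size);
#else
  (void)addr;
  (void)size;
#endif
}

inline void UnpoisonRegion(void* addr, std::size_t size) {
#ifdef QUERY_SKETCH_HAS_ASAN
  ASAN_UNPOISON_MEMORY_REGION(addr, size);
#else
  (void)addr;
  (void)size;
#endif
}

// Asks ASAN whether a single byte is poisoned. This only reads shadow memory,
// so it does not raise an ASAN error even when the byte is poisoned.
inline PoisonState QueryPoison(const void* addr) {
#ifdef QUERY_SKETCH_HAS_ASAN
  return __asan_address_is_poisoned(addr) ? PoisonState::kPoisoned
                                          : PoisonState::kAddressable;
#else
  (void)addr;
  return PoisonState::kUnknown;  // no shadow memory to consult
#endif
}
```

A test built on this would assert `kPoisoned` for released regions and `kAddressable` for live ones, and skip the check when the state is `kUnknown`. It verifies the bookkeeping only; it does not prove that an actual stray access would be reported.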

Are there any user-facing changes?

None.

github-actions[bot] commented 2 weeks ago

:warning: GitHub issue #41460 has been automatically assigned in GitHub to PR creator.

zanmato1984 commented 2 weeks ago

Hi @felipecrv, this is a follow-up to our discussion in https://github.com/apache/arrow/pull/41335#discussion_r1583226550. I think you might want to take a look. Thanks.

felipecrv commented 2 weeks ago

Thank you for doing this! ASAN is so cool.

conbench-apache-arrow[bot] commented 2 weeks ago

After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit dcdf4e6953b7fdab6078c444c8d07a606750fec1.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 9 possible false positives for unstable benchmarks that are known to sometimes produce them.