Open 54aefcd4-c07d-4252-8441-723563c8826f opened 8 years ago
Yes, p0137 has landed. I think you're right that we could probably support a local conservative analysis for this, perhaps as part of ubsan.
Richard, did P0137 land yet?
I was thinking that if it did land, one could at implement a clang warning for a single TU. That doesn't cover all cases (neither cross TUs nor run-time), but one of the most common uses of accessing an inactive member of a union is to perform type punning, and often enough this is done "inline" (inside the same block), and unconditionally (without branches), so clang could at least warn in these cases.
That is something that clang might want to do independently of trying to catch all the issues with a sanitizer or cross TU static analysis in the future.
This is currently unimplementable because the C++ standard is hopelessly unclear on what the active member of a union is and how to change it. Once P0137 lands (see wg21.link/p0137r0) we'll have rules that can be used to model this.
Note that this will /not/ be part of UBSan: it needs to be a separate sanitizer because (as far as we are aware) it cannot be implemented in a way that can be applied to individual TUs and without invasive changes to the memory layout of the program. We need shadow memory or something similar to track active union members, and we need the entire program to be sanitized in order to catch all the cases that change the active member.
Extended Description
Just discussed in this range-v3 issue: https://github.com/ericniebler/range-v3/issues/239
both Eric and Casey give good examples. Here Eric's example:
template<typename F, typename S> struct pair_data { union { F non_constfirst; F const first; }; union { S non_constsecond; S const second; }; };
template<typename F, typename S> struct pair : private pair_data<F, S> { using pair_data<F, S>::first; using pair_data<F, S>::second; pair() : pair_data<F, S>{} {} pair(F f, S s) : pair_data<F, S>{f, s} {} ~pair() { first.~F(); second.~S(); } };
int main() { pair<int, float> f{1, 3.14f}; return f.first; // UB // more UB: destructors of the non-active members (first, second) are called // instead of the destructors of the active members (first, second) }
Aggregate initialization of the union initialize the first member "first_". Accessing first thus access a non-active member of the union and results in undefined behavior.