Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

Copy construction performs unsolicited reads of volatile derived-class members in padding #25473

Open Quuxplusone opened 9 years ago

Quuxplusone commented 9 years ago
Bugzilla Link PR25474
Status NEW
Importance P normal
Reported by hstong@ca.ibm.com
Reported on 2015-11-10 09:57:45 -0800
Last modified on 2015-11-10 20:46:44 -0800
Version trunk
Hardware All Linux
CC dgregor@apple.com, llvm-bugs@lists.llvm.org, richard-llvm@metafoo.co.uk, rjmccall@apple.com, rnk@google.com, spatel+llvm@rotateright.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also
It is observed that Clang generates reads into the tail padding of non-POD
classes, which is unsafe because said tail padding may (in the complete object)
be associated with volatile members.

The following code demonstrates the issue on typical x86 Linux systems. A
segmentation fault does not occur when the members of the A subobject are
"manually" copied, but the code generated by Clang for the copy construction
causes a segmentation fault.

Online compiler: http://melpon.org/wandbox/permlink/91iSbt94RCoqH9d8

### REPRODUCTION SCRIPT:
clang++ -x c++ -std=c++11 -<<'EOF' &&
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <new>

constexpr long pgsz = 4096;

struct A {
  ~A() {}
  struct AA {
    alignas(pgsz * 2) char aa[pgsz * 2];
  } aa;
  struct AB {
    char ab[pgsz];
  } ab;
};

struct B : A {
  struct BA {
    volatile char ba[pgsz];
  } ba;
};

static_assert(sizeof(A) == pgsz * 4, "");
static_assert(sizeof(B) == sizeof(A), "");

int main(void) {
  void *paLHS, *pbLHS;
  if (posix_memalign(&paLHS, alignof(A), sizeof(A)))
    abort();
  if (posix_memalign(&pbLHS, alignof(B), sizeof(B)))
    abort();
  B &bLHS = *new (pbLHS) B;

  int fd = open("pgszX6", O_RDONLY);
  if (fd < 0)
    abort();

  void *pb1 = mmap(0, pgsz * 6, PROT_READ, MAP_PRIVATE, fd, 0);
  if (!pb1)
    abort();

  void *pb0 = reinterpret_cast<void *>(
      (reinterpret_cast<intptr_t>(pb1) + (pgsz * 2 - 1)) / (pgsz * 2) *
      (pgsz * 2));
  B *pb = new (pb0) B;

  if (mprotect(&pb->ba, pgsz, PROT_NONE))
    abort();

  fprintf(stderr, "Page size: %ld\n", sysconf(_SC_PAGESIZE));

  fprintf(stderr, "About to perform assignment member-by-member...\n");
  bLHS.aa = pb->aa;
  bLHS.ab = pb->ab;

  fprintf(stderr, "About to perform copy construction...\n");
  new (paLHS) A(*static_cast<A *>(pb));
}
EOF
dd ibs=$(( 4096 * 6 )) count=1 if=/dev/zero of=pgszX6 && ./a.out

### ACTUAL OUTPUT:
Page size: 4096
About to perform assignment member-by-member...
About to perform copy construction...
prog.sh: line 64:    30 Segmentation fault      ./a.out

### EXPECTED OUTPUT (no segmentation fault):
Page size: 4096
About to perform assignment member-by-member...
About to perform copy construction...

### COMPILER VERSION INFO (clang++ -v):
clang version 3.8.0 (trunk 252469)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/llvm-head/bin
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6.3
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Selected multilib: .;@m64
Quuxplusone commented 9 years ago
I think the read of tail padding here is permitted (but I'd like John to
confirm); under our (implementation-defined) model for what constitutes a
volatile access, I believe this is not one.

However:

1) it seems really dumb that we're copying 4096 bytes more than we need to.
2) this causes miscompiles in related cases, such as (for x86_64):

struct A {
  ~A() {}
  void *p;
  int n;
};
struct B : A {
  int m = 1;
} b;
struct C : A {
  C() : A(b) {}
  int m;
};
struct D : C {};
D d = D();
int main() { return d.m; }

This is required to result in d.p = nullptr, d.n = 0, d.m = 0, because D()
performs zero-initialization prior to invoking the default constructor of D.
However, we actually initialize d.m to 1, because we memcpy the tail-padding of
the A object when constructing C's base class.

So, we *must not* copy tail padding, at least when emitting a (trivial) base
subobject copy/move constructor. The !tbaa.struct metadata is not sufficient to
avoid the problem, because it is discardable (and indeed is ignored in this
case).

Also of note: the use of pgsz within the alignas causes us to assert (but that
is unrelated to this issue, and looks like bug#13986).
Quuxplusone commented 9 years ago
I certainly agree that Richard's example is a miscompile; when constructing a
base-class subobject, we need to not write more bytes than the base-subobject
size.  (I'm quite surprised that we have bugs here, actually; I thought I
remembered fixing this.  Maybe there was a case I missed, or which I failed to
test adequately and was broken later.)

The original example is interesting.  Writing the complete-object size to the
destination is certainly legal, so the remaining question is whether it's legal
to perform a read from the source that might overlap a volatile object in a
subclass.  I'm pretty sure it's legal according to the memory model; the read
doesn't interfere with concurrent accesses on any architecture I'm aware of,
and we don't care about it being interfered with by other concurrent accesses
because it's only used to initialize padding.  Volatile requires us to perform
volatile accesses exactly as given, but it's arguable that that shouldn't
constrain us from performing additional accesses that aren't observable under
the standard and under our declared rules of implementation-defined volatile
behavior.  memprotecting random chunks of the heap is well outside of the
standard, and I agree that our implementation-defined behavior doesn't have to
honor memory-mapping tricks internal to an object.

Normally we're very conservative about optimizing around explicit uses of
volatile, but that's not the case here; here we're being asked to be more
conservative purely because there might be an unknown use of volatile.

And this is an important optimization.  Well, no, in this example it's clearly
a pessimization, of course, but that's just an artifact of the enormous
alignment here; imagine e.g. copying a structure like this:
  struct NonPOD { ~NonPOD(); void *ptr; char array[sizeof(void*) - 1]; };
Having to copy exactly 7 (or 15) bytes instead of 8 (or 16) just because the
source might be a base subobject of a class that begins with a volatile char
would be pretty awful.

All that said, I would be happy to help come up with a rule that tried to
simultaneously address the pessimization and the mis-compile for pages internal
to a structure.  For example, we could copy
  min(<complete-object size>, roundUpToAlignment(<base-subobject size>, min(<base-subobject alignment>, max(<target max vector size>, <target pointer size>)))
This would generally not roll over into new pages allocated in subclasses, and
it would very strictly bound the number of extra bytes the copy would perform.