llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

struct array duplication #16839

Open llvmbot opened 11 years ago

llvmbot commented 11 years ago
Bugzilla Link: 16465
Version: 3.2
OS: Linux
Attachments: test case, generated bitcode
Reporter: LLVM Bugzilla Contributor

Extended Description

A struct:

struct s_t {
    int a;
    int b;
    int c;
    s_t *next;
    int d;
    int e;
    char arr[4];
};

is translated by dragonegg (at -O0) into:

%struct.s_t = type { i32, i32, i32, %struct.s_t*, i32, i32, [4 x i8], [4 x i8] }

with two [4 x i8] arrays at the end instead of the single one for arr. Clang creates:

%struct.s_t = type { i32, i32, i32, %struct.s_t*, i32, i32, [4 x i8] }
llvmbot commented 11 years ago

OK, maybe I can tweak the logic to not add this padding since it isn't useful.

llvmbot commented 11 years ago

OK, thanks, I wasn't sure about it. As a user, one already has to deal with the ABI and type changes (struct splitting, byval, sret, ...), but a completely new type in the signature is kind of awkward, especially with the getElementOffset() functionality in LLVM's StructLayout class. I use files that contain the function signatures for annotation purposes, which means clang and dragonegg need different files. :(

llvmbot commented 11 years ago

Those look like padding bytes, used to increase the size of the struct to the size GCC says should be used for it. I'm sure you'd get the same if you changed the definition of arr to "int arr;". In other words, it is not a duplicated field; it is an extra field.

More precisely, what is probably happening is this:

1) You are on a 64-bit machine, so "next" is 8 bytes wide and must be aligned on a multiple of 8 bytes. That in turn makes the whole struct require 8-byte alignment.

2) The fields are laid out in memory as follows:

  a : bytes 0 ... 3
  b : bytes 4 ... 7
  c : bytes 8 ... 11
  next : bytes 16 ... 23
  d : bytes 24 ... 27
  e : bytes 28 ... 31
  arr : bytes 32 ... 35

Note the gap between c and next, due to next needing to start at a multiple of 8 bytes.

3) As laid out above, the fields end at byte 36. GCC always rounds struct sizes up to a multiple of the alignment (LLVM does too, as long as the struct is not packed). With an alignment of 8, the size becomes 40. Dragonegg makes that size explicit by adding an extra fake field to the LLVM type:

  padding : bytes 36 ... 39

This is completely harmless.
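
The three steps above can be checked against whatever compiler is at hand. The following is only a sketch: the helper names (end_of_fields, tail_padding, print_layout, check_size_rounding) are made up for illustration, and the concrete numbers in the explanation assume a 64-bit target with 4-byte int and 8-byte pointers.

```cpp
#include <cassert>   // assert
#include <cstddef>   // offsetof, std::size_t
#include <cstdio>    // std::printf

// Same definition as in the original report.
struct s_t {
    int a;
    int b;
    int c;
    s_t *next;
    int d;
    int e;
    char arr[4];
};

// Offset one past the last byte of arr: where the fields end,
// before any tail padding is added. (Hypothetical helper.)
constexpr std::size_t end_of_fields() {
    return offsetof(s_t, arr) + sizeof(s_t::arr);
}

// Bytes the compiler adds after arr so that sizeof(s_t) becomes
// a multiple of alignof(s_t). (Hypothetical helper.)
constexpr std::size_t tail_padding() {
    return sizeof(s_t) - end_of_fields();
}

// Prints the layout chosen by the host compiler.
void print_layout() {
    std::printf("c    at offset %zu\n", offsetof(s_t, c));
    std::printf("next at offset %zu\n", offsetof(s_t, next));
    std::printf("arr  at offset %zu\n", offsetof(s_t, arr));
    std::printf("fields end at %zu, sizeof %zu, alignof %zu, tail padding %zu\n",
                end_of_fields(), sizeof(s_t), alignof(s_t), tail_padding());
}

// Holds on any target: the struct size is rounded up to its alignment.
void check_size_rounding() {
    assert(sizeof(s_t) % alignof(s_t) == 0);
}
```

Calling print_layout() from a main function on a typical x86-64 Linux target should report next at offset 16, the fields ending at 36, and a size of 40, i.e. 4 bytes of tail padding that exist whether or not the IR type spells them out.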

The type created by Clang is also 40 bytes wide; it just doesn't make the padding explicit.
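
To see why the explicit padding field changes neither the total size nor the offsets of the real fields, here is a small model of natural-alignment struct layout. It is a sketch under stated assumptions, not LLVM's StructLayout implementation: Field, Layout, lay_out and the other names are invented for this example, and the sizes and alignments are hard-coded for a 64-bit target (i32 is 4/4, the pointer is 8/8, [4 x i8] is 4/1).

```cpp
#include <cassert>   // assert
#include <cstddef>   // std::size_t
#include <vector>

// Size and alignment of one struct element, in bytes. (Hypothetical type.)
struct Field {
    std::size_t size;
    std::size_t align;
};

// Result of laying out a non-packed struct. (Hypothetical type.)
struct Layout {
    std::vector<std::size_t> offsets;  // offset of each element
    std::size_t size;                  // total size, rounded up to align
    std::size_t align;                 // largest element alignment
};

std::size_t round_up(std::size_t value, std::size_t align) {
    return (value + align - 1) / align * align;
}

// Places each element at the next offset that satisfies its alignment,
// then rounds the total size up to the struct alignment.
Layout lay_out(const std::vector<Field> &fields) {
    Layout result{{}, 0, 1};
    std::size_t offset = 0;
    for (const Field &f : fields) {
        offset = round_up(offset, f.align);
        result.offsets.push_back(offset);
        offset += f.size;
        if (f.align > result.align) result.align = f.align;
    }
    result.size = round_up(offset, result.align);
    return result;
}

// { i32, i32, i32, %struct.s_t*, i32, i32, [4 x i8] }
std::vector<Field> clang_fields() {
    return {{4, 4}, {4, 4}, {4, 4}, {8, 8}, {4, 4}, {4, 4}, {4, 1}};
}

// Same as above plus the trailing [4 x i8] padding element.
std::vector<Field> dragonegg_fields() {
    std::vector<Field> fields = clang_fields();
    fields.push_back({4, 1});
    return fields;
}

// True if both types agree on the total size and on the offset of
// every element that corresponds to a real source-level field.
bool real_fields_agree() {
    Layout c = lay_out(clang_fields());
    Layout d = lay_out(dragonegg_fields());
    for (std::size_t i = 0; i < c.offsets.size(); ++i)
        if (c.offsets[i] != d.offsets[i]) return false;
    return c.size == d.size;
}

void check_model() {
    assert(real_fields_agree());
}
```

In this model both types come out at 40 bytes, and the extra element lands at offset 36, exactly where the implicit tail padding would be. The only difference a consumer sees is an eighth element index that has no counterpart in the source.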