merces / libpe

The PE library used by @merces/pev
http://pev.sf.net
GNU Lesser General Public License v3.0

what is `#pragma pack` when should I use this in my C library? #15

Closed boddumanohar closed 7 years ago

boddumanohar commented 7 years ago

Since I am new to C projects, I am not sure when I should use #pragma pack. All the header files use #pragma pack(push, <somenumber>) and #pragma pack(pop).

I've read this page https://msdn.microsoft.com/en-us/library/2e70t5y1.aspx but it is still not clear to me. Can you please explain in simple terms:

  1. When should I use this?
  2. What if I don't use this?

Thanks, Manohar.

jweyrich commented 7 years ago

The processor can naturally access (read/write) data that is aligned in memory, but non-aligned data can require extra CPU cycles, because the access may have to be split in two and the pieces shifted into place. As an example, the natural unit of access on x86 (32-bit) processors is 4 bytes (32 bits), on 4-byte boundaries. So if a struct member is not aligned and you want to read or change its value, the processor may spend these extra cycles, and some other architectures refuse misaligned accesses altogether and raise a fault. Padding avoids that cost by making the struct larger, so it's a tradeoff between performance and memory usage.
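To make "aligned" concrete, here is a minimal sketch of an alignment check. The helper name is_aligned is made up for this example (it is not part of libpe), and it assumes the alignment passed in is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the address in ptr is a multiple of
 * align, 0 otherwise. align must be a non-zero power of two. */
int is_aligned(const void *ptr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    /* For a power of two, the low bits of an aligned address are all zero. */
    return ((uintptr_t)ptr & (align - 1)) == 0;
}
```

For instance, the address of a uint32_t variable would normally pass is_aligned(p, 4), while the same address plus one byte would not.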

#pragma pack is a directive used to tell the compiler how to align struct/class members in memory. The compiler uses a default alignment, which varies depending on the architecture and/or compilation flags. You can override the default alignment by using #pragma pack.
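As a rough sketch of how the push/pop pair scopes the override: the struct names before_t, packed_t and after_t are invented for illustration, and the code assumes a C11 compiler that supports #pragma pack (GCC, Clang and MSVC all do).

```c
#include <assert.h>  /* static_assert (C11) */

/* Declared before the pragma: the compiler's default alignment applies. */
typedef struct { char tag; int value; } before_t;

#pragma pack(push, 1)   /* save the current setting, then pack to 1-byte boundaries */
typedef struct { char tag; int value; } packed_t;
#pragma pack(pop)       /* restore the setting that was active before the push */

/* Declared after the pop: back to the default alignment. */
typedef struct { char tag; int value; } after_t;

/* With a packing of 1 there is no padding, so the size is the sum of the members. */
static_assert(sizeof(packed_t) == sizeof(char) + sizeof(int),
              "packed_t should contain no padding");

/* before_t and after_t have identical members and identical packing. */
static_assert(sizeof(before_t) == sizeof(after_t),
              "pop should restore the previous packing");
```

Only the declarations between push and pop are affected, which is why headers usually wrap just the structs that must match an external layout.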

Let's say you have typedef struct { char a; int b; } foo_t;. According to the language definition, sizeof(char) MUST be 1, and sizeof(int) varies depending on the architecture/compiler combination. On most 32-bit and 64-bit platforms it is 4 bytes, and the exact range it can hold is given by INT_MIN and INT_MAX as defined in the header <limits.h>.

So given that sizeof(char) == 1 and sizeof(int) == 4, you'd expect sizeof(foo_t) to be 5, right? We assume that the size of a struct should be equal to the sum of the sizes of its members. But that's not always true. In fact, sizeof(foo_t) above will most likely be 8 instead of 5. This is due to padding that the compiler adds to the struct: 3 unused bytes inserted between the 2 members, a (1 byte) and b (4 bytes), so that b starts on a 4-byte boundary and both members are aligned.
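The padding can be measured rather than guessed. This is a minimal sketch using offsetof on the same foo_t; the function names are invented for the example, and the 8-byte result assumes a typical platform where int is 4 bytes with 4-byte alignment.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */

/* The struct from the explanation above, with default alignment. */
typedef struct {
    char a;
    int  b;
} foo_t;

/* Bytes of padding the compiler placed between a and b. */
size_t foo_padding_after_a(void)
{
    return offsetof(foo_t, b) - (offsetof(foo_t, a) + sizeof(char));
}

/* Total padding: the struct's size minus the sum of its members' sizes. */
size_t foo_total_padding(void)
{
    return sizeof(foo_t) - (sizeof(char) + sizeof(int));
}

/* b always starts on a multiple of its own alignment; that is why the gap exists. */
void check_foo_layout(void)
{
    assert(offsetof(foo_t, b) % _Alignof(int) == 0);
}
```

On a typical x86 or x86-64 build, foo_padding_after_a() would be expected to return 3, which accounts for the jump from 5 to 8 bytes.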

Now back to why we need #pragma pack - The PE/COFF specification tells us exactly how the header members are laid out (on disk or in memory), and we have some scenarios like foo_t above where members are not naturally aligned, so we have to "tell the compiler" that we want to align the data differently - In other words, we don't want padding, because the PE specification tells us that there is no padding between data members. That is the purpose of #pragma pack.
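Here is a small sketch of the idea, under stated assumptions: disk_record_t and read_record are hypothetical names, and the 6-byte layout is made up for illustration rather than taken from the PE specification or from libpe. It assumes a C11 compiler and reads multi-byte fields in host byte order.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk record, NOT a real PE structure: a 2-byte tag
 * followed immediately by a 4-byte value, 6 bytes in total. */
#pragma pack(push, 1)
typedef struct {
    uint16_t tag;
    uint32_t value;
} disk_record_t;
#pragma pack(pop)

/* Without the pragma this would most likely be 8 and would no longer
 * match the bytes in the file. */
static_assert(sizeof(disk_record_t) == 6, "record must match the file layout");

/* Copy one record out of a raw file buffer.
 * Returns 0 on success, -1 if the buffer is too short. */
int read_record(const unsigned char *buf, size_t len, disk_record_t *out)
{
    if (len < sizeof *out)
        return -1;
    memcpy(out, buf, sizeof *out);
    return 0;
}
```

Because the struct has exactly the same size and member offsets as the bytes in the file, a single memcpy fills every field. With default alignment, value would sit at offset 4 and pick up the wrong bytes.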

I'd write some examples, but someone already did - and may have explained it better than me, so I suggest you also read the following answer - https://stackoverflow.com/a/3318475/298054

If you have further questions regarding the C language, I'd recommend posting them on Stack Overflow (stackoverflow.com) - If you're not familiar with Stack Overflow, it's a Q&A site, the C community there is quite strong, and you'll likely get multiple answers to your question. Just be sure to make a quick search before posting, otherwise they'll close it as Duplicate (which is not a punishment, it's just their way of saying "we already answered that").

Hope this helps!

footnote: I'm just a regular user @ stackoverflow - I don't receive any $ from them.

boddumanohar commented 7 years ago

So when I use #pragma pack(push, 1), the size of the above struct will be 5. So we use this to selectively disable padding. And we need to do this here because the PE specification tells us that there is no padding between data members.

Clarified. A thousand thanks for such a clear explanation :)