madler / zlib

A massively spiffy yet delicately unobtrusive compression library.
http://zlib.net/

uInt and int usage causing alignment faults in aarch64 #874

Closed christopherpow closed 10 months ago

christopherpow commented 10 months ago

When I compile and run this source on aarch64, I get data alignment exceptions that appear to be caused by the following:

zconf.h: typedef unsigned int uInt

and three "int" type variables in deflate.h: int heap[2*L_CODES+1]; int heap_len; int heap_max;

At several places in the disassembly, the code tries to store a 64-bit register value at an address that is only 32-bit aligned (ending in 0x4 or 0xC), which causes the alignment exception. As an aside, I am not yet sure why it picks the 64-bit register x4 rather than w4. For example:

stur x4, [x19, 180] // s->max_lazy_match = configuration_table[s->level].max_lazy; x19 contains an 8-byte aligned address. 180 (0xB4) is the offset of max_lazy_match, so the target address is 4-byte aligned but not 8-byte aligned.

Changing the "int" to "long" works, but I'm not convinced it is an appropriate change.

madler commented 10 months ago

What compiler are you using? (I have no issues with clang on an aarch64 Apple M1.)

christopherpow commented 10 months ago

gcc version 11.2.0

madler commented 10 months ago

I'm not finding a gcc 11 on the compile farm I use, so I can't test this.

While I am reluctant to jump to such conclusions, I'm going to claim compiler bug here. The code is getting a short (16-bit) value and storing it in an int (32-bit value) in a structure. The compiler chose to put that 32-bit value on an odd 32-bit boundary, which is perfectly fine, and then that same compiler chose to attempt to store a 64-bit value there for no apparent reason. I don't see anything wrong with the code. There are no pointer-casting shenanigans there. Just straight C structure access.

christopherpow commented 10 months ago

I agree it is strange that it uses a 64-bit store. I tried with optimizations off and it doesn't do that, so I will weigh whether to make the changes I proposed in my first comment or to experiment with optimization settings to see what, if anything, other than -O0 works. I saw the note about the Irix compiler optimization bug, which sent me down that rabbit hole. Thanks for the quick response!

madler commented 10 months ago

You could also try a more recent version of gcc. They're at 13 now.