windelbouwman / ppci

A compiler for ARM, X86, MSP430, xtensa and more implemented in pure Python
https://ppci.readthedocs.io/en/latest/
BSD 2-Clause "Simplified" License

ppci-cc: alignment problem for vararg functions arguments #97

Open tstreiff opened 4 years ago

tstreiff commented 4 years ago

When calling a vararg function (like printf), the IR generator allocates a memory block on the stack and fills it with the variable arguments. In doing so, it inserts padding where needed to satisfy the alignment constraints.

The callee receives the memory block and uses a pointer to read the expected type, then increments the pointer with the size of the expected type, so it does not handle padding.

This breaks whenever the caller inserts padding, since the callee ignores it.

Typical case that does not work (and crashes most of the time):

int i; char *pc; printf("%d %s", i, pc);

For x86_64, this creates a 16-byte block filled as follows: the 4-byte int at offset 0, 4 bytes of padding, then the 8-byte pointer at offset 8.

printf uses va_arg(int) then va_arg(char *) and does not skip any padding, so it reads the int at offset 0 and then reads the pointer from offset 4 instead of offset 8.
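To make the mismatch concrete, here is a small Python model of the two sides. It is only a sketch, not ppci code: the `TYPES` table (sizes and alignments assumed for x86_64) and the function names are made up for illustration.

```python
# Assumed x86_64 (size, alignment) pairs; hypothetical table, not taken from ppci.
TYPES = {"int": (4, 4), "char*": (8, 8)}


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def caller_offsets(arg_types):
    """Offsets where the caller stores each argument (natural alignment, padded)."""
    offsets, offset = [], 0
    for t in arg_types:
        size, alignment = TYPES[t]
        offset = align_up(offset, alignment)
        offsets.append(offset)
        offset += size
    return offsets, offset


def callee_offsets(arg_types):
    """Offsets the callee reads from: the pointer advances by size only."""
    offsets, offset = [], 0
    for t in arg_types:
        offsets.append(offset)
        offset += TYPES[t][0]
    return offsets


args = ["int", "char*"]
written, block_size = caller_offsets(args)
read = callee_offsets(args)
print("caller writes at", written, "block size", block_size)
print("callee reads at ", read)
```

For `printf("%d %s", i, pc)` the caller stores the pointer at offset 8 while the callee fetches it from offset 4, so the value it reads is half padding and half pointer.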

Two solutions:

1) Either the caller never uses any padding
2) Or arguments are all aligned on the strongest alignment constraint (8 bytes on x86_64)

Solution 2) is the only one that complies with the alignment constraints. x86 tolerates misaligned data, but other architectures are much more sensitive.
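A quick way to see why solution 1) violates the constraints is to pack the arguments back to back and check each offset against its type's alignment. This is an illustrative Python sketch: the `TYPES` table and helper names are hypothetical, and it assumes the block itself starts at an 8-byte aligned address.

```python
# Assumed x86_64 (size, alignment) pairs; hypothetical table for illustration.
TYPES = {"int": (4, 4), "char*": (8, 8)}


def packed_offsets(arg_types):
    """Solution 1: arguments stored back to back, no padding at all."""
    offsets, offset = [], 0
    for t in arg_types:
        offsets.append(offset)
        offset += TYPES[t][0]
    return offsets


def misaligned(arg_types, offsets):
    """Return (type, offset) pairs whose offset violates the type's alignment."""
    return [
        (t, off)
        for t, off in zip(arg_types, offsets)
        if off % TYPES[t][1] != 0
    ]


args = ["int", "char*"]
offsets = packed_offsets(args)
bad = misaligned(args, offsets)
print("packed offsets:", offsets)
print("misaligned arguments:", bad)
```

With the packed layout the pointer lands at offset 4, which x86 will tolerate but a stricter architecture may fault on.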

The strongest alignment constraint could be computed once (and stored in the context) by taking the strongest alignment among the int, long, and pointer types. This value would then be used in both the vararg callee and caller IR generation.
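A minimal sketch of that idea, in Python since ppci is written in it. Everything here is hypothetical: `TYPES` stands in for whatever the context knows about the target (values assumed for x86_64), and `vararg_alignment` and `slot_offsets` are not existing ppci functions. The key point is that caller and callee derive their offsets from the same rule.

```python
# Assumed x86_64 (size, alignment) pairs; stand-in for the target info in the context.
TYPES = {"int": (4, 4), "long": (8, 8), "pointer": (8, 8)}


def vararg_alignment(types):
    """Strongest alignment among int, long and pointer; computed once per target."""
    return max(types[name][1] for name in ("int", "long", "pointer"))


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def slot_offsets(arg_types, types, slot_align):
    """Offsets used by BOTH caller and callee: every argument starts a new slot."""
    offsets, offset = [], 0
    for t in arg_types:
        offsets.append(offset)
        # Advance past the value, then round up to the next slot boundary.
        offset = align_up(offset + types[t][0], slot_align)
    return offsets, offset


slot_align = vararg_alignment(TYPES)
offsets, block_size = slot_offsets(["int", "pointer"], TYPES, slot_align)
print("slot alignment:", slot_align)
print("offsets:", offsets, "block size:", block_size)
```

This sketch does not cover types whose alignment exceeds the slot alignment; those would need separate handling.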

windelbouwman commented 4 years ago

Thanks for this detailed analysis!

I'm not very satisfied with how vararg is implemented at the moment. It is by no means compatible with Linux x86_64 printf, for example.

I had a look at this document: https://web.archive.org/web/20160801075139/http://www.x86-64.org/documentation/abi.pdf

Seems like the va_list type is a specific thing per architecture.

One way to handle this properly would be to add additional IR-code instructions I assume, and have a sort of polyfill method which falls back to this old method.