TheAssemblyArmada / Vanilla-Conquer

Vanilla Conquer provides clean, cross-platform builds of the C&C Remastered Collection and the standalone legacy games.
Other
330 stars 51 forks source link

Avoid unaligned access to Shape_Type->width #982

Open th-otto opened 3 months ago

giulianobelinassi commented 1 month ago

is this really necessary? DSi port has the same issue (unaligned access) but I did not need to patch this function in order to get things to work. What exactly does this fix?

xezon commented 1 month ago

Unaligned access is undefined behaviour and it will crash on some hardware.

th-otto commented 1 month ago

You might be right, Shape_Type is declared as packed(1), and the compiler should already access the Width member with byte accesses on architectures that need this. OTOH, it should not hurt, as the memcpy of 2 bytes is optimized anyway.

Simple test: both

uint16_t test(Shape_Type *t)
{
    return t->Width;
}

and

uint16_t test(Shape_Type *t)
{
    uint16_t w;

    memcpy(&w, &t->Width, sizeof(w));
    return w;
}

produce the same code on x86.

giulianobelinassi commented 1 month ago

You might be right, Shape_Type is declared as packed(1), and the compiler should already access the Width member with byte accesses on architectures that need this. OTOH, it should not hurt, as the memcpy of 2 bytes is optimized anyway.

Simple test: both

uint16_t test(Shape_Type *t)
{
  return t->Width;
}

and

uint16_t test(Shape_Type *t)
{
  uint16_t w;

  memcpy(&w, &t->Width, sizeof(w));
  return w;
}

produce the same code on x86.

I Will check this on ARM9 when I have the opportunity. Sorry but the compiler always remove those memcpys on x86_64, so this arch is not ideal for this kind of test

xezon commented 1 month ago

On ARM this may error as highlighted by this article:

https://www.alfonsobeato.net/arm/how-to-access-safely-unaligned-data/

giulianobelinassi commented 1 month ago

This indeed inserts a memcpy call: https://godbolt.org/z/77W3o5M3P

Although this won't error out since memcpy is guaranteed to read the bytes in the correct order if the pointers do not alias, it still keeps the memcpy call.

The solution presented by you using templates seems to be the best approach, tho. Perhaps we should refactor the code to use this approach instead.