bellard / quickjs

Public repository of the QuickJS Javascript Engine.
https://bellard.org/quickjs
Other
8.51k stars 892 forks source link

libunicode.c:266:12: runtime error: load of misaligned address 0x55999c989af5 for type 'uint16_t', which requires 2 byte alignment #285

Closed darlinghoshi4 closed 6 months ago

darlinghoshi4 commented 6 months ago

version 3b45d15

Build CONFIG_UBSAN=y make qjs

Test case const v15 = String.fromCharCode(1853,862); v15.normalize();

Execution steps ./qjs poc.js

Output 图片

chqrlie commented 6 months ago

The offending code is:

static uint32_t get_le24(const uint8_t *ptr)
{
#if defined(__x86__) || defined(__x86_64__)
    return *(uint16_t *)ptr | (ptr[2] << 16);
#else
    return ptr[0] | (ptr[1] << 8) | (ptr[2] << 16);
#endif
}

This optimisation does technically have undefined behavior if the pointer is misaligned, yet x86 processors do allow unaligned reads so the code works as expected.

We could use a more cumbersome, yet UB free version:

uint32_t get_le24_ok(const uint8_t *ptr)
{
#if defined(__x86__) || defined(__x86_64__)
    uint16_t w16;
    memcpy(&w16, ptr, sizeof w16);
    return w16 | (ptr[2] << 16);
#else
    return ptr[0] | (ptr[1] << 8) | (ptr[2] << 16);
#endif
}

Or if we can assume at least 4 bytes can be read from the source (which is not guaranteed at the moment if reading the last 24 bits of a table):

uint32_t get_le24_fast(const uint8_t *ptr)
{
#if defined(__x86__) || defined(__x86_64__)
    uint32_t w32;
    memcpy(&w32, ptr, sizeof w32);
    return w32 & 0xffffff;
#else
    return ptr[0] | (ptr[1] << 8) | (ptr[2] << 16);
#endif
}

Testing on GodBolt: https://godbolt.org/z/EEMPE5sYc shows that clang generates the same code for the 3 equivalent versions, but not gcc, and both generate optimal code for the get_le24_fast alternative.

For the sake of readability, portability and safe testing, we should probably remove this hack and use the generic code unconditionally.

darlinghoshi4 commented 6 months ago

3Q :)