Closed darlinghoshi4 closed 6 months ago
The offending code is:
static uint32_t get_le24(const uint8_t *ptr)
{
#if defined(__x86__) || defined(__x86_64__)
return *(uint16_t *)ptr | (ptr[2] << 16);
#else
return ptr[0] | (ptr[1] << 8) | (ptr[2] << 16);
#endif
}
This optimisation does technically have undefined behavior if the pointer is misaligned, yet x86 processors do allow unaligned reads so the code works as expected.
We could use a more cumbersome, yet UB free version:
uint32_t get_le24_ok(const uint8_t *ptr)
{
#if defined(__x86__) || defined(__x86_64__)
uint16_t w16;
memcpy(&w16, ptr, sizeof w16);
return w16 | (ptr[2] << 16);
#else
return ptr[0] | (ptr[1] << 8) | (ptr[2] << 16);
#endif
}
Or if we can assume at least 4 bytes can be read from the source (which is not guaranteed at the moment if reading the last 24 bits of a table):
uint32_t get_le24_fast(const uint8_t *ptr)
{
#if defined(__x86__) || defined(__x86_64__)
uint32_t w32;
memcpy(&w32, ptr, sizeof w32);
return w32 & 0xffffff;
#else
return ptr[0] | (ptr[1] << 8) | (ptr[2] << 16);
#endif
}
Testing on GodBolt: https://godbolt.org/z/EEMPE5sYc shows that clang generates the same code for the 3 equivalent versions, but not gcc, and both generate optimal code for the get_le24_fast
alternative.
For the sake of readability, portability and safe testing, we should probably remove this hack and use the generic code unconditionally.
3Q :)
version
3b45d15
Build
CONFIG_UBSAN=y make qjs
Test case
const v15 = String.fromCharCode(1853,862); v15.normalize();
Execution steps
./qjs poc.js
Output