ARM-software / optimized-routines

Optimized implementations of various library functions for ARM architecture processors
Other
587 stars 94 forks source link

memcmp page overlap segfault bug #60

Closed jart closed 1 year ago

jart commented 1 year ago

Optimized Routines' memcmp() function appears to have a bug where it'll overlap a page boundary inappropriately when performing lookahead. The test I'm using which triggers this is:

TEST(memcmp, pageOverlapTorture) {
  long pagesz = sysconf(_SC_PAGESIZE);
  char *map = mmap(0, pagesz * 2, PROT_READ | PROT_WRITE,
                   MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
  char *map2 = mmap(0, pagesz * 2, PROT_READ | PROT_WRITE,
                    MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
  ASSERT_SYS(0, 0, mprotect(map + pagesz, pagesz, PROT_NONE));
  ASSERT_SYS(0, 0, mprotect(map2 + pagesz, pagesz, PROT_NONE));
  strcpy(map + pagesz - 9, "12345678");
  strcpy(map2 + pagesz - 9, "12345679");
  EXPECT_LT(memcmp(map + pagesz - 9, map2 + pagesz - 9, 79), 0);
  EXPECT_SYS(0, 0, munmap(map2, pagesz * 2));
  EXPECT_SYS(0, 0, munmap(map, pagesz * 2));
}

To give an example of how an implementation can work around this issue, here's what I came up with for x86:

int memcmp(const void *a, const void *b, size_t n) {
  int c;
  const unsigned char *p, *q;
  if ((p = a) == (q = b) || !n) return 0;
  if ((c = *p - *q)) return c;
#if defined(__x86_64__) && !defined(__chibicc__)
  unsigned u;
  while (n >= 16 && (((uintptr_t)p & 0xfff) <= 0x1000 - 16 &&
                     ((uintptr_t)q & 0xfff) <= 0x1000 - 16)) {
    if (!(u = __builtin_ia32_pmovmskb128(*(xmm_t *)p == *(xmm_t *)q) ^
              0xffff)) {
      n -= 16;
      p += 16;
      q += 16;
    } else {
      u = __builtin_ctzl(u);
      return p[u] - q[u];
    }
  }
#endif /* __x86_64__ */
  for (; n; ++p, ++q, --n) {
    if ((c = *p - *q)) {
      return c;
    }
  }
  return 0;
}
Wilco1 commented 1 year ago

memcmp doesn't have to worry about page boundaries, so the current behaviour is expected and correct. Why are you trying to write memcmp in C? That's never going to be as fast as assembler implementations like in GLIBC.

nsz-arm commented 1 year ago
EXPECT_LT(memcmp(map + pagesz - 9, map2 + pagesz - 9, 79), 0);

only 9 bytes of the objects can be accessed, not 79. so this call is invalid: memcmp may access all bytes within the specified bounds (although iso c wording is not very clear about such details).

i think writing this in c is fine, but technically *(xmm_t *)p is aliasing violation, it needs a may_alias attribute or memcpy to an xmm_t object first and hoping the compiler will be smart.

closing this now.