Closed wkozaczuk closed 1 year ago
Wow, good catch. I can confirm that this test also fails on my Fedora 36, with gcc 12.2.1.
I confirmed that if memcpy_repmov_ssse3() is modified t not call small_memcpy(), the bug indeed doesn't happen - so that function is indeed to blame. I'll try to think how it may misscompile.
I was worried that maybe the fancy casting does something wrong (some "strict aliasing rule violation", as usual), but bizarrely, it seems to me now that the compiler actually recognizes it as a case of copying, calls __builtin_memcpy(), and it is that __builtin_memcpy() which is broken! The following trivial C++ code returns the wrong result:
#include <cstdio>
#include <cstring>
int main(){
char bufa[128] = "0123456789abcdefghijklmnopqrstuvwxyz";
char bufb[128] = "0123456789abcdefghijklmnopqrstuvwxyz";
memcpy(bufa, bufa+1, 27);
printf(" memcpy: %s\n", bufa);
__builtin_memcpy(bufb, bufb+1, 27);
printf("__builtin_memcpy: %s\n", bufb);
}
This should print the same value twice, but it's different in gcc 12.2.1!
I'll open a bug report for gcc, but I'll also try to think how we can avoid work around this bug, if we can at all. At worst case we can detect if the bug exists and if so have small_memcpy() just call sse_memcpy() (or have its callers don't call small_memcpy().
The resolution of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108296 as a duplicate led me to two very interesting insights:
Oops, I just realized - our caller DOES use memmove(). This memmove() just calls our memcpy() in cases when it thinks it is safe. So the mis-compilation is a problem :-(
The following hacky patch fixed the new memmove test and the Java test: (only the repmove_ssse3 version is actually necessary on a modern CPU)
diff --git a/arch/x64/string.cc b/arch/x64/string.cc
index b3782fe7..571049f9 100644
--- a/arch/x64/string.cc
+++ b/arch/x64/string.cc
@@ -220,9 +220,16 @@ extern "C"
[[gnu::optimize("omit-frame-pointer")]]
void *memcpy_repmov_old(void *__restrict dest, const void *__restrict src, size_t n)
{
+// Gcc 12 miscompiles our small_memcpy to use __builtin_memcpy() instead
+// of memmove, so fails for overlapping regions that this function is supposed
+// to correctly support. See https://github.com/cloudius-systems/osv/issues/1212
+#if __GNUC__ == 12
+ if (n < 1024) {
+#else
if (n < small_memcpy_lim) {
return small_memcpy(dest, src, n);
} else if (n < 1024) {
+#endif
return sse_memcpy(dest, src, n);
} else {
auto ret = dest;
@ -240,9 +247,16 @@ extern "C"
[[gnu::optimize("omit-frame-pointer")]]
void *memcpy_repmov(void *__restrict dest, const void *__restrict src, size_t n)
{
+// Gcc 12 miscompiles our small_memcpy to use __builtin_memcpy() instead
+// of memmove, so fails for overlapping regions that this function is supposed
+// to correctly support. See https://github.com/cloudius-systems/osv/issues/1212
+#if __GNUC__ == 12
+ if (n < 1024) {
+#else
if (n < small_memcpy_lim) {
return small_memcpy(dest, src, n);
} else if (n < 1024) {
+#endif
return sse_memcpy(dest, src, n);
} else {
auto ret = dest;
@@ -255,9 +269,16 @@ extern "C"
[[gnu::optimize("omit-frame-pointer")]]
void *memcpy_repmov_old_ssse3(void *__restrict dest, const void *__restrict src, size_t n)
{
+// Gcc 12 miscompiles our small_memcpy to use __builtin_memcpy() instead
+// of memmove, so fails for overlapping regions that this function is supposed
+// to correctly support. See https://github.com/cloudius-systems/osv/issues/1212
+#if __GNUC__ == 12
+ if (n < 1024) {
+#else
if (n < small_memcpy_lim) {
return small_memcpy(dest, src, n);
} else if (n < 1024) {
+#endif
return sse_memcpy(dest, src, n);
} else if (n < 65536 && !both_aligned(dest, src, 16)) {
ssse3_unaligned_copy(dest, src, n);
@@ -278,9 +299,16 @@ extern "C"
[[gnu::optimize("omit-frame-pointer")]]
void *memcpy_repmov_ssse3(void *__restrict dest, const void *__restrict src, size_t n)
{
+// Gcc 12 miscompiles our small_memcpy to use __builtin_memcpy() instead
+// of memmove, so fails for overlapping regions that this function is supposed
+// to correctly support. See https://github.com/cloudius-systems/osv/issues/1212
+#if __GNUC__ == 12
+ if (n < 1024) {
+#else
if (n < small_memcpy_lim) {
return small_memcpy(dest, src, n);
} else if (n < 1024) {
+#endif
return sse_memcpy(dest, src, n);
} else if (n < 65536 && !both_aligned(dest, src, 16)) {
ssse3_unaligned_copy(dest, src, n);
diff --git a/tests/tst-memmove.cc b/tests/tst-memmove.cc
index 36b75a41..4520a781 100644
--- a/tests/tst-memmove.cc
+++ b/tests/tst-memmove.cc
@@ -202,6 +202,10 @@ int main()
memmove_test(destOffset, srcOffset, n);
}
+ char buf10[128] = "/usr/lib/jvm/java/bin/java";
+ memmove(buf10, buf10 + 1, 27);
+ pass_if(buf10, "usr/lib/jvm/java/bin/java", 25);
+
std::cerr << "PASSED\n";
return 0;
}
Unfortunately, scripts/run.py -e 'tests/tst-commands.so' -c 1
still fails, I need to check why.
One more test would fail in my case:
TEST tst-feexcept.so OSv v0.57.0-7-g07a8b81e
eth0: 192.168.122.15
Booted up in 153.32 ms
Cmdline: /tests/tst-feexcept.so
0
FAIL: /git-repos/osv/tests/tst-feexcept.cc:82: For sig_check([] { printf("%d\n", 1 / zero_i()); }, SIGFPE), expected true, saw 0.
SUMMARY: 14 tests, 1 failures
Test tst-feexcept.so FAILED
Traceback (most recent call last):
File "/git-repos/osv/./scripts/test.py", line 191, in <module>
main()
File "/git-repos/osv/./scripts/test.py", line 147, in main
run_tests()
File "/git-repos/osv/./scripts/test.py", line 136, in run_tests
run(tests_to_run)
File "/git-repos/osv/./scripts/test.py", line 109, in run
run_test(test)
File "/git-repos/osv/./scripts/test.py", line 84, in run_test
test.run()
File "/git-repos/osv/scripts/tests/testing.py", line 38, in run
run_command_in_guest(self.command, hypervisor=self.hypervisor, run_py_args=self.run_py_args).join()
File "/git-repos/osv/scripts/tests/testing.py", line 186, in join
raise Exception('Guest failed')
Exception: Guest failed
But the other ones seem to pass:
./scripts/test.py -d tst-commands.so -d tst-feexcept.so
...
OK (140 tests run, 186.646 s)
The test_loader_parse_cmdline()
is calling memset()
. Wild guess: maybe we have a similar issue like with memmove()
?
As I have been investigating the failing java tests on Fedora 37, I have discovered weird failures in
realpath()
when called from the dynamic linker:After more debugging I realized the call to
memmove
in ourrealpath
implementation yields weird results and more specifically this test added totst-memmove.cc
fails on Fedora 37 with gcc 12:The same test compiled and executed on Fedora 35 (gcc 11) works just fine.
My suspicion is that gcc 12 compiles relevant code in
arch/x64/string.cc
probably incorrectly or we have a bug that comes to light right now. More specifically this function called indirectly bymemmove
probably yields incorrect results: