Closed ayakael closed 1 month ago
Tagging subscribers to this area: @dotnet/area-system-io See info in area-owners.md if you want to be subscribed.
Author: | ayakael |
---|---|
Assignees: | - |
Labels: | `area-System.IO` |
Milestone: | - |
Of note, both builds use the backported version of https://github.com/dotnet/runtime/pull/76500. Indeed, without this, mono flavored runtime doesn't even build on musl.
Further investigation reveals a more telling error:
Reading symbols from ./dotnet...
(gdb) run
Starting program: /var/build/dotnet7/community/dotnet7-stage0/src/test/dotnet build
Program received signal SIGILL, Illegal instruction.
copy_stack_data_internal (stackdata_begin=0x7fffffffdb98, info=<optimized out>, wrapper_data1=<optimized out>, wrapper_data2=<optimized out>) at /var/build/dotnet7/community/dotnet7-stage0/src/dotnet-v7.0.101-source-build/src/runtime/src/mono/mono/utils/mono-threads-coop.c:193
193 memcpy (state->gc_stackdata, stackdata_end, stackdata_size);
A similar bug occured in another package I maintain, and was brough to the surface by https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/43463, which makes clang15 enable fortify-source by default. The actual bug is a target mismatch bug introduced in gcc12. It is tracked here: https://gitlab.alpinelinux.org/alpine/aports/-/issues/14105
Applying the following patch does the trick as a workaround:
From 98054ea87ce70247bb09ceafd2ad1a0b36d2fef4 Mon Sep 17 00:00:00 2001
Patch-Source: https://github.com/dotnet/runtime/issues/82269
From: Antoine Martin <dev@ayakael.net>
Date: Sat, 1 Oct 2022 09:21:58 -0400
Subject: [PATCH] Undefine fortify-source on mono-thread-coop
When _FORTIFY_SOURCE=2, there is a bug relating to memcpy that expresses itself.
See: https://gitlab.alpinelinux.org/alpine/aports/-/issues/14105. Alpine Linux
now sets this by default since https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/43463,
which makes mono-flavored runtime dump its core. This patch offers a workaround
by undefining _FORTIFY_SOURCE in the problematic file.
---
diff --git a/src/runtime/src/mono/mono/utils/mono-threads-coop.c b/src/runtime/src/mono/mono/utils/mono-threads-coop.c
index 4ed659d6605..34bb5785fba 100644
--- a/src/runtime/src/mono/mono/utils/mono-threads-coop.c
+++ b/src/runtime/src/mono/mono/utils/mono-threads-coop.c
@@ -15,6 +15,7 @@
#ifdef TARGET_MACH
#define _DARWIN_C_SOURCE
#endif
+#undef _FORTIFY_SOURCE
#include <mono/utils/mono-compiler.h>
#include <mono/utils/mono-threads.h>
copying what i wrote downstream for reference:
it's not; this is an actual fortify violation. clang is also not affected by that inlining issue, which manifests as a compilation failure, not a runtime sigill. that issue is about gcc failing to compile code at all in certain situations with fortify. this one is that the code is wrong, and crashes on the fortify assertion. breaking fortify hits e.g. a ud2 instruction (SIGILL), because the code did something incorrect. so, strictly speaking, in this case it is (probably) not a compiler bug and undefining fortify is not correct. the code itself does something wrong in copy_stack_data_internal with the memcpy. the most common cause is that it fails to uphold the requirement for memcpy: the start and end don't overlap.
from man 3 memcpy
:
DESCRIPTION
The memcpy() function copies n bytes from memory area src to memory
area dest. The memory areas must not overlap. Use memmove(3) if the
memory areas do overlap.
if you change that memcpy to memmove on line 193, it will probably pass. and if you write some sample code where the memory overlaps, you will reproduce the same crash (with either gcc or clang utilising the fortify headers). they're working as intended :)
Changing to memcpy
to memmove
still yields an error:
Reading symbols from ./dotnet...
(gdb) run
Starting program: /var/build/dotnet7/community/dotnet7-stage0/src/test/dotnet build
Program received signal SIGILL, Illegal instruction.
copy_stack_data_internal (stackdata_begin=0x7fffffffd848, info=<optimized out>, wrapper_data1=<optimized out>, wrapper_data2=<optimized out>) at /var/build/dotnet7/community/dotnet7-stage0/src/dotnet-v7.0.101-source-build/src/runtime/src/mono/mono/utils/mono-threads-coop.c:193
193 memmove (state->gc_stackdata, stackdata_end, stackdata_size);
the memmove
_FORTIFY_FN(memmove) void *memmove(void * _FORTIFY_POS0 __d,
const void * _FORTIFY_POS0 __s, size_t __n)
{
size_t __bd = __builtin_object_size(__d, 0);
size_t __bs = __builtin_object_size(__s, 0);
if (__n > __bd || __n > __bs)
__builtin_trap();
return __orig_memmove(__d, __s, __n);
}
__builtin_object_size would return (size_t)-1
if it fails, so then the check would pass. meaning, the assertion is real, and the __n is probably wrong
the memcpy fortify is that, with an extra overlap check.
@lambdageek @vargaz what do you think?
Moving this to 9.0.0, but will consider backporting a fix for the issue
Assigning to @lambdageek for tracking, we'll tackle once back from vacation
Mono's behaviour is intentional: we need to copy a chunk of the stack for the benefit of the GC.
So assuming we're calculating the bounds correctly (which may include the frames of several callers and possibly a red zone - I don't recall), the work here will be to convince FORTIFY_SOURCE that we're doing it on purpose and not to flag us. (I'm not sure if that's possible, so I'd welcome help from someone who has experience with it)
I mean it's also possible we don't calculate the size of the destination buffer correctly. I don't mean to imply we're smarter than the compiler. Just that we're doing something intentionally sus. But maybe we messed it up. We need to investigate
Since we will no longer be shipping desktop configurations for mono (https://github.com/dotnet/docs/issues/41366), it is not likely we will work on this issue.
Description
Building runtime on current edge of Alpine Linux using mono generates a broken runtime.
Reproduction Steps
On Alpine Linux Edge, build a mono-flavored runtime, and try to execute
./dotnet build
. Core dump will occur.Expected behavior
Good execution
Actual behavior
Coredump
Regression?
Bug occurs on dotnet6 as well.
Known Workarounds
Undefining _FORTIFY_SOURCE in
src/mono/mono/utils/mono-thread-coop.c
Configuration
7.0.103 and 6.0.114 Alpine Linux Edge s390x and ppc64le, presumably any runtime built with mono-flavor
Other information
Running
dotnet build
throughgdb
yields the following:Runtimes built previously still work, so something occurs during build for runtime to fail with
src/mono/mono/utils/mono-threads-coop.c
on line 123.