ayakael closed this issue 2 years ago
Of note, a cross-built linux-musl-s390x SDK is available here: https://repo.gpg.nz/apk/archives/dotnet-sdk-7.0.100-rc.1.22431.12-linux-musl-s390x.tar.gz
The following piece of code manages the error: https://github.com/dotnet/roslyn/blob/626029155cd455cfef624148e99d246225613800/src/Compilers/Core/Portable/InternalUtilities/StackGuard.cs#L1-L33
Output of ~/dotnet7/testing/dotnet7-stage0/src/bootstrap/dotnet build Src/Newtonsoft.Json/Newtonsoft.Json.csproj /v:diag
newtonsoft.log
According to Newtonsoft, this is indeed most likely a roslyn issue: https://github.com/JamesNK/Newtonsoft.Json/issues/2744#issuecomment-1264669779
What is the nature of this environment? I only see 20 levels of recursion in this tree. Does this system only have very small stack sizes or something?
Looking into it further, it looks like musl libc has a very small (128k) default thread stack size, in contrast to glibc's usual 8MB. I suppose, then, it's a matter of setting a larger thread stack size via the linker flag -Wl,-z,stack-size=1024768. Is there anywhere specific where the stack size could be set?
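For reference, the stack-size difference described above can be checked from plain C, and a thread can be given a larger stack explicitly instead of relying on the libc default. The following is only a minimal sketch using standard POSIX thread attributes. It is not taken from the runtime or from roslyn, and the helper names (default_thread_stack_size, run_with_stack, use_one_mib_of_stack) are hypothetical, made up for illustration.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the stack size a freshly initialised pthread attribute object
 * carries, i.e. what pthread_create() uses by default. Returns 0 on error. */
size_t default_thread_stack_size(void)
{
    pthread_attr_t attr;
    size_t size = 0;

    if (pthread_attr_init(&attr) != 0)
        return 0;
    if (pthread_attr_getstacksize(&attr, &size) != 0)
        size = 0;
    pthread_attr_destroy(&attr);
    return size;
}

/* Hypothetical thread body: touches about 1 MiB of stack, which is more
 * than the 128k default mentioned above. Returns the first byte (1). */
void *use_one_mib_of_stack(void *arg)
{
    volatile unsigned char buf[1024 * 1024];
    (void)arg;
    for (size_t i = 0; i < sizeof buf; i += 4096)
        buf[i] = 1;
    return (void *)(uintptr_t)buf[0];
}

/* Runs fn on a new thread whose stack size is set explicitly, waits for it,
 * and stores its return value in *result. Returns 0 on success, otherwise
 * the pthread error code. */
int run_with_stack(size_t stack_size, void *(*fn)(void *), void *arg,
                   void **result)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc = pthread_attr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_attr_setstacksize(&attr, stack_size);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;
    return pthread_join(tid, result);
}
```

Calling default_thread_stack_size() on the affected system should show whether the small default is really in effect, and run_with_stack((size_t)8 << 20, use_one_mib_of_stack, NULL, &result) shows a thread surviving a 1 MiB frame once it is given an 8 MiB stack. This only covers threads created by code you control; it does not change what the main thread or the runtime's own threads get.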
Thinking this to be a stack issue, I set the default stack size on the bootstrap CLI binaries via find "$_cli_root" -type f -exec "$srcdir"/muslstack -s 0x800000 '{}' \; 2>&1 | grep stackSize
which outputs:
./sdk/6.0.109/AppHostTemplate/apphost: stackSize: 0x800000
./shared/Microsoft.NETCore.App/6.0.9/libhostpolicy.so: stackSize: 0x800000
./shared/Microsoft.NETCore.App/6.0.9/libcoreclr.so: stackSize: 0x800000
./shared/Microsoft.NETCore.App/6.0.9/libSystem.IO.Compression.Native.so: stackSize: 0x800000
./shared/Microsoft.NETCore.App/6.0.9/libSystem.Native.so: stackSize: 0x800000
./shared/Microsoft.NETCore.App/6.0.9/libSystem.Net.Security.Native.so: stackSize: 0x800000
./shared/Microsoft.NETCore.App/6.0.9/libSystem.Security.Cryptography.Native.OpenSsl.so: stackSize: 0x800000
./shared/Microsoft.NETCore.App/6.0.9/libSystem.Globalization.Native.so: stackSize: 0x800000
./packs/Microsoft.NETCore.App.Host.linux-musl-s390x/6.0.9/runtimes/linux-musl-s390x/native/libnethost.so: stackSize: 0x800000
./packs/Microsoft.NETCore.App.Host.linux-musl-s390x/6.0.9/runtimes/linux-musl-s390x/native/apphost: stackSize: 0x800000
./packs/Microsoft.NETCore.App.Runtime.linux-musl-s390x/6.0.9/runtimes/linux-musl-s390x/native/libhostpolicy.so: stackSize: 0x800000
./packs/Microsoft.NETCore.App.Runtime.linux-musl-s390x/6.0.9/runtimes/linux-musl-s390x/native/libcoreclr.so: stackSize: 0x800000
./packs/Microsoft.NETCore.App.Runtime.linux-musl-s390x/6.0.9/runtimes/linux-musl-s390x/native/libSystem.Globalization.Native.so.dbg: stackSize: 0x800000
./packs/Microsoft.NETCore.App.Runtime.linux-musl-s390x/6.0.9/runtimes/linux-musl-s390x/native/libSystem.IO.Compression.Native.so: stackSize: 0x800000
./packs/Microsoft.NETCore.App.Runtime.linux-musl-s390x/6.0.9/runtimes/linux-musl-s390x/native/libSystem.Net.Security.Native.so.dbg: stackSize: 0x800000
./packs/Microsoft.NETCore.App.Runtime.linux-musl-s390x/6.0.9/runtimes/linux-musl-s390x/native/libSystem.IO.Compression.Native.so.dbg: stackSize: 0x800000
./packs/Microsoft.NETCore.App.Runtime.linux-musl-s390x/6.0.9/runtimes/linux-musl-s390x/native/libSystem.Security.Cryptography.Native.OpenSsl.so.dbg: stackSize: 0x800000
./packs/Microsoft.NETCore.App.Runtime.linux-musl-s390x/6.0.9/runtimes/linux-musl-s390x/native/libSystem.Native.so: stackSize: 0x800000
./packs/Microsoft.NETCore.App.Runtime.linux-musl-s390x/6.0.9/runtimes/linux-musl-s390x/native/libSystem.Native.so.dbg: stackSize: 0x800000
./packs/Microsoft.NETCore.App.Runtime.linux-musl-s390x/6.0.9/runtimes/linux-musl-s390x/native/libSystem.Net.Security.Native.so: stackSize: 0x800000
./packs/Microsoft.NETCore.App.Runtime.linux-musl-s390x/6.0.9/runtimes/linux-musl-s390x/native/libSystem.Security.Cryptography.Native.OpenSsl.so: stackSize: 0x800000
./packs/Microsoft.NETCore.App.Runtime.linux-musl-s390x/6.0.9/runtimes/linux-musl-s390x/native/libSystem.Globalization.Native.so: stackSize: 0x800000
./packs/Microsoft.NETCore.App.Runtime.linux-musl-s390x/6.0.9/runtimes/linux-musl-s390x/native/libhostfxr.so: stackSize: 0x800000
./packs/Microsoft.NETCore.App.Runtime.linux-musl-s390x/6.0.9/runtimes/linux-musl-s390x/native/libcoreclr.so.dbg: stackSize: 0x800000
./host/fxr/6.0.9/libhostfxr.so: stackSize: 0x800000
./dotnet: stackSize: 0x800000
Unfortunately, the build of newtonsoft-json still fails when building dotnet with source-build.
muslstack is a utility that allows one to set the stack size of a prebuilt binary.
Seems more and more like a runtime issue. The following code exists for coreclr but not for mono:
#ifdef ENSURE_PRIMARY_STACK_SIZE
/*++
Function:
  EnsureStackSize
Abstract:
  This fixes a problem on MUSL where the initial stack size reported by the
  pthread_attr_getstack is about 128kB, but this limit is not fixed and
  the stack can grow dynamically. The problem is that it makes the
  functions ReflectionInvocation::[Try]EnsureSufficientExecutionStack
  to fail for real life scenarios like e.g. compilation of corefx.
  Since there is no real fixed limit for the stack, the code below
  ensures moving the stack limit to a value that makes reasonable
  real life scenarios work.
--*/
__attribute__((noinline,NOOPT_ATTRIBUTE))
void
EnsureStackSize(SIZE_T stackSize)
{
    volatile uint8_t *s = (uint8_t *)_alloca(stackSize);
    *s = 0;
}
#endif // ENSURE_PRIMARY_STACK_SIZE
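To make the trick in that snippet easier to try outside the runtime, here is a standalone sketch of the same idea in portable C: reserve a large block on the current stack with alloca and write to its lowest byte, so the kernel has to map the stack down to that address. This is an illustration, not the coreclr code. The name ensure_stack_size and the returned address are assumptions added here so the effect can be inspected, and NOOPT_ATTRIBUTE is replaced by a plain noinline plus a volatile write.

```c
#include <alloca.h>
#include <stddef.h>
#include <stdint.h>

/* Grows the calling thread's stack by stack_size bytes and touches the
 * lowest byte, forcing the kernel to back the stack down to that address.
 * Returns the lowest address touched (for inspection only; the memory is
 * released again as soon as this function returns).
 * Assumes a downward-growing stack and that stack_size is within the
 * process stack limit; otherwise the write faults. */
__attribute__((noinline))
uintptr_t ensure_stack_size(size_t stack_size)
{
    volatile uint8_t *s = (volatile uint8_t *)alloca(stack_size);
    *s = 0; /* volatile write so the compiler cannot drop the access */
    return (uintptr_t)s;
}
```

Calling ensure_stack_size(1024 * 1024) early on the main thread and comparing the returned address with the address of a local variable in the caller should show a gap of at least 1 MiB. Whether this changes what pthread_attr_getstack later reports depends on the libc, which is the musl-specific behaviour the comment above describes.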
Looks a lot like the issue here...
Unless something comes up suggesting that this is more of a roslyn issue, closing in favor of https://github.com/dotnet/runtime/issues/76523
Looking for guidance on how to debug this.
Version Used: 4.4.0-2.22426.8
Steps to Reproduce:
Expected Behavior: Build should work
Actual Behavior:
Also reproducible when building roslyn 4.3.0-3.22415.1 with dotnet sdk 6.0.401 (roslyn 4.3.0-3.22415.1)
Full log here