dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

CoreCLR doesn't boot on a "modern" Linux/ARM image #10728

Closed Tragetaschen closed 5 months ago

Tragetaschen commented 6 years ago

I'm able to compile .NET Core 2.1 for my Yocto-based Linux distribution for ARM. If I do this for the "pyro" release from spring 2017, I can run a simple Hello World both with the dotnet executable I've compiled plus just the dll, and as a self-contained publish for linux-arm coming from Windows.

With Yocto's "sumo" release from spring 2018, both methods of running the application hang indefinitely and don't print anything. CTRL+C has no effect, and I have to kill -KILL the process from another shell.

As can be expected, there are a couple of updates to the libraries included in the distribution. The following table lists the direct dependencies I've declared for my build:

Library      pyro      sumo
clang        4.0.1     6.0.1
cmake        3.7.2     3.10.3
glibc        2.25      2.27
libunwind    1.1       1.2.1
icu          58.2      60.2
openssl      1.0.2n    1.0.2o
util-linux   2.29.1    2.31
lttng-ust    2.9.0     2.10.1
krb5         1.15.1    1.16
curl         7.53.1    7.58.0

I'm currently creating a Debug build and trying to work with (against?) lldb to get some insight. Running under gdb also hangs and cannot be interrupted for a stack trace or the like.

Any ideas?

jkotas commented 6 years ago

You can compare strace for success/failure. It will tell you the approximate place where things go wrong.
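A minimal sketch of such a comparison, assuming both runs were captured with `strace -f -o FILE` so that each line starts with a PID. The helper names `syscall_names` and `compare_straces`, the log file names, and the `./hello` binary are hypothetical. The idea is to strip PIDs and arguments so that only the sequence of system call names is diffed.

```shell
#!/bin/sh
# Reduce a strace log to its sequence of syscall names.
# Assumes `strace -f -o FILE` output, e.g. "1234 openat(AT_FDCWD, ...) = 3".
syscall_names() {
    # Drop an optional leading PID, then keep the identifier before "(".
    # Signal, exit and "<... resumed>" lines do not match and are skipped.
    sed -n -E 's/^([0-9]+ +)?([a-z_0-9]+)\(.*/\2/p' "$1"
}

# Diff the reduced sequences of two logs; returns diff's exit status.
compare_straces() {
    good=$(mktemp) && bad=$(mktemp) || return 1
    syscall_names "$1" > "$good"
    syscall_names "$2" > "$bad"
    diff "$good" "$bad"
    status=$?
    rm -f "$good" "$bad"
    return "$status"
}

# Hypothetical usage (file and binary names are assumptions):
#   strace -f -o pyro.strace ./hello   # on the working image
#   strace -f -o sumo.strace ./hello   # on the hanging image
#   compare_straces pyro.strace sumo.strace
```

The first point where the two sequences diverge is usually a good place to start reading the full logs.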

Tragetaschen commented 6 years ago

I've put the two strace logs into a gist:

https://gist.github.com/Tragetaschen/dfcf4e5243f40bbab2bc1f74d9324dba

Tragetaschen commented 6 years ago

The sumo strace log is from a Debug build, the pyro log from a Release build.

janvorli commented 6 years ago

@Tragetaschen I've looked at your logs, and the log from "sumo" strangely ends in the middle of scanning /usr/share/dotnet/shared/Microsoft.NETCore.App/2.1.2 for available .dll files. Most likely, it is in the middle of this loop: https://github.com/dotnet/core-setup/blob/b17186ff80b6f58c15ee7dd474a719697c8dea10/src/corehost/common/pal.unix.cpp#L546-L618 This is inside the "dotnet" app, well before the CoreCLR runtime is even loaded. I would recommend building a debug version of the dotnet tool and stepping through the code mentioned above to see where it hangs.

Tragetaschen commented 5 years ago

I've retried with (ASP).NET Core 2.2 and it still hangs.

However, I'm now pretty sure that the above gist is nonsense, as it doesn't contain everything tracked by strace. The following one shows everything, including the final futex system call where things hang:

trace.txt
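To locate where each thread is blocked in a log like this, one option is to list the calls that strace marks `<unfinished ...>` and that are never followed by a matching `resumed` line. This is a minimal sketch, assuming the log was written with `strace -f -o FILE` (PID in the first column); the function name `blocked_calls` is hypothetical.

```shell
#!/bin/sh
# Print, per PID, the last call that was started but never resumed.
# Assumes `strace -f -o FILE` output with the PID as the first field.
blocked_calls() {
    awk '
        # "PID call(args <unfinished ...>": remember the pending call per PID.
        /<unfinished \.\.\.>/ { pending[$1] = $0; next }
        # "PID <... call resumed>...": the pending call for that PID completed.
        $2 == "<..." && $4 ~ /^resumed>/ { delete pending[$1]; next }
        # Whatever is still pending at the end of the log never returned.
        END { for (pid in pending) print pending[pid] }
    ' "$1" | sort -n
}
```

A futex wait that shows up here only says where a thread is parked, not who holds the lock, so it is a starting point for a debugger session rather than a diagnosis.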

Tragetaschen commented 5 years ago

I've narrowed it down to liblttng-ust being accessible on the target. When I move away the actual .so.0.0.0, CoreCLR boots successfully and doesn't hang. According to the successful strace, liblttng-ust-tracepoint still loads and works fine.

The Yocto build system sanity-checks the resulting binaries and notified me that I hadn't put lttng-ust into the dependencies. So I naturally included it.
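The workaround of moving the library away can be sketched as a pair of reversible helpers. The function names, the `.disabled` suffix, and the example path `/usr/lib/liblttng-ust.so.0.0.0` are assumptions for illustration; the actual location depends on the image.

```shell
#!/bin/sh
# Rename a shared object so it can no longer be loaded under its usual name.
disable_lib() {
    [ -e "$1" ] || { echo "not found: $1" >&2; return 1; }
    mv -- "$1" "$1.disabled"
}

# Undo disable_lib by restoring the original file name.
enable_lib() {
    [ -e "$1.disabled" ] || { echo "not disabled: $1" >&2; return 1; }
    mv -- "$1.disabled" "$1"
}

# Hypothetical usage on the target (the path is an assumption):
#   disable_lib /usr/lib/liblttng-ust.so.0.0.0
#   ... run the application ...
#   enable_lib /usr/lib/liblttng-ust.so.0.0.0
```

Renaming rather than deleting keeps the change easy to revert once the test is done.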

Tragetaschen commented 5 years ago

Alternatively, I can use an empty libcoreclrtraceptprovider.so. Deleting it is recognized when checking the manifest.

hgroover1 commented 5 years ago

I can confirm that I've experienced this (and the temporary workaround of renaming liblttng-ust.so.0.0.0 worked) using a Yocto rocko build of a kernel.org 4.14.77 with (many) additions from Digi-Linux 4.9 for the IMX6 camera support.

Tragetaschen commented 5 years ago

Today I found some time to test this again systematically with Yocto "warrior" and .NET Core 3. My image has a working lttng-ust version 2.10.3 and I left libcoreclrtraceptprovider.so untouched.

My self-contained application starts successfully and doesn't lock up. Can someone else confirm this as well?

I tried to collect some perf traces, but immediately ran into dotnet/coreclr#26883.

franksinankaya commented 5 years ago

This isn't related to the lttng issue, but I wanted to highlight that you can build coreclr with a Yocto SDK instead of as part of the Yocto build.

Here is what I did for ARM32:

. /opt/poky/2.6.2/environment-setup-armv5e-poky-linux-gnueabi
export PATH=/usr/bin:$PATH
export ROOTFS_DIR=$PKG_CONFIG_SYSROOT_DIR
export TOOLCHAIN=arm-poky-linux-gnueabi

export CLR_CC=$CC
export CLR_CXX=$CXX
export CLR_AR=$AR
export CLR_NM=$NM
export CLR_LINK=$CC
export CLR_OBJDUMP=$OBJDUMP
export CLR_OBJCOPY=$OBJCOPY
export CLR_RANLIB=$RANLIB

./build.sh -gcc -cross -arm -ignorewarnings -skipcrossarchnative -skipmanaged

and for ARM64:

. /opt/poky/2.6.2/environment-setup-aarch64-poky-linux-gnueabi
export PATH=/usr/bin:$PATH
export ROOTFS_DIR=$PKG_CONFIG_SYSROOT_DIR
export TOOLCHAIN=aarch64-poky-linux-gnueabi

export CLR_CC=$CC
export CLR_CXX=$CXX
export CLR_AR=$AR
export CLR_NM=$NM
export CLR_LINK=$CC
export CLR_OBJDUMP=$OBJDUMP
export CLR_OBJCOPY=$OBJCOPY
export CLR_RANLIB=$RANLIB

./build.sh -gcc -cross -arm64 -ignorewarnings -skipcrossarchnative -skipmanaged

Two PRs are pending:
https://github.com/dotnet/coreclr/pull/27796
https://github.com/dotnet/coreclr/pull/27795
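Since the build relies on variables exported by the SDK environment script, a small guard can catch a missing one before build.sh runs. This is a minimal sketch; the function name `require_vars` is hypothetical, and the variable list in the usage comment simply mirrors the exports above.

```shell
#!/bin/sh
# Return non-zero and name every listed variable that is unset or empty.
# Only pass trusted variable names: they are expanded through eval.
require_vars() {
    missing=0
    for name in "$@"; do
        eval "value=\${$name-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Hypothetical usage after sourcing the SDK environment script and
# running the exports:
#   require_vars ROOTFS_DIR TOOLCHAIN CLR_CC CLR_CXX CLR_AR CLR_NM \
#       CLR_LINK CLR_OBJDUMP CLR_OBJCOPY CLR_RANLIB || exit 1
```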

dotnet-policy-service[bot] commented 6 months ago

Due to lack of recent activity, this issue has been marked as a candidate for backlog cleanup. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will undo this process.

This process is part of our issue cleanup automation.

dotnet-policy-service[bot] commented 5 months ago

This issue will now be closed since it had been marked no-recent-activity but received no further activity in the past 14 days. It is still possible to reopen or comment on the issue, but please note that the issue will be locked if it remains inactive for another 30 days.