dotnet / runtime

Cross built ILCompiler NuGet contains HostOS (Linux) ELFs not TargetOS (FreeBSD) ELFs #104497

Open Thefrank opened 1 week ago

Thefrank commented 1 week ago

Overview:

When cross-compiling the runtime, the ILCompiler that is produced (and its NuGet package) contains HostOS (Linux) ELFs instead of TargetOS (FreeBSD) ELFs.

Reproduction:

Use a Docker container on Linux, taken from the image list at https://raw.githubusercontent.com/dotnet/versions/master/build-info/docker/image-info.dotnet-dotnet-buildtools-prereqs-docker-main.json, that contains a FreeBSD ROOTFS, together with the most recent net9 preview tag of runtime: v9.0.0-preview.5.24306.7

docker run -e ROOTFS_DIR=/crossrootfs/$ROOTFSARCH -v ${BUILD_SOURCESDIRECTORY}/runtime:/runtime $DOTNETDOCKERCONTAINERUSED /runtime/build.sh -c ${{ parameters.buildType }} -cross -os freebsd -arch $ROOTFSARCH -ci /p:OfficialBuildId=$OFFICIALBUILDID --subset clr+mono+mono.manifests+tools+libs+host+packs

Expected behavior:

ELFs should be like other items generated for TargetOS:

$file crossgen2 
crossgen2: ELF 64-bit LSB pie executable, x86-64, version 1 (FreeBSD), dynamically linked, interpreter /libexec/ld-elf.so.1, for FreeBSD 13.2, FreeBSD-style, BuildID[sha1]=54a7f1c2a4752435c2cffd15eeb959f609966907, stripped

Actual behavior:

The resulting runtime.freebsd-x64.Microsoft.DotNet.ILCompiler.9.0.0-preview.5.24306.7.nupkg contains Linux ELFs and FreeBSD libs.

$find ./ * | xargs file
./:                                           directory
./ILCompiler.RyuJit.pdb:                      Microsoft Roslyn C# debugging symbols version 1.0
./libclrjit_unix_x64_x64.so:                  ELF 64-bit LSB shared object, x86-64, version 1 (FreeBSD), dynamically linked, for FreeBSD 13.2, BuildID[sha1]=66177aebc4ab51f16fe1e6a5faa90a7ade09b674, stripped
./libclrjit_win_x86_x64.so:                   ELF 64-bit LSB shared object, x86-64, version 1 (FreeBSD), dynamically linked, for FreeBSD 13.2, BuildID[sha1]=79ecdf1053497bde0393928dee1a727bc6b6b6a1, stripped
./libclrjit_universal_arm_x64.so:             ELF 64-bit LSB shared object, x86-64, version 1 (FreeBSD), dynamically linked, for FreeBSD 13.2, BuildID[sha1]=b15d888e793cca18b6dd42b3b672f7144fbe45ec, stripped
./libjitinterface_x64.so:                     ELF 64-bit LSB shared object, x86-64, version 1 (FreeBSD), dynamically linked, for FreeBSD 13.2, BuildID[sha1]=02a2d4a17bcbd0ff35f3caca9252853f95529a3c, stripped
./ILCompiler.TypeSystem.pdb:                  Microsoft Roslyn C# debugging symbols version 1.0
./ILCompiler.DependencyAnalysisFramework.pdb: Microsoft Roslyn C# debugging symbols version 1.0
./ILCompiler.Compiler.pdb:                    Microsoft Roslyn C# debugging symbols version 1.0
./libclrjit_universal_arm64_x64.so:           ELF 64-bit LSB shared object, x86-64, version 1 (FreeBSD), dynamically linked, for FreeBSD 13.2, BuildID[sha1]=0b476dc684291af72ab673b77dccbbf7f386cbf8, stripped
./libclrjit_win_x64_x64.so:                   ELF 64-bit LSB shared object, x86-64, version 1 (FreeBSD), dynamically linked, for FreeBSD 13.2, BuildID[sha1]=a58827fd1b7ac68612408fa5e13c7db9091938a2, stripped
./ilc.pdb:                                    Microsoft Roslyn C# debugging symbols version 1.0
./ilc:                                        ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=41cb9e347020ad19b6402190528b968b6850c46f, stripped
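The manual inspection above can be scripted. This is a minimal sketch, not part of the runtime build: it assumes the package has been unpacked into the current directory and that `file` mentions "FreeBSD" in the description of every FreeBSD ELF, as it does in the listing above. The function name `list_non_freebsd_elfs` is made up for illustration.

```shell
# Hypothetical helper: reads `file` output on stdin and prints the path of
# every ELF whose description does not mention FreeBSD.
list_non_freebsd_elfs() {
  grep 'ELF ' | grep -v 'FreeBSD' | cut -d: -f1
}

# Usage (assumes the .nupkg was unpacked into the current directory):
#   find . -type f -exec file {} + | list_non_freebsd_elfs
```

Fed the listing above, this would print only ./ilc, since it is the one ELF whose description mentions GNU/Linux rather than FreeBSD.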

Regression:

To the best of my knowledge, it has always been this way. This was uncaught until recently when I tried to use a cross built package to bootstrap a native build.

Known Workarounds:

None?

Other info:

TargetOS=linux appears in only three places in the .binlog for the build that are after Evaluation. All three are from ILCompiler.csproj: the first seems to come as a return from ResolveReadyToRunCompilers, and the other two (_PrepareForReadyToRunCompilation and _CreateR2RImages) use it. NativeAotSupported is reassigned here: https://github.com/dotnet/runtime/blob/a5cc707d976a14495462c9c492a921ff0927b8f5/src/coreclr/tools/aot/ILCompiler/ILCompiler.csproj#L17 but Crossgen2 is still used on the ilc binary and the process does not error from https://github.com/dotnet/runtime/blob/a5cc707d976a14495462c9c492a921ff0927b8f5/src/tasks/Crossgen2Tasks/ResolveReadyToRunCompilers.cs#L116-L120

There is no "Property reassignment" note in the binlog when ResolveReadyToRunCompilers changes(?) the TargetOS=freebsd to linux

.binlog is not an allowed attachment type so hopefully a screenshot from the MSBuild Structured Log Viewer is enough to help explain what I am seeing.

[Screenshot: MSBuild Structured Log Viewer]

The Crossgen2 project seems to avoid the ReadyToRun part during packaging: https://github.com/dotnet/runtime/blob/a5cc707d976a14495462c9c492a921ff0927b8f5/src/installer/pkg/sfx/Microsoft.NETCore.App/Microsoft.NETCore.App.Crossgen2.sfxproj#L52-L53

dotnet-policy-service[bot] commented 1 week ago

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas See info in area-owners.md if you want to be subscribed.

jkotas commented 1 week ago

This was refactored in https://github.com/dotnet/runtime/pull/103508 . Does the problem still exist in current main?

Thefrank commented 1 week ago

@jkotas

Just checked. It still exists as of e125e93c3f42a1b90bcb831756b78fab561ab775

jkotas commented 1 week ago

Ok, here is the problem:

The regular build uses the "last known good" toolchain (the one that comes from https://github.com/dotnet/runtime/blob/main/global.json#L3) for AOT compilation of most managed tools and components. The one exception is crossgen2, which uses the live-built AOT toolchain; that leads to interesting problems every once in a while.

When cross-compiling from Linux to FreeBSD, there is no "last known good" AOT toolchain that can cross-compile from Linux to FreeBSD. Due to some incorrect build logic, we end up doing a Linux->Linux publish instead and produce binaries for the wrong OS, as you have observed.

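I can think of two options for addressing this: disabling AOT compilation of the affected managed components, or a multi-stage build that uses the live-built AOT compiler.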

Thefrank commented 1 week ago

Would disabling AOT for managed items still give FreeBSD (TargetOS) ELFs on components like ILCompiler? If yes, that might be the simplest solution for the time being.

The multi-stage build sounds like the best solution for this issue. But, unless there is something I am missing, this issue seems limited to a niche case of cross compiling from Linux to FreeBSD/Illumos/Haiku/Other. If it helps other build scenarios for officially supported platforms, then there is a better case for it, but I am not familiar enough with the intricacies here.

jkotas commented 1 week ago

> unless there is something I am missing, this issue seems limited to a niche case of cross compiling from Linux to FreeBSD/Illumos/Haiku/Other

I think having the recipe for how to use live-built AOT compilers would be generally useful. We keep having long discussions about whether it is better to use the live-built AOT compiler or the last-known-good AOT compiler. So having both options and using the one that's more appropriate for a given situation would be best.

> Would disabling AOT for managed items still give FreeBSD (TargetOS) ELFs on components like ILCompiler?

It would require some extra build logic. Nothing impossible.

jkotas commented 1 week ago

cc @jkoritzinsky @lambdageek

am11 commented 6 days ago

It would be nice to take https://github.com/dotnet/runtime/pull/103375 (which is currently blocked on https://github.com/dotnet/runtime/issues/104077) then tackle it in a single unified target. It is possible to decouple it from #104077, but not ideal; since cdac and future projects under src/native/managed will continue to require separate maintenance.

Thefrank commented 3 days ago

As there is currently no way to build an ILCompiler on Linux for FreeBSD, is there any way to build the ILCompiler natively without needing a prior version of it? I tried poking around the various --subset items but none of them wanted to build ILCompiler without a prior version. Am I missing something?

jkotas commented 3 days ago

Yes, that's correct. The ILCompiler is AOT-compiled using a prior version (the "last known good" version in my comment above). The potential ways to break this cycle are to either disable AOT compilation of the ILCompiler or do the multi-stage build thing.

am11 commented 3 days ago

Just noticed that targetInfo wasn't correctly set for FreeBSD, opened https://github.com/dotnet/sdk/pull/42144. (i.e. https://github.com/dotnet/runtime/commit/70e1072edc6c1a399f77a4de7de84045193f1409 was overwritten as "Linux" by the SDK)

Whether or not it fixes this particular issue, I think it's good to let runtime control platform resolution in one place.

Thefrank commented 3 days ago

@jkotas When doing a native build, turning off AOT for ILC (and other items?) via /p:PublishAot=false /p:UseNativeAotForComponents=false allows the building of ILCompiler without a prior version. While it runs, it does not appear to create usable binaries:

  oob -> Trimming freebsd-x64 out-of-band assemblies with ILLinker...
  ##vso[task.setvariable variable=_librariesBuildProducedPackages]true
  Microsoft.DotNet.ILCompiler -> /root/runtime/artifacts/packages/Release/Shipping/Microsoft.DotNet.ILCompiler.9.0.0-preview.7.24364.99.nupkg
  Microsoft.DotNet.ILCompiler -> /root/runtime/artifacts/packages/Release/Shipping/runtime.freebsd-x64.Microsoft.DotNet.ILCompiler.9.0.0-preview.7.24364.99.nupkg
  Microsoft.NETCore.App.Ref -> 
  The package Microsoft.NETCore.App.Ref.9.0.0-preview.7.24364.99 is missing a readme. Go to https://aka.ms/nuget/authoring-best-practices/readme to learn why package readmes are important.
  Successfully created package '/root/runtime/artifacts/packages/Release/Shipping/Microsoft.NETCore.App.Ref.9.0.0-preview.7.24364.99.nupkg'.
  Successfully created package '/root/runtime/artifacts/packages/Release/Shipping/Microsoft.NETCore.App.Ref.9.0.0-preview.7.24364.99.symbols.nupkg'.
  Microsoft.NETCore.App.Host -> 
  /root/runtime/artifacts/obj/Microsoft.NETCore.App.Host/Release/net9.0/freebsd-x64/output/ -> /root/runtime/artifacts/packages/Release/Shipping//dotnet-apphost-pack-9.0.0-preview.7.24364.99-freebsd-x64.tar.gz
  The package Microsoft.NETCore.App.Host.freebsd-x64.9.0.0-preview.7.24364.99 is missing a readme. Go to https://aka.ms/nuget/authoring-best-practices/readme to learn why package readmes are important.
  Successfully created package '/root/runtime/artifacts/packages/Release/Shipping/Microsoft.NETCore.App.Host.freebsd-x64.9.0.0-preview.7.24364.99.nupkg'.
  Successfully created package '/root/runtime/artifacts/packages/Release/Shipping/Microsoft.NETCore.App.Host.freebsd-x64.9.0.0-preview.7.24364.99.symbols.nupkg'.
  crossgen2_publish -> /root/runtime/artifacts/bin/crossgen2_publish/x64/Release/crossgen2.dll
  Generating native code
  Segmentation fault (core dumped)
/root/runtime/artifacts/bin/coreclr/freebsd.x64.Release/build/Microsoft.NETCore.Native.targets(309,5): error MSB3073: The command ""/root/runtime/artifacts/bin/coreclr/freebsd.x64.Release/ilc-published/ilc" @"/root/runtime/artifacts/obj/coreclr/crossgen2_publish/freebsd.x64.Release/native/crossgen2.ilc.rsp"" exited with code 139. [/root/runtime/src/coreclr/tools/aot/crossgen2/crossgen2_publish.csproj]
##vso[task.logissue type=error;sourcepath=/root/runtime/artifacts/bin/coreclr/freebsd.x64.Release/build/Microsoft.NETCore.Native.targets;linenumber=309;columnnumber=5;code=MSB3073;](NETCORE_ENGINEERING_TELEMETRY=Build) The command ""/root/runtime/artifacts/bin/coreclr/freebsd.x64.Release/ilc-published/ilc" @"/root/runtime/artifacts/obj/coreclr/crossgen2_publish/freebsd.x64.Release/native/crossgen2.ilc.rsp"" exited with code 139.

Build FAILED.

/root/runtime/artifacts/bin/coreclr/freebsd.x64.Release/build/Microsoft.NETCore.Native.targets(309,5): error MSB3073: The command ""/root/runtime/artifacts/bin/coreclr/freebsd.x64.Release/ilc-published/ilc" @"/root/runtime/artifacts/obj/coreclr/crossgen2_publish/freebsd.x64.Release/native/crossgen2.ilc.rsp"" exited with code 139. [/root/runtime/src/coreclr/tools/aot/crossgen2/crossgen2_publish.csproj]
    0 Warning(s)
    1 Error(s)

Time Elapsed 00:24:23.00
Build failed with exit code 1. Check errors above.

The segmentation fault is SIGSEGV: address not mapped to object (fault address: 0x0). This occurs without parameters too.

I will create another Issue for when I figure out what is going on.
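For reference, the exit code 139 in the MSB3073 error above matches the segmentation fault: shells such as sh and bash report a child killed by a signal as 128 plus the signal number, and 139 - 128 = 11, which is SIGSEGV on both Linux and FreeBSD. A small sketch of that arithmetic (the function name is made up for illustration):

```shell
# Hypothetical helper: print the signal number encoded in a shell exit status,
# or nothing if the status does not indicate death by signal.
signal_from_status() {
  status=$1
  if [ "$status" -gt 128 ]; then
    echo $((status - 128))
  fi
}

signal_from_status 139   # prints 11 (SIGSEGV)
```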

jkotas commented 3 days ago

/root/runtime/artifacts/bin/coreclr/freebsd.x64.Release/ilc-published/ilc is published as PublishSingleFile. My guess is that PublishSingleFile is broken on FreeBSD. If it is the case, you should be able to reproduce this crash by publishing a hello world console app with PublishSingleFile too.

You can try turning off PublishSingleFile / PublishReadyToRun / PublishTrimmed in https://github.com/dotnet/runtime/blob/main/src/coreclr/tools/aot/ILCompiler/ILCompiler.csproj and see whether it fixes the crash.

Thefrank commented 3 days ago

A simple app builds and runs OK. Steps if I missed something:

./dotnet --version
9.0.100-preview.5.24307.3
---
./dotnet new console -o /root/test
The template "Console App" was created successfully.

Processing post-creation actions...
Restoring /root/test/test.csproj:
Restore succeeded.
 ---
./dotnet publish /root/test/test.csproj /p:PublishSingleFile=true /p:PublishReadyToRun=true /p:PublishTrimmed=true
Restore complete (4.0s)
You are using a preview version of .NET. See: https://aka.ms/dotnet-support-policy
  test succeeded (21.4s) → /root/test/bin/Release/net9.0/freebsd-x64/publish/

Build succeeded in 27.7s
---
ls /root/test/bin/Release/net9.0/freebsd-x64/publish
test            test.pdb
---
/root/test/bin/Release/net9.0/freebsd-x64/publish/test 
Hello, World!

Same results with combinations of /p:PublishSingleFile=true, /p:PublishReadyToRun=true, and /p:PublishTrimmed=true. No issues.

However, /p:PublishSingleFile=true /p:PublishAOT=true builds fine, but the resulting binary crashes:

./dotnet publish -c Release /root/test/test.csproj /p:PublishSingleFile=true /p:PublishAOT=true
Restore complete (6.8s)
You are using a preview version of .NET. See: https://aka.ms/dotnet-support-policy
  test succeeded (11.1s) → /root/test/bin/Release/net9.0/freebsd-x64/publish/

Build succeeded in 20.4s
/root/test/bin/Release/net9.0/freebsd-x64/publish/test
Segmentation fault (core dumped)

/p:PublishAOT=true alone also builds without issue, but the binary crashes when run.

am11 commented 3 days ago

Do PublishSingleFile alone and PublishReadyToRun alone also run? Note that you will need to delete the obj/ dir in between or pass -t:Rebuild (changing command-line args alone doesn't evict MSBuild's cache).

Thefrank commented 3 days ago

Yes

./dotnet publish -c Release /root/test/test.csproj /p:PublishSingleFile=true -t:Rebuild && /root/test/bin/Release/net9.0/freebsd-x64/publish/test
Restore complete (2.1s)
You are using a preview version of .NET. See: https://aka.ms/dotnet-support-policy
  test succeeded (4.3s) → /root/test/bin/Release/net9.0/freebsd-x64/publish/

Build succeeded in 9.5s
Hello, World!
./dotnet publish -c Release /root/test/test.csproj /p:PublishReadyToRun=true -t:Rebuild && /root/test/bin/Release/net9.0/freebsd-x64/publish/test
Restore complete (2.5s)
You are using a preview version of .NET. See: https://aka.ms/dotnet-support-policy
  test succeeded (6.6s) → /root/test/bin/Release/net9.0/freebsd-x64/publish/

Build succeeded in 11.5s
Hello, World!

PublishReadyToRun required me to add the SDK location to /etc/dotnet/install_location.
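A sketch of that step, for anyone reproducing it: the file is expected to hold the install directory on a single line. The helper name and the /usr/local/share/dotnet path below are assumptions for illustration, not taken from this thread; substitute the directory that contains the dotnet executable.

```shell
# Hypothetical helper: record a .NET install directory in an install_location file.
# $1 = directory containing the dotnet executable
# $2 = file to write (defaults to the path mentioned above)
write_install_location() {
  dotnet_root=$1
  location_file=${2:-/etc/dotnet/install_location}
  mkdir -p "$(dirname "$location_file")" &&
    printf '%s\n' "$dotnet_root" > "$location_file"
}

# Usage (run as root; the path is an example only):
#   write_install_location /usr/local/share/dotnet
```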

am11 commented 1 day ago

The issue is that we are trying to restore the linux-x64 apphost for PublishSingleFile. https://github.com/dotnet/runtime/pull/105004 switches PublishSingleFile to use the live host. @Thefrank could you give it a try?

Thefrank commented 1 day ago

@am11 yep! That allows crossbuild on Linux to output a FreeBSD ELF for ILCompiler. I will open another issue to address the output from ILC