dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Live apphost not being used by crossgen2 with source build #77667

Closed ayakael closed 1 year ago

ayakael commented 1 year ago

Description

.NET 7 in source-build does not currently use NativeAOT. Thus, when building Crossgen2 from source on platforms without apphost in nuget feeds, build of runtime fails with:

/var/build/dotnet7/testing/dotnet7-stage0/src/bootstrap/sdk/7.0.100/Sdks/Microsoft.NET.Sdk/targets/Microsoft.NET.Sdk.FrameworkReferenceResolution.targets(135,5): error NETSDK1084: There is no application host available for the specified RuntimeIdentifier 'linux-musl-x86'. [/var/build/dotnet7/testing/dotnet7-stage0/src/dotnet-e6dd91c290b808f971a1ac69c2fb29395bbf1051/src/runtime/src/coreclr/tools/aot/crossgen2/crossgen2.csproj]

Reproduction Steps

Cross-build from x64 to x86; the crossgen2 build will fail.

Expected behavior

Crossgen2 should be able to use the live apphost built previously by runtime

Actual behavior

Build fails due to no apphost for that RID

Regression?

Per https://github.com/dotnet/runtime/issues/7335 and https://github.com/dotnet/runtime/issues/66866, there was a time when this was a non-issue. With the move to NativeAOT, UseLiveBuiltDotNetHost was dropped, and this use case seems to have been removed.

Known Workarounds

None for now. I attempted to build runtime in two steps, to no avail:

ROOTFS_DIR="$CBUILDROOT" ./build.sh $args -subset Clr.Native+Host.Native
ROOTFS_DIR="$CBUILDROOT" ./build.sh $args /p:AppHostSourcePath="$builddir"/src/runtime/artifacts/obj/linux-musl-$_dotnet_target.Release/apphost/standalone/apphost

Configuration

.NET 7.0.100-rtm.22519.39 Alpine Linux Edge x64 to x86 crossbuild

Other information

No response

dotnet-issue-labeler[bot] commented 1 year ago

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

am11 commented 1 year ago

@agocke might be able to help, but AFAIK, these days crossgen2 uses the live build of ilc (NativeAOT) when available, or falls back to the live build of corehost on platforms where NativeAOT is not available. The use of a prebuilt host (from a nuget package) was completely removed. Looks like you are seeing a different behavior on x86; something is still expecting a prebuilt host. Perhaps we are (unintentionally) skipping the live build host on x86?

Could you try git clean -xdf && ./build.sh clr+host+libs+packs just to be sure?

ayakael commented 1 year ago

Building just those subsets gives the same result.

am11 commented 1 year ago

In https://github.com/dotnet/runtime/commit/9a3626e16efb4fb1ae75d806bf66f0769bd842c1, MicrosoftNETCoreDotNetHostVersion etc. were removed so no prebuilt apphost gets restored. crossgen2 is published as a single-file using the live corehost build (when NativeAOT is not supported).

Maybe this patch will help:

--- a/src/coreclr/tools/aot/crossgen2/crossgen2.csproj
+++ b/src/coreclr/tools/aot/crossgen2/crossgen2.csproj
@@ -9,7 +9,7 @@
     <!-- Trimming is not currently working, but set the appropriate feature flags for NativeAOT -->
     <PublishTrimmed Condition="'$(NativeAotSupported)' == 'true'">true</PublishTrimmed>
     <RuntimeIdentifiers Condition="'$(RunningPublish)' != 'true' and '$(DotNetBuildFromSource)' != 'true'">linux-x64;linux-musl-x64;linux-arm;linux-musl-arm;linux-arm64;linux-musl-arm64;freebsd-x64;osx-x64;osx-arm64;win-x64;win-x86;win-arm64;win-arm</RuntimeIdentifiers>
-    <RuntimeIdentifiers Condition="'$(DotNetBuildFromSource)' == 'true'">$(PackageRID)</RuntimeIdentifiers>
+    <RuntimeIdentifiers Condition="'$(RunningPublish)' != 'true' and '$(DotNetBuildFromSource)' == 'true'">$(PackageRID)</RuntimeIdentifiers>
     <SelfContained>false</SelfContained>
     <SelfContained Condition="'$(RunningPublish)' == 'true'">true</SelfContained>
   </PropertyGroup>
agocke commented 1 year ago

This seems like it's specific to linux x86, right? I don't see any source build problems for other architectures.

ghost commented 1 year ago

Tagging subscribers to this area: @hoyosjs See info in area-owners.md if you want to be subscribed.

Issue Details
Author: ayakael
Assignees: -
Labels: `area-Infrastructure-coreclr`
Milestone: Future
ayakael commented 1 year ago

> This seems like it's specific to linux x86, right? I don't see any source build problems for other architectures.

Indeed. I believe this is specific to x86 as crossgen2 isn't built on other community platforms (like ppc64le and s390x) as they use mono.

> In 9a3626e, MicrosoftNETCoreDotNetHostVersion etc. were removed so no prebuilt apphost gets restored. crossgen2 is published as a single-file using the live corehost build (when NativeAOT is not supported).
>
> Maybe this patch will help:
>
> --- a/src/coreclr/tools/aot/crossgen2/crossgen2.csproj
> +++ b/src/coreclr/tools/aot/crossgen2/crossgen2.csproj
> @@ -9,7 +9,7 @@
>      <!-- Trimming is not currently working, but set the appropriate feature flags for NativeAOT -->
>      <PublishTrimmed Condition="'$(NativeAotSupported)' == 'true'">true</PublishTrimmed>
>      <RuntimeIdentifiers Condition="'$(RunningPublish)' != 'true' and '$(DotNetBuildFromSource)' != 'true'">linux-x64;linux-musl-x64;linux-arm;linux-musl-arm;linux-arm64;linux-musl-arm64;freebsd-x64;osx-x64;osx-arm64;win-x64;win-x86;win-arm64;win-arm</RuntimeIdentifiers>
> -    <RuntimeIdentifiers Condition="'$(DotNetBuildFromSource)' == 'true'">$(PackageRID)</RuntimeIdentifiers>
> +    <RuntimeIdentifiers Condition="'$(RunningPublish)' != 'true' and '$(DotNetBuildFromSource)' == 'true'">$(PackageRID)</RuntimeIdentifiers>
>      <SelfContained>false</SelfContained>
>      <SelfContained Condition="'$(RunningPublish)' == 'true'">true</SelfContained>
>    </PropertyGroup>

Unfortunately that patch didn't work. Very curious that it insists on the apphost not existing despite the apphost being built.
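
For reference, here is a rough shell model of what that patch changes. It is not MSBuild, only a sketch of the two conditional RuntimeIdentifiers lines, assuming nothing else in the project sets the property; the function name runtime_identifiers and the "(unset)" / "(built-in RID list)" placeholders are made up for illustration.

```shell
#!/bin/sh
# Models only the two <RuntimeIdentifiers> lines touched by the patch.
# $1 = RunningPublish, $2 = DotNetBuildFromSource, $3 = "before" or "after"
runtime_identifiers() {
    running_publish=$1
    from_source=$2
    variant=$3
    value="(unset)"

    # First line (unchanged by the patch): the hard-coded RID list.
    if [ "$running_publish" != "true" ] && [ "$from_source" != "true" ]; then
        value="(built-in RID list)"
    fi

    # Second line: the source-build case. After the patch it also
    # requires RunningPublish != true.
    if [ "$from_source" = "true" ]; then
        if [ "$variant" = "before" ] || [ "$running_publish" != "true" ]; then
            value='$(PackageRID)'
        fi
    fi

    printf '%s\n' "$value"
}

# Print all four combinations, before and after the patch.
for publish in false true; do
    for from_src in false true; do
        printf 'RunningPublish=%s DotNetBuildFromSource=%s before=%s after=%s\n' \
            "$publish" "$from_src" \
            "$(runtime_identifiers "$publish" "$from_src" before)" \
            "$(runtime_identifiers "$publish" "$from_src" after)"
    done
done
```

In this model the only combination that differs is RunningPublish=true with DotNetBuildFromSource=true: before the patch the property is $(PackageRID), after it the two lines leave it unset.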

ayakael commented 1 year ago

Is there a subset for only building and packaging apphost? I'm thinking a workaround might be to build apphost and push it to a local nuget feed via dotnet nuget push, which is then referenced by NuGet.config so that it gets pulled for the full build. -subset Clr.Native+Host.Native builds apphost but does not package it.

ayakael commented 1 year ago

Sigh. -subset Host packages apphost, but despite that, crossgen2 refuses to pull runtime.linux-musl-x86.Microsoft.NETCore.DotNetAppHost.7.0.0-rtm.22511.4.nupkg from the local NuGet feed.
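
The thread does not show the exact commands used for the local feed, so the following is only a sketch of one way such a feed could be set up. It is a dry run that prints the commands rather than executing them; the feed directory /tmp/local-nuget-feed, the source name local-apphost, and the function name print_feed_commands are all hypothetical, and the package path has to be supplied by the caller.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# FEED_DIR and the source name "local-apphost" are hypothetical.
FEED_DIR=${FEED_DIR:-/tmp/local-nuget-feed}

print_feed_commands() {
    pkg=$1
    # A folder can act as a NuGet feed; create it first.
    printf 'mkdir -p %s\n' "$FEED_DIR"
    # Copy the package into the folder feed.
    printf 'dotnet nuget push %s --source %s\n' "$pkg" "$FEED_DIR"
    # Register the folder as a package source in the NuGet configuration.
    printf 'dotnet nuget add source %s --name local-apphost\n' "$FEED_DIR"
}

print_feed_commands "runtime.linux-musl-x86.Microsoft.NETCore.DotNetAppHost.7.0.0-rtm.22511.4.nupkg"
```

Even with a feed registered this way, the restore still has to ask for that exact package ID and version, which may be where the attempt above fell short.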

am11 commented 1 year ago

I cannot test linux-musl-x86 atm as we don't have a prereq docker image handy (and build-rootfs is lacking its support too).

Does source-build fail for linux-x86 too, as it does for linux-musl-x86? Asking because the regular cross build for linux-x86 has no issue:

$ git clone https://github.com/dotnet/runtime -b release/7.0 --single-branch --depth 1

$ docker run -v$(pwd)/runtime:/runtime -e ROOTFS_DIR=/crossrootfs/x86 --rm \
    mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-cross-x86-linux-20221024215143-3fc5553 \
    /runtime/build.sh clr+libs+host+packs -arch x86 -cross

__DistroRid: linux-x86
... a few minutes later
Build succeeded.
ayakael commented 1 year ago

I could try to figure out a linux-x86 environment. I suspect it worked for you as you were not building with DotNetBuildFromSource=true, thus crossgen would not look for apphost given build with NativeAOT. The reason we build with that flag on Alpine is to weed out issues that would likely appear in source-build (like this one). I'm trying with DotNetBuildFromSource=false on linux-musl-x86 to confirm this.

am11 commented 1 year ago

> thus crossgen would not look for apphost given build with NativeAOT

NativeAOT doesn't support linux-x86. When NativeAOT does not support the target platform, crossgen2 is built with apphost (singlefilehost, to be specific).

ayakael commented 1 year ago
clr+libs+host+packs

Interesting, I didn't know that x86 uses singlefilehost. Indeed, building with DotNetBuildFromSource=false yields a successful build on linux-musl-x86. This confirms this as being principally a source-build issue.

ayakael commented 1 year ago

A workaround for now, I figure, is to build a Mono-flavored runtime so as to avoid building crossgen. That seems like a big detour for what looks like a bug caused by a misplaced if statement somewhere. I also want to check whether Mono will hit the RAM limits seen by other people when trying to build x86 from source.

ayakael commented 1 year ago

Forcing build of Mono-flavored runtime fails with:

  [100%] Linking C shared library libcoreclr.so
  /usr/bin/ld: skipping incompatible /var/build/sysroot-x86//usr/lib/libssp_nonshared.a when searching for -lssp_nonshared
  /usr/bin/ld: skipping incompatible /var/build/sysroot-x86//usr/lib/libgcc_s.so.1 when searching for libgcc_s.so.1
  /usr/bin/ld: skipping incompatible /var/build/sysroot-x86//usr/lib/libgcc_s.so.1 when searching for libgcc_s.so.1
  /usr/bin/ld: skipping incompatible /var/build/sysroot-x86//usr/lib/libc.so when searching for -lc
  /usr/bin/ld: skipping incompatible /var/build/sysroot-x86//usr/lib/libc.a when searching for -lc
  /usr/bin/ld: skipping incompatible /var/build/sysroot-x86//usr/lib/libgcc_s.so.1 when searching for libgcc_s.so.1
  /usr/bin/ld: skipping incompatible /var/build/sysroot-x86//usr/lib/libgcc_s.so.1 when searching for libgcc_s.so.1
  [100%] Built target monosgen-static
  [100%] Building C object mono/mini/CMakeFiles/mono-sgen.dir/main.c.o
  [100%] Linking C executable mono-sgen
  /usr/bin/ld: skipping incompatible /var/build/sysroot-x86//usr/lib/libssp_nonshared.a when searching for -lssp_nonshared
  /usr/bin/ld: skipping incompatible /var/build/sysroot-x86//usr/lib/libgcc_s.so.1 when searching for libgcc_s.so.1
  /usr/bin/ld: skipping incompatible /var/build/sysroot-x86//usr/lib/libgcc_s.so.1 when searching for libgcc_s.so.1
  [100%] Built target monosgen-shared
  /usr/bin/ld: skipping incompatible /var/build/sysroot-x86//usr/lib/libc.so when searching for -lc
  /usr/bin/ld: skipping incompatible /var/build/sysroot-x86//usr/lib/libc.a when searching for -lc
  /usr/bin/ld: skipping incompatible /var/build/sysroot-x86//usr/lib/libgcc_s.so.1 when searching for libgcc_s.so.1
  /usr/bin/ld: skipping incompatible /var/build/sysroot-x86//usr/lib/libgcc_s.so.1 when searching for libgcc_s.so.1
  /usr/bin/ld: libmonosgen-2.0.a(pal_icushim.c.o): in function `GlobalizationNative_LoadICU':
  pal_icushim.c:(.text+0x8f): undefined reference to `__dlsym_time64'
  /usr/bin/ld: pal_icushim.c:(.text+0xc9): undefined reference to `__dlsym_time64'
  /usr/bin/ld: pal_icushim.c:(.text+0x103): undefined reference to `__dlsym_time64'
  /usr/bin/ld: pal_icushim.c:(.text+0x13d): undefined reference to `__dlsym_time64'
  /usr/bin/ld: pal_icushim.c:(.text+0x177): undefined reference to `__dlsym_time64'
  /usr/bin/ld: libmonosgen-2.0.a(pal_icushim.c.o):pal_icushim.c:(.text+0x1b1): more undefined references to `__dlsym_time64' follow
  clang-15: error: linker command failed with exit code 1 (use -v to see invocation)
  make[2]: *** [mono/mini/CMakeFiles/mono-sgen.dir/build.make:194: mono/mini/mono-sgen] Error 1
  make[1]: *** [CMakeFiles/Makefile2:660: mono/mini/CMakeFiles/mono-sgen.dir/all] Error 2
  make: *** [Makefile:136: all] Error 2

Likely a toolchain.cmake issue, as we had to force a sysroot-based crossbuild because Alpine doesn't support -m32 compilation mode very well.
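
ld prints "skipping incompatible" when a library it finds was built for a different architecture than the link target. A quick way to see which ELF class the sysroot libraries actually have is to read the EI_CLASS byte of their headers. This is a minimal sketch: the function name elf_class is made up, and the library paths are simply the ones from the log above.

```shell
#!/bin/sh
# Report whether a file is a 32-bit or 64-bit ELF object.
elf_class() {
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not ELF"
        return 0
    fi
    # EI_CLASS is the byte at offset 4: 1 = ELF32, 2 = ELF64.
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case "$class" in
        1) echo "ELF32" ;;
        2) echo "ELF64" ;;
        *) echo "unknown" ;;
    esac
}

# Paths taken from the linker output above; skipped if they do not exist.
for lib in /var/build/sysroot-x86/usr/lib/libc.so \
           /var/build/sysroot-x86/usr/lib/libgcc_s.so.1; do
    if [ -e "$lib" ]; then
        printf '%s: %s\n' "$lib" "$(elf_class "$lib")"
    fi
done
```

If the sysroot libraries report ELF32 while the linker still skips them, the linker itself is probably being invoked for a 64-bit target, which would point back at the toolchain configuration.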

ayakael commented 1 year ago

This doesn't seem to be an issue anymore. I've crossbuilt version 7.0.102 to 8.0.100-preview.1 without having to deal with this.

I do have an issue with the build artifacts, though. I'm getting an execution failure when trying to run the resulting SDK. Running dotnet --info with the following SDK in linux-musl-x86 chroot yields a freeze, and then eventually a Failed to create CoreCLR, HRESULT: 0x8007000E. It seems like CoreCLR tries to allocate all of the available memory for some reason. @am11 I see that you got the build-rootfs.sh script working for linux-musl-x86, have you managed a successful crossbuild to linux-musl-x86?
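
For context, 0x8007000E is the standard out-of-memory HRESULT (E_OUTOFMEMORY), which fits the runaway allocation. A small sketch of how the value breaks down into its fields; the function name decode_hresult is made up for illustration.

```shell
#!/bin/sh
# Split an HRESULT into its severity, facility and code fields.
decode_hresult() {
    hr=$(( $1 ))
    severity=$(( (hr >> 31) & 1 ))      # bit 31: 1 = failure
    facility=$(( (hr >> 16) & 0x7ff ))  # bits 16-26: facility
    code=$(( hr & 0xffff ))             # bits 0-15: error code
    printf 'severity=%d facility=%d code=%d\n' "$severity" "$facility" "$code"
}

decode_hresult 0x8007000E
# prints: severity=1 facility=7 code=14
```

Facility 7 is FACILITY_WIN32 and code 14 is ERROR_OUTOFMEMORY, so the host is reporting that CoreCLR initialization ran out of memory rather than some unrelated startup failure.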

I tried stracing it, but could only attach to the process after execution had started. strace.txt

Edit: The following patchset is used to build the SDK: https://lab.ilot.io/ayakael/dotnet-stage0

am11 commented 1 year ago

@ayakael, I have only tried building runtime, didn't test the binaries.

> Running dotnet --info with the following SDK

I don't have permission to download it. :)

> Failed to create CoreCLR, HRESULT: 0x8007000E

BTW, were you testing under VM / baremetal install or QEMU? CoreCLR-PAL has some issues with QEMU (even on 64-bit archs).

ayakael commented 1 year ago

Whoops, here is a functional link: dotnet-sdk-8.0.100-preview.2.23157.25-r1-linux-musl-x86.tar.xz

I tested first within a chroot of alpine-x86 that was in LXC, and then within a chroot in a qemu based env. Both behave the same.

am11 commented 1 year ago

Opened #83509 for the different issue.

> This doesn't seem to be an issue anymore. I've crossbuilt version 7.0.102 to 8.0.100-preview.1 without having to deal with this.

Good to know that this issue no longer repros. Closing.