dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

NativeAOT + classlib + macOS host + linux-bionic-arm64 target: doesn't work out of the box / inscrutable errors #101727

Closed jonpryor closed 5 months ago

jonpryor commented 6 months ago

Description

Trying to use NativeAOT to build a linux-bionic-arm64 shared library from macOS does not work out of the box.

Reproduction Steps

% dotnet new classlib -n HelloBionicSharedLib
% cd HelloBionicSharedLib
% dotnet publish -c Release -r linux-bionic-arm64 -p:PublishAotUsingRuntimePack=true -p:PublishAot=true

Expected behavior

Above dotnet publish should produce a .so file.

Actual behavior

…
clang : error : invalid linker name in argument '-fuse-ld=lld'
$HOME/.nuget/packages/microsoft.dotnet.ilcompiler/8.0.4/build/Microsoft.NETCore.Native.Unix.targets(236,5): error :
Symbol stripping tool ('llvm-objcopy' or 'objcopy') not found in PATH. Try installing appropriate package for llvm-objcopy or objcopy to resolve the problem or set the StripSymbols property to false to disable symbol stripping.

Yes, neither llvm-objcopy nor objcopy are in $PATH, but the first line "hides" the second; invalid linker name in argument '-fuse-ld=lld' looks like the "root" error, and doesn't really make sense by itself.

The second line is actually explanatory, if you get past reading the first. (Yes, reading comprehension is hard. Regardless, error messages should be actionable, and the first line is not actionable.)
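As a quick pre-flight check, the tools named in the two messages can be looked up on PATH before publishing. This is only a sketch: `check_tools` is a hypothetical helper, and the tool list (`clang`, `ld.lld`, `llvm-objcopy`, `objcopy`) is an assumption drawn from the messages in this report, not something the SDK documents.

```shell
# Report whether each named tool is reachable through PATH.
# Returns non-zero if at least one of them is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# The linker side: clang plus the lld it is asked to use (assumed names).
check_tools clang ld.lld || echo "linker tools incomplete"
# The stripping side: either name is acceptable per the message above.
check_tools llvm-objcopy >/dev/null || check_tools objcopy >/dev/null || echo "no objcopy variant found"
```

Running this in the failing environment would presumably list the objcopy variants as missing, which is the actionable part of the original output.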

If we fix that up:

% export PATH=$ANDROID_NDK_HOME/toolchains/llvm/prebuilt/darwin-x86_64/bin:$PATH

…then it still fails to build:

% dotnet publish -c Release -r linux-bionic-arm64 -p:PublishAotUsingRuntimePack=true -p:PublishAot=true
…
ld.lld : error : version script assignment of 'V1.0' to symbol '_init' failed: symbol not defined
ld.lld : error : version script assignment of 'V1.0' to symbol '_fini' failed: symbol not defined
clang-17 : error : linker command failed with exit code 1 (use -v to see invocation)
$HOME/.nuget/packages/microsoft.dotnet.ilcompiler/8.0.4/build/Microsoft.NETCore.Native.targets(367,5): error MSB3073: The command ""clang" "obj/Release/net8.0/linux-bionic-arm64/native/HelloBionicSharedLib.o" -o "bin/Release/net8.0/linux-bionic-arm64/native/HelloBionicSharedLib.so" -Wl,--version-script=obj/Release/net8.0/linux-bionic-arm64/native/HelloBionicSharedLib.exports -Wl,--export-dynamic -gz=zlib -fuse-ld=lld $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libbootstrapperdll.o $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libRuntime.WorkstationGC.a $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libeventpipe-disabled.a $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libstdc++compat.a $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libSystem.Native.a $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libSystem.Globalization.Native.a $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libSystem.IO.Compression.Native.a $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libSystem.Security.Cryptography.Native.OpenSsl.a --target=aarch64-linux-android21 -g -Wl,-rpath,'$ORIGIN' -Wl,--build-id=sha1 -Wl,--as-needed -Wl,-e0x0 -pthread -ldl -lz -llog -lm -shared -Wl,-z,relro -Wl,-z,now -Wl,--eh-frame-hdr -Wl,--discard-all -Wl,--gc-sections -Wl,-T,"obj/Release/net8.0/linux-bionic-arm64/native/sections.ld"" exited with code 1.

https://github.com/dotnet/runtime/issues/92272#issuecomment-1732352537 suggests adding the following snippet to the .csproj:

<ItemGroup Condition="'$(RuntimeIdentifier)' == 'linux-bionic'">
  <LinkerArg Include="-Wl,--defsym,_init=__libc_init" />
  <LinkerArg Include="-Wl,--defsym,_fini=__libc_fini" />
</ItemGroup>

This doesn't work because the $(RuntimeIdentifier) comparison is wrong, but if we fix that:

  <ItemGroup Condition="$(RuntimeIdentifier.StartsWith('linux-bionic'))">
    <LinkerArg Include="-Wl,--defsym,_init=__libc_init" />
    <LinkerArg Include="-Wl,--defsym,_fini=__libc_fini" />
  </ItemGroup>
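For context on why the first condition never fires: the publish command passes `-r linux-bionic-arm64`, so an exact comparison against `linux-bionic` is false, while a prefix test matches. A shell analogue of the two conditions, as a sketch (the function names are made up for illustration, and this is not how MSBuild itself evaluates conditions):

```shell
# Analogue of: '$(RuntimeIdentifier)' == 'linux-bionic'
rid_equals_bionic() {
  [ "$1" = "linux-bionic" ]
}

# Analogue of: $(RuntimeIdentifier.StartsWith('linux-bionic'))
rid_starts_with_bionic() {
  case "$1" in
    linux-bionic*) return 0 ;;
    *) return 1 ;;
  esac
}

rid="linux-bionic-arm64"
if rid_equals_bionic "$rid"; then echo "exact: match"; else echo "exact: no match"; fi
if rid_starts_with_bionic "$rid"; then echo "prefix: match"; else echo "prefix: no match"; fi
```

With the exact comparison the ItemGroup is skipped entirely, so the LinkerArg items never reach the link command.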

…then it still fails to build:

ld.lld : error : --defsym:1: symbol not found: __libc_fini
ld.lld : error : --defsym:1: symbol not found: __libc_init
ld.lld : error : --defsym:1: symbol not found: __libc_fini
ld.lld : error : --defsym:1: symbol not found: __libc_init
ld.lld : error : --defsym:1: symbol not found: __libc_fini
clang-17 : error : linker command failed with exit code 1 (use -v to see invocation)
$HOME/.nuget/packages/microsoft.dotnet.ilcompiler/8.0.4/build/Microsoft.NETCore.Native.targets(367,5): error MSB3073: The command ""clang" "obj/Release/net8.0/linux-bionic-arm64/native/HelloBionicSharedLib.o" -o "bin/Release/net8.0/linux-bionic-arm64/native/HelloBionicSharedLib.so" -Wl,--version-script=obj/Release/net8.0/linux-bionic-arm64/native/HelloBionicSharedLib.exports -Wl,--export-dynamic -Wl,--defsym,_init=__libc_init -Wl,--defsym,_fini=__libc_fini -gz=zlib -fuse-ld=lld $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libbootstrapperdll.o $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libRuntime.WorkstationGC.a $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libeventpipe-disabled.a $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libstdc++compat.a $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libSystem.Native.a $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libSystem.Globalization.Native.a $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libSystem.IO.Compression.Native.a $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libSystem.Security.Cryptography.Native.OpenSsl.a --target=aarch64-linux-android21 -g -Wl,-rpath,'$ORIGIN' -Wl,--build-id=sha1 -Wl,--as-needed -Wl,-e0x0 -pthread -ldl -lz -llog -lm -shared -Wl,-z,relro -Wl,-z,now -Wl,--eh-frame-hdr -Wl,--discard-all -Wl,--gc-sections -Wl,-T,"obj/Release/net8.0/linux-bionic-arm64/native/sections.ld"" exited with code 1.

Regression?

No.

Known Workarounds

A cause of the original ld.lld : error : version script assignment of 'V1.0' to symbol '_init' failed: symbol not defined is that none of the .a files used in the link command provide that symbol. Perhaps one of them should?

In the meantime, android-bionic.md has a Known issues section which suggests adding a -Wl,--undefined-version linker argument.

Update HelloBionicSharedLib.csproj to end with:

  <Import Project="HelloBionicSharedLib.targets" />

then add HelloBionicSharedLib.targets to the HelloBionicSharedLib directory with contents:

<Project>
  <ItemGroup Condition="$(RuntimeIdentifier.StartsWith('linux-bionic'))">
    <LinkerArg Include="-Wl,--undefined-version" />
  </ItemGroup>
</Project>

and the dotnet publish command now succeeds without error, producing a bin/Release/net8.0/linux-bionic-arm64/publish/HelloBionicSharedLib.so artifact.
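The .targets file from the workaround can also be generated from a script, which may be handy when trying this on several projects. A minimal sketch, assuming a hypothetical helper named `write_bionic_targets`; the file contents are the ones given above, and the `<Import>` line still has to be added to the .csproj by hand:

```shell
# Write the workaround .targets file into a project directory.
#   $1 = project directory, $2 = project name (without extension)
write_bionic_targets() {
  project_dir=$1
  project_name=$2
  # The quoted 'EOF' delimiter keeps the shell from expanding $(...),
  # so the MSBuild property function is written out literally.
  cat > "$project_dir/$project_name.targets" <<'EOF'
<Project>
  <ItemGroup Condition="$(RuntimeIdentifier.StartsWith('linux-bionic'))">
    <LinkerArg Include="-Wl,--undefined-version" />
  </ItemGroup>
</Project>
EOF
}

# Example (run from the HelloBionicSharedLib directory):
#   write_bionic_targets . HelloBionicSharedLib
```

The quoted here-document delimiter matters: with an unquoted `EOF`, the shell would try to run `RuntimeIdentifier.StartsWith(...)` as a command substitution.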

Configuration

.NET Configuration:

I have no idea if it's specific to this configuration.

Other information

No response

dotnet-policy-service[bot] commented 6 months ago

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

jonpryor commented 6 months ago

Related: attempting to follow the NativeAOT+Bionic instructions at https://github.com/dotnet/runtime/blob/017593d90781d797df8b2241f6d1f83c236c442b/src/coreclr/nativeaot/docs/android-bionic.md from macOS + x64 results in errors similar to those described above:

% dotnet new console -o HelloBionic --aot
% cd HelloBionic
# the xamarin-android NDK install directory; update as appropriate…
% export PATH=$HOME/android-toolchain/ndk/toolchains/llvm/prebuilt/darwin-x86_64/bin:$PATH
% dotnet publish -r linux-bionic-arm64 -p:DisableUnsupportedError=true -p:PublishAotUsingRuntimePack=true

Results in the errors:

ld.lld : error : version script assignment of 'V1.0' to symbol '_init' failed: symbol not defined
ld.lld : error : version script assignment of 'V1.0' to symbol '_fini' failed: symbol not defined
clang-17 : error : linker command failed with exit code 1 (use -v to see invocation)
$HOME/.nuget/packages/microsoft.dotnet.ilcompiler/8.0.4/build/Microsoft.NETCore.Native.targets(367,5): error MSB3073: The command ""clang" "obj/Release/net8.0/linux-bionic-arm64/native/HelloBionic.o" -o "bin/Release/net8.0/linux-bionic-arm64/native/HelloBionic" -Wl,--version-script=obj/Release/net8.0/linux-bionic-arm64/native/HelloBionic.exports -Wl,--export-dynamic -gz=zlib -fuse-ld=lld $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libbootstrapper.o $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libRuntime.WorkstationGC.a $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libeventpipe-disabled.a $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libstdc++compat.a $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libSystem.Native.a $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libSystem.IO.Compression.Native.a $HOME/.nuget/packages/microsoft.netcore.app.runtime.nativeaot.linux-bionic-arm64/8.0.4/runtimes/linux-bionic-arm64/native/libSystem.Security.Cryptography.Native.OpenSsl.a --target=aarch64-linux-android21 -g -Wl,-rpath,'$ORIGIN' -Wl,--build-id=sha1 -Wl,--as-needed -pthread -ldl -lz -llog -lm -pie -Wl,-pie -Wl,-z,relro -Wl,-z,now -Wl,--eh-frame-hdr -Wl,--discard-all -Wl,--gc-sections -Wl,-T,"obj/Release/net8.0/linux-bionic-arm64/native/sections.ld"" exited with code 1.

jonpryor commented 6 months ago

(Speaking of reading comprehension…)

android-bionic.md has a Known issues section, which has a much better & simpler fix:

<ItemGroup Condition="$(RuntimeIdentifier.StartsWith('linux-bionic'))">
  <LinkerArg Include="-Wl,--undefined-version" />
</ItemGroup>

jonpryor commented 6 months ago

I've updated the original report to take my previous comment into consideration.

MichalStrehovsky commented 6 months ago

Yes, neither llvm-objcopy nor objcopy are in $PATH, but the first line "hides" the second; invalid linker name in argument '-fuse-ld=lld' looks like the "root" error, and doesn't really make sense by itself.

What's your build logging verbosity? I believe the error message you're seeing comes from this line, since that's the line that precedes the objcopy checks:

https://github.com/dotnet/runtime/blob/71f8fb65a5a28018901823c19de57fe9451ab3b1/src/coreclr/nativeaot/BuildIntegration/Microsoft.NETCore.Native.Unix.targets#L288-L291

clang potentially erroring out is expected and handled; the logic just wants to know if it will fail. The output is logged with Low output importance, so normally shouldn't be visible. Or can you attach a binlog (pass -bl to the publish line and grab the resulting .binlog in the current directory) so we know for sure where it's coming from?

jonpryor commented 6 months ago

@MichalStrehovsky asked:

What's your build logging verbosity?

"Default"; no verbosity specified. All commands were specified in the original comment:

% dotnet new classlib -n HelloBionicSharedLib
% cd HelloBionicSharedLib
% dotnet publish -c Release -r linux-bionic-arm64 -p:PublishAotUsingRuntimePack=true -p:PublishAot=true

can you attach a binlog

Attached.

msbuild.binlog.zip

The output is logged with Low output importance

That's not quite true; this line: https://github.com/dotnet/runtime/blob/9b088ab8287a77c52ff7c4ed6fa96be6d3eb87f1/src/coreclr/nativeaot/BuildIntegration/Microsoft.NETCore.Native.Unix.targets#L214

sets Exec.StandardOutputImportance, which controls the importance of stdout.

The "offending" clang : error : invalid linker name in argument '-fuse-ld=lld' message is written to stderr.
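The stdout/stderr split can be reproduced outside MSBuild. In this sketch, `probe` is a hypothetical stand-in for the linker probe: the stdout line is a placeholder, and only the stderr line is taken from the output above. Redirecting one stream leaves the other untouched, which mirrors why a setting that applies only to stdout does not hide the clang message.

```shell
# Stand-in for the linker probe: one line on stdout, one on stderr.
probe() {
  echo "placeholder stdout line"
  echo "clang: error: invalid linker name in argument '-fuse-ld=lld'" >&2
}

# Hiding stdout only: the stderr line is still printed.
probe >/dev/null

# Hiding stderr only: the stdout line is still printed.
probe 2>/dev/null
```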

Additionally, it contains the string "error", which means that the default "error warning format" will detect this as an error.
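To illustrate the kind of matching involved, here is a rough approximation of an "origin: error [code]: text" check. The regular expression is my own simplification for illustration and is not MSBuild's actual pattern; `looks_like_diagnostic` is a hypothetical name.

```shell
# Succeeds when a line looks like "origin: error|warning [CODE]: text".
# This is an approximation only, not the pattern MSBuild really uses.
looks_like_diagnostic() {
  printf '%s\n' "$1" | grep -Eq '^[^:]+: *(error|warning)( +[A-Za-z]+[0-9]+)? *:'
}

for line in \
  "clang: error: invalid linker name in argument '-fuse-ld=lld'" \
  "  All projects are up-to-date for restore."
do
  if looks_like_diagnostic "$line"; then
    echo "diagnostic: $line"
  else
    echo "plain:      $line"
  fi
done
```

Under this approximation the clang line is classified as a diagnostic purely because of its shape, regardless of whether the calling logic treats the failure as expected.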

Locally, if I set Exec.IgnoreStandardErrorWarningFormat=true, then the clang: error: … line becomes a non-error message. However, it's still printed:

# with local patch to add IgnoreStandardErrorWarningFormat=true:
% dotnet publish -c Release -r linux-bionic-arm64 -p:PublishAotUsingRuntimePack=true -p:PublishAot=true
MSBuild version 17.9.8+b34f75857 for .NET
  Determining projects to restore...
  All projects are up-to-date for restore.
  HelloBionicSharedLib -> /private/tmp/HelloBionicSharedLib/bin/Release/net8.0/linux-bionic-arm64/HelloBionicSharedLib.dll
  clang: error: invalid linker name in argument '-fuse-ld=lld'
/Users/jon/.nuget/packages/microsoft.dotnet.ilcompiler/8.0.4/build/Microsoft.NETCore.Native.Unix.targets(236,5): error : Symbol stripping tool ('llvm-objcopy' or 'objcopy') not found in PATH. Try installing appropriate package for llvm-objcopy or objcopy to resolve the problem or set the StripSymbols property to false to disable symbol stripping. [/private/tmp/HelloBionicSharedLib/HelloBionicSharedLib.csproj]

If the idea is to entirely remove clang: error: … from the output at the default verbosity, then you also need to set StandardErrorImportance="Low", à la this patch:

diff --git a/src/coreclr/nativeaot/BuildIntegration/Microsoft.NETCore.Native.Unix.targets b/src/coreclr/nativeaot/BuildIntegration/Microsoft.NETCore.Native.Unix.targets
index 7441f7da0f6..575caae39ae 100644
--- a/src/coreclr/nativeaot/BuildIntegration/Microsoft.NETCore.Native.Unix.targets
+++ b/src/coreclr/nativeaot/BuildIntegration/Microsoft.NETCore.Native.Unix.targets
@@ -285,7 +285,7 @@ The .NET Foundation licenses this file to you under the MIT license.
     <Error Condition="'$(_WhereLinker)' != '0' and '$(CppCompilerAndLinkerAlternative)' == '' and '$(_IsApplePlatform)' != 'true'"
       Text="Requested linker ('$(CppLinker)') not found in PATH." />

-    <Exec Command="&quot;$(CppLinker)&quot; -fuse-ld=lld -Wl,--version" Condition="'$(LinkerFlavor)' == 'lld'" IgnoreExitCode="true" StandardOutputImportance="Low" ConsoleToMSBuild="true">
+    <Exec Command="&quot;$(CppLinker)&quot; -fuse-ld=lld -Wl,--version" Condition="'$(LinkerFlavor)' == 'lld'" IgnoreExitCode="true" StandardOutputImportance="Low" StandardErrorImportance="Low" IgnoreStandardErrorWarningFormat="true" ConsoleToMSBuild="true">
       <Output TaskParameter="ExitCode" PropertyName="_LinkerVersionStringExitCode" />
       <Output TaskParameter="ConsoleOutput" PropertyName="_LinkerVersionString" />
     </Exec>

MichalStrehovsky commented 6 months ago

Thank you! The binlog confirms that it comes from that line. The patch in your comment looks good to me - would you mind submitting a PR with it? I can do it myself, but I thought that since you already did the work, I shouldn't steal your credit.

jonpryor commented 6 months ago

@MichalStrehovsky: PR filed as: https://github.com/dotnet/runtime/pull/102000

am11 commented 6 months ago

This doesn't work because the $(RuntimeIdentifier) comparison is wrong, but if we fix that:

It was later corrected in https://github.com/dotnet/runtime/issues/92272#issuecomment-2041640782, and afterwards the whole thing was removed in https://github.com/dotnet/runtime/pull/100755. The docs suggest this workaround for .NET 8: https://github.com/dotnet/runtime/blob/0fb0188a137f3d53a2ebd719d7a684327938609a/src/coreclr/nativeaot/docs/android-bionic.md#known-issues

jonpryor commented 6 months ago

@am11: I later noticed that.

The current problem is that the clang: error: … output shouldn't be emitted by default, as it's a "useless" error message which misleads developers.

PR #102000 attempts to improve its behavior.

agocke commented 5 months ago

I'm closing as this is expected based on our documentation: we do not support cross-os publishing with Native AOT. https://learn.microsoft.com/en-us/dotnet/core/deploying/native-aot/cross-compile has the information.