dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

System.IO.Ports.SerialPort not working on linux-musl-arm #104710

Closed: cskowronnek closed this 3 months ago

cskowronnek commented 4 months ago

Description

After waiting for the fix in #63187, I gave it a try, but unfortunately it fails with:

Unable to load shared library 'libSystem.IO.Ports.Native' or one of its dependencies. In order to help diagnose loading problems, consider using a tool like strace. If you're using glibc, consider setting the LD_DEBUG environment variable:
Error loading shared library ld-linux-armhf.so.3: No such file or directory (needed by /app/libSystem.IO.Ports.Native.so)
Error loading shared library /usr/share/dotnet/shared/Microsoft.NETCore.App/9.0.0-preview.6.24327.7/libSystem.IO.Ports.Native.so: No such file or directory
Error loading shared library libSystem.IO.Ports.Native.so: No such file or directory
Error loading shared library /app/liblibSystem.IO.Ports.Native.so: No such file or directory
Error loading shared library /usr/share/dotnet/shared/Microsoft.NETCore.App/9.0.0-preview.6.24327.7/liblibSystem.IO.Ports.Native.so: No such file or directory
Error loading shared library liblibSystem.IO.Ports.Native.so: No such file or directory
Error loading shared library /app/libSystem.IO.Ports.Native: No such file or directory

Dockerfile:

FROM mcr.microsoft.com/dotnet/sdk:9.0.100-preview.6-alpine3.20 AS build-env

ARG DOTNET_PLATFORM

WORKDIR /app

COPY *.csproj ./

RUN dotnet restore -r linux-musl-arm

COPY . ./
RUN dotnet publish -c Release -o out -r linux-musl-arm

FROM mcr.microsoft.com/dotnet/runtime:9.0.0-preview.6-alpine3.20-arm32v7
WORKDIR /app

COPY --from=build-env /app/out ./

ENTRYPOINT ["dotnet", "controller_module.dll"]

csproj

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net9.0</TargetFramework>
    <RootNamespace>Draeger.RSS.RentalRobot.LockerController</RootNamespace>
    <Nullable>enable</Nullable>
    <LangVersion>8.0</LangVersion>
    <WarningsAsErrors>nullable</WarningsAsErrors>
  </PropertyGroup>

  <ItemGroup>
    <ProjectCapability Include="AzureIoTEdgeModule" />
  </ItemGroup>

  <ItemGroup>
    <PackageReference Include="System.IO.Ports" Version="9.0.0-preview.6.24327.7" />
  </ItemGroup>
</Project>

Logs

{"@t":"2024-07-10T09:05:46.9072401Z","@m":"Writing to serial port, attempt 1 of 3.","@i":"ff057a1a","@l":"Debug","SourceContext":"Controller.LockerDriver","iotHubDeviceId":"SOCP2001","iotHubName":"iothub","moduleName":"controller_module","DeployableName":"ControllerIotModule","DeployableType":"IotModule"}
{"@t":"2024-07-10T09:05:46.9180450Z","@m":"Opening serial port...","@i":"6dd64a39","@l":"Debug","SourceContext":"Controller.LockerDriver","iotHubDeviceId":"SOCP2001-21011111","iotHubName":"iothub","moduleName":"controller_module","DeployableName":"ControllerIotModule","DeployableType":"IotModule"}
{"@t":"2024-07-10T09:05:46.9424206Z","@m":"Unexpected error during open locker request, Unable to load shared library 'libSystem.IO.Ports.Native' or one of its dependencies. In order to help diagnose loading problems, consider using a tool like strace. If you're using glibc, consider setting the LD_DEBUG environment variable: \nError loading shared library ld-linux-armhf.so.3: No such file or directory (needed by /app/libSystem.IO.Ports.Native.so)\nError loading shared library /usr/share/dotnet/shared/Microsoft.NETCore.App/9.0.0-preview.6.24327.7/libSystem.IO.Ports.Native.so: No such file or directory\nError loading shared library libSystem.IO.Ports.Native.so: No such file or directory\nError loading shared library /app/liblibSystem.IO.Ports.Native.so: No such file or directory\nError loading shared library /usr/share/dotnet/shared/Microsoft.NETCore.App/9.0.0-preview.6.24327.7/liblibSystem.IO.Ports.Native.so: No such file or directory\nError loading shared library liblibSystem.IO.Ports.Native.so: No such file or directory\nError loading shared library /app/libSystem.IO.Ports.Native: No such file or directory\nError loading shared library /usr/share/dotnet/shared/Microsoft.NETCore.App/9.0.0-preview.6.24327.7/libSystem.IO.Ports.Native: No such file or directory\nError loading shared library libSystem.IO.Ports.Native: No such file or directory\nError loading shared library /app/liblibSystem.IO.Ports.Native: No such file or directory\nError loading shared library /usr/share/dotnet/shared/Microsoft.NETCore.App/9.0.0-preview.6.24327.7/liblibSystem.IO.Ports.Native: No such file or directory\nError loading shared library liblibSystem.IO.Ports.Native: No such file or directory\n","@i":"aedd1ac1","@l":"Error","SourceContext":"Controller.Domain.OpenLockerUseCase","iotHubDeviceId":"SOCP2001","iotHubName":"iothub","moduleName":"controller_module","DeployableName":"ControllerIotModule","DeployableType":"IotModule"}

Reproduction Steps

Build with the newest System.IO.Ports package for linux-musl-arm, then open a serial port.

Expected behavior

The native library loads and the serial port works as expected.

Actual behavior

The native library cannot be loaded.

Regression?

No response

Known Workarounds

No response

Configuration

No response

Other information

No response

cskowronnek commented 4 months ago

I'm one step further: I needed to copy /lib/ld-linux-armhf.so.3 to /app to get it working. Is there a directive to add /lib to the search path, or do I need to copy the file in the Dockerfile?

janvorli commented 4 months ago

> I needed to copy /lib/ld-linux-armhf.so.3 to /app to get it working.

That means the shared library is still compiled for glibc, not for musl. That file provides an emulation layer on musl-based distros. I think adding /lib to LD_LIBRARY_PATH may also fix the problem without any copying. While that may be sufficient as a hotfix, we need to fix the actual issue.

@wfurt it seems that your fix #92145 wasn't sufficient.

jeffhandley commented 3 months ago

@wfurt Can you take a look at this for 9.0.0 still, or would you like help from @krwq?

wfurt commented 3 months ago

I was able to take a look, and it seems like an SDK problem to me. Here is what I see with a trivial serial test:

~/serial # ~/dotnet/dotnet restore -r linux-musl-arm
Restore complete (29.1s)

Build succeeded in 31.2s
~/serial # ~/dotnet/dotnet publish --sc  -r linux-musl-arm
Restore complete (6.6s)
You are using a preview version of .NET. See: https://aka.ms/dotnet-support-policy
  serial succeeded (37.1s) → bin/Release/net9.0/linux-musl-arm/publish/

and ldd fails for me with:

~/serial # ldd bin/Release/net9.0/linux-musl-arm/libSystem.IO.Ports.Native.so
      /lib/ld-musl-armhf.so.1 (0x76eb6000)
      libc.so.6 => /lib/ld-musl-armhf.so.1 (0x76eb6000)
Error loading shared library ld-linux-armhf.so.3: No such file or directory (needed by bin/Release/net9.0/linux-musl-arm/libSystem.IO.Ports.Native.so)
Error relocating bin/Release/net9.0/linux-musl-arm/libSystem.IO.Ports.Native.so: __ioctl_time64: symbol not found

now here is the interesting part:

~/serial # md5sum   bin/Release/net9.0/linux-musl-arm/libSystem.IO.Ports.Native.so
34a0b099e16105a70fb7e1ba9fa87ebf  bin/Release/net9.0/linux-musl-arm/libSystem.IO.Ports.Native.so

~/serial # find ~/.nuget/packages/ -name libSystem.IO.Ports.Native.so  | xargs md5sum
da1d15431229c1dff1091cb76c956076  /root/.nuget/packages/runtime.linux-musl-x64.runtime.native.system.io.ports/9.0.0-preview.6.24327.7/runtimes/linux-musl-x64/native/libSystem.IO.Ports.Native.so
daef2eb30d57f9857b917b091434f67a  /root/.nuget/packages/runtime.android-arm64.runtime.native.system.io.ports/9.0.0-preview.6.24327.7/runtimes/android-arm64/native/libSystem.IO.Ports.Native.so
969cd193e4aa326ec95be5cbd3fc4975  /root/.nuget/packages/runtime.linux-bionic-x64.runtime.native.system.io.ports/9.0.0-preview.6.24327.7/runtimes/linux-bionic-x64/native/libSystem.IO.Ports.Native.so
909ab0830d9d1979c8c7eca5f461cf10  /root/.nuget/packages/runtime.android-x86.runtime.native.system.io.ports/9.0.0-preview.6.24327.7/runtimes/android-x86/native/libSystem.IO.Ports.Native.so
86e77566d8000fc558da40218df8a6a7  /root/.nuget/packages/runtime.linux-musl-arm64.runtime.native.system.io.ports/9.0.0-preview.6.24327.7/runtimes/linux-musl-arm64/native/libSystem.IO.Ports.Native.so
d2164712b52dc33ae9553916b3115861  /root/.nuget/packages/runtime.linux-bionic-arm64.runtime.native.system.io.ports/9.0.0-preview.6.24327.7/runtimes/linux-bionic-arm64/native/libSystem.IO.Ports.Native.so
c2f86ecf760367e82210ef19877dbd71  /root/.nuget/packages/runtime.linux-musl-arm.runtime.native.system.io.ports/9.0.0-preview.6.24327.7/runtimes/linux-musl-arm/native/libSystem.IO.Ports.Native.so
a63242da99a8fc64a4cf1036f9db31f8  /root/.nuget/packages/runtime.linux-arm.runtime.native.system.io.ports/8.0.0/runtimes/linux-arm/native/libSystem.IO.Ports.Native.so
34a0b099e16105a70fb7e1ba9fa87ebf  /root/.nuget/packages/runtime.linux-arm.runtime.native.system.io.ports/9.0.0-preview.6.24327.7/runtimes/linux-arm/native/libSystem.IO.Ports.Native.so

So the build pulls in the linux-arm version (same hash, 34a0b099e16105a70fb7e1ba9fa87ebf) even though the RID was specified explicitly.
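The md5sum comparison can also be scripted. The following is a minimal Python sketch, assuming the default NuGet cache layout under ~/.nuget/packages; `file_md5` and `find_source_packages` are hypothetical helper names, not part of any .NET tooling.

```python
import hashlib
from pathlib import Path


def file_md5(path: Path) -> str:
    """Return the hex MD5 digest of a file (used only to compare copies)."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def find_source_packages(published: Path, cache_root: Path) -> list[str]:
    """List package directories in the cache whose copy of the file has the
    same MD5 digest as the published one."""
    wanted = file_md5(published)
    matches = []
    for candidate in sorted(cache_root.rglob(published.name)):
        if candidate.is_file() and file_md5(candidate) == wanted:
            # The first path component below the cache root is the package id.
            matches.append(candidate.relative_to(cache_root).parts[0])
    return matches


if __name__ == "__main__":
    # Paths follow the session above; adjust them for your own project.
    published = Path("bin/Release/net9.0/linux-musl-arm/libSystem.IO.Ports.Native.so")
    cache = Path.home() / ".nuget" / "packages"
    if published.exists() and cache.exists():
        for package in find_source_packages(published, cache):
            print(package)
```

If the printed package id contains linux-arm rather than linux-musl-arm, the publish output contains the glibc build.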

~/serial # cp  /root/.nuget/packages/runtime.linux-musl-arm.runtime.native.system.io.ports/9.0.0-preview.6.24327.7/runtimes/linux-musl-arm/native/libSystem.IO.Ports.Native.so  bin/Release/net9.0/linux-musl-arm/libSystem.IO.Ports.Native.so
~/serial # ldd  bin/Release/net9.0/linux-musl-arm/libSystem.IO.Ports.Native.so
    /lib/ld-musl-armhf.so.1 (0x76e7d000)
    libc.musl-armv7.so.1 => /lib/ld-musl-armhf.so.1 (0x76e7d000)

~/serial # ./bin/Release/net9.0/linux-musl-arm/serial
Unhandled exception. System.UnauthorizedAccessException: Access to the port '/dev/null' is denied.
 ---> System.IO.IOException: Not a tty
   --- End of inner exception stack trace ---
   at System.IO.Ports.SafeSerialDeviceHandle.Open(String portName)
   at System.IO.Ports.SerialStream..ctor(String portName, Int32 baudRate, Parity parity, Int32 dataBits, StopBits stopBits, Int32 readTimeout, Int32 writeTimeout, Handshake handshake, Boolean dtrEnable, Boolean rtsEnable, Boolean _1, Byte _2)
   at System.IO.Ports.SerialPort.Open()

So when I force the correct package file, the app runs. (I don't have a serial port, so I use /dev/null as a placeholder, hence the "Not a tty" error.)

Perhaps @akoeplinger or @ViktorHofer would have an idea what is going on. This is different from #63187, where we did not have the binaries published. Here we do, but the build pulls in the wrong version.

jkotas commented 3 months ago

This is an authoring problem in the runtime.native.system.io.ports package v8.0. The runtime.native.system.io.ports.nuspec in the root of the package does not list the musl platform-specific packages. From runtime.native.system.io.ports.nuspec:

      <group targetFramework=".NETStandard2.0">
        <dependency id="runtime.linux-arm.runtime.native.System.IO.Ports" version="8.0.0" exclude="Build,Analyzers" />
        <dependency id="runtime.linux-arm64.runtime.native.System.IO.Ports" version="8.0.0" exclude="Build,Analyzers" />
        <dependency id="runtime.linux-x64.runtime.native.System.IO.Ports" version="8.0.0" exclude="Build,Analyzers" />
        <dependency id="runtime.osx-arm64.runtime.native.System.IO.Ports" version="8.0.0" exclude="Build,Analyzers" />
        <dependency id="runtime.osx-x64.runtime.native.System.IO.Ports" version="8.0.0" exclude="Build,Analyzers" />
      </group>

The v9.0 version of the package has this fixed:

      <group targetFramework=".NETStandard2.0">
        <dependency id="runtime.android-arm.runtime.native.System.IO.Ports" version="9.0.0-rc.1.24407.7" exclude="Build,Analyzers" />
        <dependency id="runtime.android-arm64.runtime.native.System.IO.Ports" version="9.0.0-rc.1.24407.7" exclude="Build,Analyzers" />
        <dependency id="runtime.android-x64.runtime.native.System.IO.Ports" version="9.0.0-rc.1.24407.7" exclude="Build,Analyzers" />
        <dependency id="runtime.android-x86.runtime.native.System.IO.Ports" version="9.0.0-rc.1.24407.7" exclude="Build,Analyzers" />
        <dependency id="runtime.linux-arm.runtime.native.System.IO.Ports" version="9.0.0-rc.1.24407.7" exclude="Build,Analyzers" />
        <dependency id="runtime.linux-arm64.runtime.native.System.IO.Ports" version="9.0.0-rc.1.24407.7" exclude="Build,Analyzers" />
        <dependency id="runtime.linux-bionic-arm64.runtime.native.System.IO.Ports" version="9.0.0-rc.1.24407.7" exclude="Build,Analyzers" />
        <dependency id="runtime.linux-bionic-x64.runtime.native.System.IO.Ports" version="9.0.0-rc.1.24407.7" exclude="Build,Analyzers" />
        <dependency id="runtime.linux-musl-arm.runtime.native.System.IO.Ports" version="9.0.0-rc.1.24407.7" exclude="Build,Analyzers" />
        <dependency id="runtime.linux-musl-arm64.runtime.native.System.IO.Ports" version="9.0.0-rc.1.24407.7" exclude="Build,Analyzers" />
        <dependency id="runtime.linux-musl-x64.runtime.native.System.IO.Ports" version="9.0.0-rc.1.24407.7" exclude="Build,Analyzers" />
        <dependency id="runtime.linux-x64.runtime.native.System.IO.Ports" version="9.0.0-rc.1.24407.7" exclude="Build,Analyzers" />
        <dependency id="runtime.maccatalyst-arm64.runtime.native.System.IO.Ports" version="9.0.0-rc.1.24407.7" exclude="Build,Analyzers" />
        <dependency id="runtime.maccatalyst-x64.runtime.native.System.IO.Ports" version="9.0.0-rc.1.24407.7" exclude="Build,Analyzers" />
        <dependency id="runtime.osx-arm64.runtime.native.System.IO.Ports" version="9.0.0-rc.1.24407.7" exclude="Build,Analyzers" />
        <dependency id="runtime.osx-x64.runtime.native.System.IO.Ports" version="9.0.0-rc.1.24407.7" exclude="Build,Analyzers" />
      </group>

akoeplinger commented 3 months ago

@jkotas the 9.0.0-preview.6.24327.7 version of the package which @wfurt used does have that change though...

akoeplinger commented 3 months ago

Yeah, this looks like an SDK bug to me. The ResolvePackageAssets task resolves both the linux-arm64 and the linux-musl-arm64 file:

[screenshot of the ResolvePackageAssets task output]

Then the ResolvePackageFileConflicts task just picks one of them and it picks the wrong one:

Encountered conflict between 'CopyLocal:/Users/alexander/.nuget/packages/runtime.linux-arm64.runtime.native.system.io.ports/9.0.0-preview.6.24327.7/runtimes/linux-arm64/native/libSystem.IO.Ports.Native.so' and 'CopyLocal:/Users/alexander/.nuget/packages/runtime.linux-musl-arm64.runtime.native.system.io.ports/9.0.0-preview.6.24327.7/runtimes/linux-musl-arm64/native/libSystem.IO.Ports.Native.so'. Choosing 'CopyLocal:/Users/alexander/.nuget/packages/runtime.linux-arm64.runtime.native.system.io.ports/9.0.0-preview.6.24327.7/runtimes/linux-arm64/native/libSystem.IO.Ports.Native.so' arbitrarily as both items are copy-local and have equal file and assembly versions.

akoeplinger commented 3 months ago

Actually wait, this is probably working as designed. The RID graph considers linux-musl-arm64 -> linux-arm64 -> linux, and the packages are written so that runtime.native.system.io.ports depends on all of the RID-specific packages. So it makes sense that the SDK selects assets from both packages: in runtime.linux-arm64.runtime.native.System.IO.Ports it doesn't find a linux-musl-arm64 asset and falls back to the linux-arm64 one.
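The fallback described here can be illustrated with a small simulation. This is a minimal Python sketch, not the SDK's actual algorithm: the fallback chain is the simplified one quoted in the comment, and `pick_asset`, `resolve`, and the package contents are hypothetical stand-ins for the per-package asset selection.

```python
# Simplified fallback chain from the comment above, most specific RID first.
FALLBACK_CHAIN = ["linux-musl-arm64", "linux-arm64", "linux"]

# Hypothetical model: package id -> {RID folder under runtimes/: native file}.
PACKAGES = {
    "runtime.linux-arm64.runtime.native.System.IO.Ports": {
        "linux-arm64": "libSystem.IO.Ports.Native.so",
    },
    "runtime.linux-musl-arm64.runtime.native.System.IO.Ports": {
        "linux-musl-arm64": "libSystem.IO.Ports.Native.so",
    },
}


def pick_asset(assets_by_rid, chain):
    """Return (rid, file) for the first RID in the chain the package has assets for."""
    for rid in chain:
        if rid in assets_by_rid:
            return rid, assets_by_rid[rid]
    return None


def resolve(packages, chain):
    """Select assets for each package independently of the other packages."""
    selected = {}
    for package_id, assets in packages.items():
        choice = pick_asset(assets, chain)
        if choice is not None:
            selected[package_id] = choice
    return selected


selected = resolve(PACKAGES, FALLBACK_CHAIN)
for package_id, (rid, file_name) in selected.items():
    print(f"{package_id}: {rid}/{file_name}")
```

In this model both packages contribute a file with the same name, which is the conflict that ResolvePackageFileConflicts then has to break.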

akoeplinger commented 3 months ago

I see two ways to fix this:
1) Move to the runtime.json approach, like we do for some other packages.
2) Use only a single runtime.linux-arm64.runtime.native.System.IO.Ports package that contains both glibc and musl assets in its runtimes directory, so the SDK selects the correct one.

@ericstj thoughts?

akoeplinger commented 3 months ago

Oh, and we have the same issue when publishing for linux-bionic-arm64, which we added in https://github.com/dotnet/runtime/pull/95749, but most users publish for Android RIDs and bionic is compatible with that, so it wasn't noticed.

jkotas commented 3 months ago

> a single runtime.linux-arm64.runtime.native.System.IO.Ports package that contains both glibc and musl assets

Would this create new joins in the build?

I think runtime.json is the only solution that works well for these types of packages. I know we have tried to migrate away from runtime.json since it is an undocumented feature, but the alternatives do not work.

akoeplinger commented 3 months ago

> Would this create new joins in the build?

I think so, good point.

> I think runtime.json is the only solution that works well for these types of packages

One case where it doesn't work is if you run just a dotnet build since there's no RID set and so the native libraries don't get copied to the output directory. That's not a problem for the other packages that currently use runtime.json since we expect a RID to be set in those scenarios, but that's not the case here I think.

akoeplinger commented 3 months ago

I found another option. We can add this target to the build:

  <Target Name="FixPackageConflict" AfterTargets="ResolveTargetingPackAssets" Condition="'$(RuntimeIdentifier)' != ''">
    <PropertyGroup>
      <PackageConflictPreferredPackages>$(PackageConflictPreferredPackages);runtime.$(RuntimeIdentifier).runtime.native.system.io.ports</PackageConflictPreferredPackages>
    </PropertyGroup>
  </Target>

This causes conflict resolution to prefer the package that matches the specified RuntimeIdentifier.

ericstj commented 3 months ago

Runtime.json doesn't work at all for framework-dependent apps. It's broken by design, not just undocumented.

Can we add an empty folder to runtime.linux-arm64.runtime.native.system.io.ports as follows:

runtimes/linux-musl-arm64/native/_._

That will ensure the glibc asset is never chosen when someone builds for musl. Probably all of the runtime packages for glibc RIDs need this. It's basically a hack to avoid the join while getting the same behavior as if we had joined into a single package.
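To see why the placeholder helps, here is a minimal Python sketch of a simplified model in which assets are selected per package along the fallback chain linux-musl-arm64 -> linux-arm64 -> linux. It assumes, per the proposal above, that a folder containing only `_._` counts as a match that contributes no files. `select_native_files` and the package dictionaries are hypothetical and do not reproduce the real selection logic.

```python
PLACEHOLDER = "_._"
MUSL_CHAIN = ["linux-musl-arm64", "linux-arm64", "linux"]
GLIBC_CHAIN = ["linux-arm64", "linux"]


def select_native_files(assets_by_rid, chain):
    """Return the files from the first RID folder in the chain that the package has.

    A folder holding only the placeholder still counts as a match, so the
    search stops there, but it contributes no files.
    """
    for rid in chain:
        if rid in assets_by_rid:
            return [f for f in assets_by_rid[rid] if f != PLACEHOLDER]
    return []


# The glibc package as it is today, and with the proposed empty musl folder.
glibc_package_today = {"linux-arm64": ["libSystem.IO.Ports.Native.so"]}
glibc_package_with_placeholder = {
    "linux-arm64": ["libSystem.IO.Ports.Native.so"],
    "linux-musl-arm64": [PLACEHOLDER],
}

before = select_native_files(glibc_package_today, MUSL_CHAIN)
after = select_native_files(glibc_package_with_placeholder, MUSL_CHAIN)
print("musl build, without placeholder:", before)
print("musl build, with placeholder:", after)
print("glibc build, with placeholder:",
      select_native_files(glibc_package_with_placeholder, GLIBC_CHAIN))
```

In this model the glibc package stops contributing its library to musl builds, while glibc builds are unaffected.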

PS: nice workaround, @akoeplinger. Glad you discovered that.

akoeplinger commented 3 months ago

ohh I like that idea, that sounds really simple

akoeplinger commented 3 months ago

PR: https://github.com/dotnet/runtime/pull/106231

wfurt commented 2 months ago

Can you try the updated package, @cskowronnek?