dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

The system cannot open the device or file specified 'NuGet-Migrations' #91987

Open lonix1 opened 10 months ago

lonix1 commented 10 months ago

Description

In a CI build, in the official sdk docker image, I run dotnet nuget locals all --clear, and get this:

RUN dotnet nuget locals all --clear
System.IO.IOException: The system cannot open the device or file specified. : 'NuGet-Migrations'
   at System.Threading.Mutex.CreateMutexCore(Boolean initiallyOwned, String name, Boolean& createdNew)
   at System.Threading.Mutex..ctor(Boolean initiallyOwned, String name)
   at NuGet.Common.Migrations.MigrationRunner.Run()
   at Microsoft.DotNet.Configurer.DotnetFirstTimeUseConfigurer.Configure()
   at Microsoft.DotNet.Cli.Program.ConfigureDotNetForFirstTimeUse(IFirstTimeUseNoticeSentinel firstTimeUseNoticeSentinel, IAspNetCertificateSentinel aspNetCertificateSentinel, IFileSentinel toolPathSentinel, Boolean isDotnetBeingInvokedFromNativeInstaller, DotnetFirstRunConfiguration dotnetFirstRunConfiguration, IEnvironmentProvider environmentProvider, Dictionary`2 performanceMeasurements)
   at Microsoft.DotNet.Cli.Program.ProcessArgs(String[] args, TimeSpan startupTime, ITelemetry telemetryClient)
   at Microsoft.DotNet.Cli.Program.Main(String[] args)
ERROR: process "/bin/sh -c dotnet nuget locals all --clear" did not complete successfully: exit code: 1

Reproduction Steps

inside the sdk container:

RUN \
  export DOTNET_SKIP_FIRST_TIME_EXPERIENCE=1 && \
  dotnet nuget locals all --clear

Expected behavior

No error

Actual behavior

Error

Regression?

Unknown

Known Workarounds

none

Configuration

mcr.microsoft.com/dotnet/sdk:7.0.400-bookworm-slim-amd64

Other information

Might be related to https://github.com/NuGet/Home/issues/12159 and https://github.com/dotnet/runtime/issues/80619, but those have been locked.

If I remove that offending line, I can build the docker image. If I then run that image and do something as simple as dotnet nuget it will give the same error as above.

{
  "ErrorMessage": "The system cannot open the device or file specified. : 'NuGet-Migrations'",
  "BuildRetry": false,
  "ErrorPattern": "",
  "ExcludeConsoleLog": false
}

Known issue validation

Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=426482
Error message validated: The system cannot open the device or file specified. : 'NuGet-Migrations'
Result validation: Known issue matched with the provided build.
Validation performed at: 10/4/2023 1:44:58 AM UTC

Report

| Build | Definition | Test | Pull Request |
|---|---|---|---|
| 759212 | dotnet/runtime | JIT.jit64.WorkItemExecution | dotnet/runtime#105256 |
| 751881 | dotnet/runtime | JIT/jit64/rtchecks/overflow/overflow02_sub/overflow02_sub.sh | dotnet/runtime#104958 |
| 746273 | dotnet/runtime | JIT/Directed/perffix/primitivevt/callconv3_il_d/callconv3_il_d.sh | dotnet/runtime#104958 |
| 743409 | dotnet/runtime | Loader.classloader.WorkItemExecution | dotnet/runtime#104985 |
| 742253 | dotnet/runtime | JIT/IL_Conformance/Old/Conformance_Base/bne_un_r8/bne_un_r8.sh | |
| 742005 | dotnet/runtime | JIT.jit64.opt.WorkItemExecution | |
| 741844 | dotnet/runtime | JIT.WorkItemExecution | |
| 741074 | dotnet/runtime | Loader.classloader.WorkItemExecution | dotnet/runtime#104892 |
| 739723 | dotnet/runtime | JIT.jit64.WorkItemExecution | |
| 739510 | dotnet/runtime | JIT.Regression.WorkItemExecution | dotnet/runtime#104565 |
| 739497 | dotnet/runtime | JIT.Generics.WorkItemExecution | |
| 739485 | dotnet/runtime | Loader.classloader.generics.WorkItemExecution | |
| 737840 | dotnet/runtime | JIT.Regression.CLR-x86-JIT.V1-M12-M13.WorkItemExecution | dotnet/runtime#104748 |
| 735380 | dotnet/runtime | JIT/SIMD/VectorDot_r/VectorDot_r.sh | |
| 735190 | dotnet/runtime | Loader.classloader.WorkItemExecution | |
| 734820 | dotnet/runtime | baseservices.threading.WorkItemExecution | dotnet/runtime#104631 |
| 729973 | dotnet/runtime | JIT.Generics.WorkItemExecution | dotnet/runtime#103366 |
| 729670 | dotnet/runtime | Loader.classloader.WorkItemExecution | dotnet/runtime#104330 |
| 729645 | dotnet/runtime | JIT.Methodical.f-iF-I.WorkItemExecution | |
| 728302 | dotnet/runtime | JIT.Directed.WorkItemExecution | |
| 727906 | dotnet/runtime | Loader.classloader.generics.WorkItemExecution | dotnet/runtime#104311 |
| 727715 | dotnet/runtime | JIT.Methodical.eh.WorkItemExecution | dotnet/runtime#103513 |
| 727718 | dotnet/runtime | JIT.Regression.CLR-x86-JIT.V1-M09-M11.WorkItemExecution | dotnet/runtime#103512 |
| 727555 | dotnet/runtime | JIT.jit64.WorkItemExecution | |

Summary

| 24-Hour Hit Count | 7-Day Hit Count | 1-Month Count |
|---|---|---|
| 1 | 2 | 24 |

lonix1 commented 10 months ago

Update: when I run the container as root, the problem disappears.

But even so, that error is misleading.

ghost commented 10 months ago

Tagging subscribers to this area: @mangod9 See info in area-owners.md if you want to be subscribed.

Author: lonix1
Assignees: -
Labels: `area-System.Threading`, `untriaged`, `needs-area-label`
Milestone: -
jkotas commented 10 months ago

cc @kouvel

kouvel commented 10 months ago

This could be similar to https://github.com/dotnet/runtime/issues/80619. If the container gets set up as root or some other user and then a different user uses the container, a workaround may be to delete the /tmp/.dotnet/shm and /tmp/.dotnet/lockfiles directories as root as the last step before switching to a different user. There is a fix targeting the next servicing versions of .NET 7 and .NET 6 that may help if it's the same issue.

> But even so, that error is misleading.

There is an issue for that, which would hopefully be addressed soon: https://github.com/dotnet/runtime/issues/89090
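
For readers who want to try the workaround mentioned above, here is a minimal sketch. It assumes the two directories live under /tmp as stated in this comment. The function name `clean_dotnet_ipc_dirs` and its optional base-directory argument are hypothetical and exist only so the snippet can be tried in a scratch directory first.

```shell
#!/bin/sh
# Sketch of the workaround: remove the runtime's shared-memory and lock-file
# directories so that the next user can recreate them under its own account.
# Hypothetical helper: only the two /tmp/.dotnet paths come from the thread;
# the function name and the base-directory argument are illustrative.
clean_dotnet_ipc_dirs() {
    ipc_base="${1:-/tmp}"   # defaults to /tmp, the location named in the thread
    rm -rf "$ipc_base/.dotnet/shm" "$ipc_base/.dotnet/lockfiles"
}
```

In a Dockerfile this would presumably be the last step run as root, for example `RUN rm -rf /tmp/.dotnet/shm /tmp/.dotnet/lockfiles`, placed immediately before the `USER` instruction that switches to the non-root account.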

lonix1 commented 10 months ago

Thanks. I'll try the new version when it's released to see if it fixes this.

Is it one of those fixes that's released every month or so?

kouvel commented 10 months ago

> Is it one of those fixes that's released every month or so?

Yes, the change should be in the next version that is released, 6.0.23 and 7.0.12.
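
To check whether a given runtime version is at or above the servicing releases named above, a rough sketch follows. The helper name `fix_status` is hypothetical, and it relies on `sort -V` from GNU coreutils. Versions outside the 6.0 and 7.0 lines are reported as `unknown`, because this thread does not say which other releases carry the change.

```shell
#!/bin/sh
# Hypothetical helper: classify a runtime version string against the
# servicing versions quoted in the thread (6.0.23 and 7.0.12).
fix_status() {
    fs_version="$1"
    case "$fs_version" in
        6.0.*) fs_minimum="6.0.23" ;;
        7.0.*) fs_minimum="7.0.12" ;;
        *) echo "unknown"; return 0 ;;   # not covered by the thread
    esac
    # sort -V orders version strings numerically per component; if the
    # minimum sorts first (or ties), the given version is new enough.
    fs_lowest=$(printf '%s\n%s\n' "$fs_minimum" "$fs_version" | sort -V | head -n 1)
    if [ "$fs_lowest" = "$fs_minimum" ]; then
        echo "patched"
    else
        echo "unpatched"
    fi
}
```

For example, `fix_status 7.0.10` reports `unpatched`. The version string could be taken from the output of `dotnet --list-runtimes`.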

carlossanlop commented 10 months ago

@kouvel this might not be a System.Threading specific issue.

I'm also seeing this failure in 7.0 CI runs for JIT HardwareIntrinsics in WASM like this one (completely unrelated):

Microsoft.DotNet.XUnitConsoleRunner v2.5.0 (64-bit .NET 7.0.10)
  Discovering: JIT.HardwareIntrinsics.XUnitWrapper (method display = ClassAndMethod, method display options = None)
  Discovered:  JIT.HardwareIntrinsics.XUnitWrapper (found 2 of 362 test cases)
  Starting:    JIT.HardwareIntrinsics.XUnitWrapper (parallel test collections = on, max threads = 2)
    JIT/HardwareIntrinsics/X86/Aes/Aes_r/Aes_r.sh [FAIL]
      System.IO.IOException: The system cannot open the device or file specified. : 'NuGet-Migrations'
         at System.Threading.Mutex.CreateMutexCore(Boolean initiallyOwned, String name, Boolean& createdNew)
         at System.Threading.Mutex..ctor(Boolean initiallyOwned, String name)
         at NuGet.Common.Migrations.MigrationRunner.Run()
         at Microsoft.DotNet.Configurer.DotnetFirstTimeUseConfigurer.Configure()
         at Microsoft.DotNet.Cli.Program.ConfigureDotNetForFirstTimeUse(IFirstTimeUseNoticeSentinel firstTimeUseNoticeSentinel, IAspNetCertificateSentinel aspNetCertificateSentinel, IFileSentinel toolPathSentinel, Boolean isDotnetBeingInvokedFromNativeInstaller, DotnetFirstRunConfiguration dotnetFirstRunConfiguration, IEnvironmentProvider environmentProvider, Dictionary`2 performanceMeasurements)
         at Microsoft.DotNet.Cli.Program.ProcessArgs(String[] args, TimeSpan startupTime, ITelemetry telemetryClient)
         at Microsoft.DotNet.Cli.Program.Main(String[] args)

      Return code:      1
      Raw output file:      /datadisks/disk1/work/B3D809D6/w/C9020AAE/uploads/Reports/JIT.HardwareIntrinsics/X86/Aes/Aes_r/Aes_r.output.txt
      Raw output:
      BEGIN EXECUTION
      Test Harness Exitcode is : 1
      To run the test:
      > set CORE_ROOT=/datadisks/disk1/work/B3D809D6/p
      > /datadisks/disk1/work/B3D809D6/w/C9020AAE/e/JIT/HardwareIntrinsics/X86/Aes/Aes_r/Aes_r.sh
      Expected: True
      Actual:   False
      Stack Trace:
           at JIT_HardwareIntrinsics._X86_Aes_Aes_r_Aes_r_._X86_Aes_Aes_r_Aes_r_sh()
           at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
           at System.Reflection.MethodInvoker.Invoke(Object obj, IntPtr* args, BindingFlags invokeAttr)
      Output:
        System.IO.IOException: The system cannot open the device or file specified. : 'NuGet-Migrations'
           at System.Threading.Mutex.CreateMutexCore(Boolean initiallyOwned, String name, Boolean& createdNew)
           at System.Threading.Mutex..ctor(Boolean initiallyOwned, String name)
           at NuGet.Common.Migrations.MigrationRunner.Run()
           at Microsoft.DotNet.Configurer.DotnetFirstTimeUseConfigurer.Configure()
           at Microsoft.DotNet.Cli.Program.ConfigureDotNetForFirstTimeUse(IFirstTimeUseNoticeSentinel firstTimeUseNoticeSentinel, IAspNetCertificateSentinel aspNetCertificateSentinel, IFileSentinel toolPathSentinel, Boolean isDotnetBeingInvokedFromNativeInstaller, DotnetFirstRunConfiguration dotnetFirstRunConfiguration, IEnvironmentProvider environmentProvider, Dictionary`2 performanceMeasurements)
           at Microsoft.DotNet.Cli.Program.ProcessArgs(String[] args, TimeSpan startupTime, ITelemetry telemetryClient)
           at Microsoft.DotNet.Cli.Program.Main(String[] args)

        Return code:      1
        Raw output file:      /datadisks/disk1/work/B3D809D6/w/C9020AAE/uploads/Reports/JIT.HardwareIntrinsics/X86/Aes/Aes_r/Aes_r.output.txt
        Raw output:
        BEGIN EXECUTION
        Test Harness Exitcode is : 1
        To run the test:
        > set CORE_ROOT=/datadisks/disk1/work/B3D809D6/p
        > /datadisks/disk1/work/B3D809D6/w/C9020AAE/e/JIT/HardwareIntrinsics/X86/Aes/Aes_r/Aes_r.sh
  Finished:    JIT.HardwareIntrinsics.XUnitWrapper
=== TEST EXECUTION SUMMARY ===

kouvel commented 10 months ago

My understanding was that some of these tests in WASM have a build portion that runs as part of the test run. The exception is being thrown under Microsoft.DotNet.Cli.Program.ConfigureDotNetForFirstTimeUse. It's a decent suspicion currently that it's the same issue, and once the SDK used in the CI is updated to patched versions, hopefully these kinds of issues will disappear.

carlossanlop commented 9 months ago

Not sure what's happening to KnownBuildError today, but it's not tagging all the hits (apparently it's just taking forever, but it works).

Here's another one from today, found in 6.0:

kouvel commented 9 months ago

That's unfortunate. It seems the release/6.0 branch is currently using the latest runtime, which would have the fix. The fix may not help if the permissions issue persists from an earlier run, for example on a VM where an unfixed runtime was used previously; that would be unlikely in a container (unless the permissions issue occurred during container setup). PR https://github.com/dotnet/runtime/pull/92603 added some additional diagnostic info; if this happens in .NET 9 CI, hopefully we'll get more information about what's happening.
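
One way to look for a leftover permissions problem on a VM or in a container is to compare the owner of the two directories mentioned earlier in the thread with the current user. The sketch below is only a diagnostic aid under stated assumptions. The function name `report_dotnet_ipc_owner` is hypothetical, `stat -c` assumes GNU coreutils on Linux, and an owner mismatch is merely a hint, since the actual failure may depend on permission bits rather than ownership.

```shell
#!/bin/sh
# Hypothetical diagnostic: report whether the .dotnet shared-memory and
# lock-file directories are owned by the user running this script.
report_dotnet_ipc_owner() {
    ipc_base="${1:-/tmp}"   # /tmp is the location named in the thread
    ipc_my_uid=$(id -u)
    ipc_status=0
    for ipc_dir in "$ipc_base/.dotnet/shm" "$ipc_base/.dotnet/lockfiles"; do
        if [ ! -d "$ipc_dir" ]; then
            echo "absent $ipc_dir"
            continue
        fi
        # GNU stat: %u prints the numeric user ID of the owner.
        ipc_owner=$(stat -c '%u' "$ipc_dir")
        if [ "$ipc_owner" = "$ipc_my_uid" ]; then
            echo "ok $ipc_dir"
        else
            echo "mismatch $ipc_dir owner=$ipc_owner current=$ipc_my_uid"
            ipc_status=1
        fi
    done
    # Non-zero exit status if any directory belongs to a different user.
    return "$ipc_status"
}
```

Calling `report_dotnet_ipc_owner` with no argument inspects /tmp. A `mismatch` line would be consistent with the scenario described earlier, where one user set up the machine or container and a different user runs `dotnet`.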

carlossanlop commented 8 months ago

@kouvel One more hit in 6.0, now in the branding PR for 6.0.26:

carlossanlop commented 4 months ago

Continues affecting 6.0. Example: https://github.com/dotnet/runtime/pull/99787

Namanl2001 commented 4 months ago

@carlossanlop I'm facing the same issue with 8.0.2. Do you know of any workaround for this? Thanks!

criemen commented 1 week ago

We're seeing

Running command in /home/runner/work/semmle-code/semmle-code/target/codeql-csharp-integration-tests/ql/csharp/ql/integration-tests/all-platforms/cshtml: [dotnet, build]
[2024-07-18 11:48:16] [build-stderr] System.IO.IOException: The system cannot open the device or file specified. : 'NuGet-Migrations'
[2024-07-18 11:48:16] [build-stderr]    at System.Threading.Mutex.CreateMutexCore(Boolean initiallyOwned, String name, Boolean& createdNew)
[2024-07-18 11:48:16] [build-stderr]    at System.Threading.Mutex..ctor(Boolean initiallyOwned, String name)
[2024-07-18 11:48:16] [build-stderr]    at NuGet.Common.Migrations.MigrationRunner.Run(String migrationsDirectory)
[2024-07-18 11:48:16] [build-stderr]    at Microsoft.DotNet.Configurer.DotnetFirstTimeUseConfigurer.Configure()
[2024-07-18 11:48:16] [build-stderr]    at Microsoft.DotNet.Cli.Program.ConfigureDotNetForFirstTimeUse(IFirstTimeUseNoticeSentinel firstTimeUseNoticeSentinel, IAspNetCertificateSentinel aspNetCertificateSentinel, IFileSentinel toolPathSentinel, Boolean isDotnetBeingInvokedFromNativeInstaller, DotnetFirstRunConfiguration dotnetFirstRunConfiguration, IEnvironmentProvider environmentProvider, Dictionary`2 performanceMeasurements)
[2024-07-18 11:48:16] [build-stderr]    at Microsoft.DotNet.Cli.Program.ProcessArgs(String[] args, TimeSpan startupTime, ITelemetry telemetryClient)
[2024-07-18 11:48:16] [build-stderr]    at Microsoft.DotNet.Cli.Program.Main(String[] args)

with 8.0.101 on our CI (linux, ubuntu 22.04, GitHub actions). Is there any progress on this issue?