dotnet / sdk

Core functionality needed to create .NET Core projects, that is shared between Visual Studio and CLI
https://dot.net/core
MIT License

dotnet global tools failing to uninstall or update on 2.1-sdk-alpine #3838

Open MichaelSimons opened 4 years ago

MichaelSimons commented 4 years ago

@RyanWhite25 commented on Wed Oct 30 2019

## Steps to reproduce the issue

1. Create a basic Dockerfile as below:

        FROM microsoft/dotnet:2.1-sdk-alpine

        RUN dotnet tool install -g Amazon.Lambda.Tools

        CMD ["/bin/sh"]

2. Build a Docker image and run a container:

        docker build

        docker container run -it

3. Attempt to uninstall or update the NuGet package:

        dotnet tool update -g Amazon.Lambda.Tools

        dotnet tool uninstall -g Amazon.Lambda.Tools


## Expected behavior

The package is updated or uninstalled

## Actual behavior

An error message is thrown:

    Tool 'amazon.lambda.tools' failed to update due to the following: Failed to uninstall tool package 'amazon.lambda.tools': Cross-device link


## Additional information (e.g. issue happens only occasionally)

I've run into an issue when uninstalling or updating NuGet packages on microsoft/dotnet:2.1-sdk-alpine when they are baked into a Docker image via a Dockerfile.

I've been able to replicate this on different host machines, and it only seems to occur when the package has been pre-installed on the image. Running the base container interactively and installing the package there allows me to update and uninstall it as expected.

I've used Amazon.Lambda.Tools as an example, but this also seems to occur with several other packages I've tested.

It also seems to occur when using `--tool-path` instead of a global install.

A seemingly similar issue has also been reported on StackOverflow here:
https://stackoverflow.com/questions/57792546/docker-uninstall-dotnet-global-tool-installed-through-dockerfile

I'm specifically seeing this issue when trying to build PowerShell Core AWS Lambda functions using New-AWSPowerShellLambdaPackage, which is failing when it can't update the existing installation of Amazon.Lambda.Tools:
https://github.com/aws/aws-lambda-dotnet/blob/master/PowerShell/Module/Private/_DeploymentFunctions.ps1#L314

Thanks for any information or assistance you can provide with this.

## Output of `docker version`

    Client:
     Version: 18.06.1-ce
     API version: 1.38
     Go version: go1.10.3
     Git commit: e68fc7a215d7133c34aa18e3b72b4a21fd0c6136
     Built: Mon Jul 1 18:51:44 2019
     OS/Arch: linux/amd64
     Experimental: false

    Server:
     Engine:
      Version: 18.06.1-ce
      API version: 1.38 (minimum version 1.12)
      Go version: go1.10.3
      Git commit: e68fc7a/18.06.1-ce
      Built: Mon Jul 1 18:53:20 2019
      OS/Arch: linux/amd64
      Experimental: false


## Output of `docker info`

    Containers: 20
     Running: 1
     Paused: 0
     Stopped: 19
    Images: 28
    Server Version: 18.06.1-ce
    Storage Driver: overlay2
     Backing Filesystem: xfs
     Supports d_type: true
     Native Overlay Diff: true
    Logging Driver: json-file
    Cgroup Driver: cgroupfs
    Plugins:
     Volume: local
     Network: bridge host macvlan null overlay
     Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
    Swarm: inactive
    Runtimes: runc
    Default Runtime: runc
    Init Binary: docker-init
    containerd version: 468a545b9edcd5932818eb9de8e72413e616e86e
    runc version: 69663f0bd4b60df09991c08812a60108003fa340
    init version: fec3683
    Security Options:
     seccomp
      Profile: default
    Kernel Version: 4.14.138-114.102.amzn2.x86_64
    Operating System: Amazon Linux 2
    OSType: linux
    Architecture: x86_64
    CPUs: 1
    Total Memory: 983.7MiB
    Name: ip-172-31-11-191.eu-west-1.compute.internal
    ID: J53Z:IZEG:YQT5:E2KR:SWNZ:JTHZ:QQCE:3ZWZ:VX4M:IDVJ:5W5B:6LG5
    Docker Root Dir: /var/lib/docker
    Debug Mode (client): false
    Debug Mode (server): false
    Registry: https://index.docker.io/v1/
    Labels:
    Experimental: false
    Insecure Registries:
     127.0.0.0/8
    Live Restore Enabled: false



---

@mthalman commented on [Thu Oct 31 2019](https://github.com/dotnet/dotnet-docker/issues/1426#issuecomment-548403123)

I've been able to repro this issue.  I've also confirmed that the same error occurs for both 2.2 and 3.0 SDKs.

@RyanWhite25 - Although this is unrelated to the issue you encountered, we recommend that you update your .NET image references to use the [MCR Docker registry](https://devblogs.microsoft.com/dotnet/net-core-container-images-now-published-to-microsoft-container-registry/).  In your case, `microsoft/dotnet:2.1-sdk-alpine` would be replaced with `mcr.microsoft.com/dotnet/core/sdk:2.1-alpine`. These new tags are all published on [Docker Hub](https://hub.docker.com/_/microsoft-dotnet-core).

---

@MichaelSimons commented on [Thu Oct 31 2019](https://github.com/dotnet/dotnet-docker/issues/1426#issuecomment-548415206)

This issue is also not specific to the `alpine` images; you can reproduce it with Debian as well (e.g. `mcr.microsoft.com/dotnet/core/sdk:3.0`). Given the `Cross-device link` error, I suspect this is related to Docker layering and the overlay mount that unions all of the image layers together. For example, if you install and uninstall the tool within a single layer, everything works fine. Does the `dotnet tool` infrastructure rely on hard links? Moving to the SDK team to investigate.

MichaelSimons commented 4 years ago

CC @wli3

wli3 commented 4 years ago

> Does the `dotnet tool` infrastructure rely on hard links?

It does not. Uninstalling is a simple folder delete.

MichaelSimons commented 4 years ago

Running this scenario with diagnostic output yields the following:

Microsoft.DotNet.ToolPackage.ToolPackageException: Invalid cross-device link
 ---> System.IO.IOException: Invalid cross-device link
   at System.IO.FileSystem.MoveDirectory(String sourceFullPath, String destFullPath)
   at System.IO.Directory.Move(String sourceDirName, String destDirName)
   at ToolPackageUninstaller.<>c__DisplayClass2_1.<Uninstall>b__3()
   at Microsoft.DotNet.Cli.Utils.FileAccessRetrier.RetryOnMoveAccessFailure(Action action)
   at ToolPackageUninstaller.<>c__DisplayClass2_0.<Uninstall>b__0()
   --- End of inner exception stack trace ---
   at ToolPackageUninstaller.<>c__DisplayClass2_0.<Uninstall>b__0()
   at Microsoft.DotNet.Cli.TransactionalAction.<>c__DisplayClass2_0.<Run>b__0()
   at Microsoft.DotNet.Cli.TransactionalAction.Run[T](Func`1 action, Action commit, Action rollback)
   at Microsoft.DotNet.Cli.TransactionalAction.Run(Action action, Action commit, Action rollback)
   at ToolPackageUninstaller.Uninstall(DirectoryPath packageDirectory)
   at Microsoft.DotNet.Tools.Tool.Uninstall.ToolUninstallGlobalOrToolPathCommand.Execute()
Failed to uninstall tool package 'amazon.lambda.tools': Invalid cross-device link

It looks like this is caused by the logic to move the global tool package directory to a staging directory - https://github.com/dotnet/cli/blob/release/3.0.1xx/src/dotnet/ToolPackage/ToolPackageUninstaller.cs#L37

`Directory.Move` does not support moving across mounts on Linux. Per the documentation, it throws an `IOException` when "an attempt was made to move a directory to a different volume."

There is at least one open issue requesting mount support for `Directory.Move` - https://github.com/dotnet/corefx/issues/41734
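
To illustrate the OS-level behavior behind the exception, here is a minimal Python sketch (not the .NET implementation) of a directory move that tries a plain rename first and falls back to copy-then-delete when the kernel reports `EXDEV`, the errno behind "Invalid cross-device link". The function name `move_directory` and the injectable `rename` parameter are assumptions made for illustration; the parameter exists only so the fallback path can be exercised without two real filesystems.

```python
import errno
import os
import shutil


def move_directory(src, dst, rename=os.rename):
    """Move src to dst, falling back to copy-then-delete on EXDEV.

    `rename` is a hypothetical hook so the cross-device path can be
    simulated in a test; normal callers leave it at os.rename.
    """
    try:
        # Fast path: one rename call, atomic within a single filesystem.
        rename(src, dst)
        return "renamed"
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise  # Unrelated failure: do not mask it.
    # Slow path: NOT atomic. A crash here can leave a partial copy at dst.
    shutil.copytree(src, dst, symlinks=True)
    shutil.rmtree(src)
    return "copied"
```

Python's own `shutil.move` takes the same approach for directories, which is why the fallback gives up the atomicity that a single rename provides.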

wli3 commented 4 years ago

@MichaelSimons thank you for looking into it!

So different layers are implemented with file mounts in Docker? It feels like Docker's implementation details are leaking out. I need to think more about it.

wli3 commented 4 years ago

`Directory.Move` is used because it is an atomic operation. That basically means you cannot do an atomic operation between Docker layers, which could be anything.

MichaelSimons commented 4 years ago

@wli3 - Docker creates an overlay mount which merges all of the Docker image layers.

Am I wrong in thinking that the best solution here is for Directory.Move to support mounts (e.g. https://github.com/dotnet/corefx/issues/41734)?

From a priority perspective, the scenario involving uninstalling a dotnet tool that was previously installed in a Docker layer seems like a low priority. The scenario involving upgrading a dotnet tool that was installed in a previous Docker layer is higher in priority but I am not sure how common that would be either. I'm speaking from a scenario perspective here, ignoring the fact that upgrade tool implementation relies on uninstalling the previous version.

wli3 commented 4 years ago

> Am I wrong in thinking that the best solution here is for Directory.Move to support mounts

That would be the best. But I am not sure whether it is possible (or it may just be very hard) to do an atomic move between drives.
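
One common way to approximate this, sketched below in Python under stated assumptions, is to copy the source into a staging directory beside the destination and then rename the finished copy into place. The move as a whole is still not atomic (the source and the copy coexist for a while), but the destination never appears half-populated. The function name `copy_then_publish` and the `.staging-` prefix are hypothetical, and this is not what the `dotnet tool` code does.

```python
import os
import shutil
import tempfile


def copy_then_publish(src, dst):
    """Copy src beside dst, rename the finished copy to dst, then drop src."""
    parent = os.path.dirname(os.path.abspath(dst))
    # Stage next to dst so the final rename stays on one filesystem.
    staging = tempfile.mkdtemp(prefix=".staging-", dir=parent)
    try:
        staged = os.path.join(staging, "payload")
        shutil.copytree(src, staged, symlinks=True)
        # Single rename on the destination side: dst appears all at once.
        os.rename(staged, dst)
    finally:
        # Remove the staging directory whether or not the publish succeeded.
        shutil.rmtree(staging, ignore_errors=True)
    # Reached only after a successful publish; src is untouched on failure.
    shutil.rmtree(src)
```

If the copy or the rename fails, the exception propagates and the source directory is left as it was, which gives a rollback of sorts without requiring a cross-device rename.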