Open solazs opened 4 years ago
dotnet tool restore -v diag
output
dotnet_tool_diag.log
Moving this to the SDK repo since it doesn't seem specific to EF.
This started to happen on our build agents (but not all it seems) two days ago too but for the tool smtp4dev
: The exact same tool restore
then tool run
commands that worked before now fail. And actually it's the same for me locally too.
Did you find a solution @solazs?
Maybe worth noting that I'm talking about a local tool. If I do a complete reinstall of the tool in a new folder with the following commands it's still broken:
dotnet new tool-manifest
dotnet tool install Rnwood.Smtp4dev --version "3.1.0-*"
dotnet tool restore
dotnet tool run smtp4dev --db= "" --smtpport 7366 --urls http://localhost:8394/
@Piedone
Did you find a solution @solazs?
Unfortunately no, we're still impacted and I cannot reproduce the issue reliably. Global install (--global
) might work in other environments, for us it's not feasible.
Thanks for your quick reply. Same here.
I dug through the log and didn't find anything noteworthy. I see you're on *nix, I'm on Windows, so that's one thing that shouldn't make a difference I guess. Although you redacted your user name, can it be that this has to do with paths containing accented characters since we're both Hungarian with such a name? (The temp folder these commands use are under the user folder in Windows, which in my case is "Zoltán".)
Hey,
Nope, username is pretty plain, nothing Hungarian in there, but good thinking, nonetheless :) Project name is simple as well.
Also, we have multiple and supposedly identical agents running this build, but the error does not happen reliably, so I think this issue will have something to do with caches and tool install locations being funny (e.g. left in an inconsistent state).
That can be. It seems that two of our build agents are affected, one isn't, though they're all running in subfolders on the same physical machine (so paths pretty much only differ in a number). No accented paths there anywhere though keyboard/date settings are Hungarian. However, changing the culture doesn't seem to help.
We can reproduce it reliably though.
It works for one of my colleagues but I don't yet see why (same SDK version, same culture, same Windows).
Clearing NuGet caches with dotnet nuget locals all --clear
doesn't resolve the issue BTW.
However, I did solve it: On Windows under %USERPROFILE%.dotnet\toolResolverCache\ you have subfolders with files that refer to the NuGet package cache folder. If those are wrong you'll get this error.
Because it turns out that this was caused by me changing the NuGet package cache folder's path (both in the user-level NuGet.config and in the environment variable). While I supposedly changed the path correctly and everything else worked it did break local tools. The tests failing in our CI environment were actually caused by something else but curiously both the failing tests and them starting failing coincided by the issue surfacing on my machine...
Confirm on the cache clearing not solving the issue. I have tried that prior to opening this issue.
Moreover content of ~/.dotnet/toolResolverCache
is identical on our build agents.
Edit: Identical, meaning it shouldn't cause one build to fail and another to succeed.
$ ls -al ~/.dotnet/toolResolverCache/
total 12
drwxr-xr-x 3 myuser myuser 4096 Apr 9 13:46 .
drwxr-xr-x 5 myuser myuser 4096 Jun 19 16:17 ..
drwxr-xr-x 2 myuser myuser 4096 Apr 9 13:46 1
$ ls -al ~/.dotnet/toolResolverCache/1/
total 12
drwxr-xr-x 2 myuser myuser 4096 Apr 9 13:46 .
drwxr-xr-x 3 myuser myuser 4096 Apr 9 13:46 ..
-rw-r--r-- 1 myuser myuser 713 Jun 25 14:04 dotnet-ef
@solazs Did you ever get to the bottom of this?
We are seeing this same issue on Windows 10 pipeline builds as well. Currently we are doing the following to workaround it:
if (HasEnvironmentVariable("USERPROFILE") && DirectoryExists(EnvironmentVariable("USERPROFILE") + "/.dotnet/toolResolverCache"))
{
DeleteDirectory(EnvironmentVariable("USERPROFILE") + "/.dotnet/toolResolverCache", new DeleteDirectorySettings { Recursive = true, Force = true});
DotNetCoreTool("tool restore");
}
But this kind of defeats the purpose of having a toolResolverCache. If the toolResolverCache can't be done reliably maybe this caching should just be removed from dotnet.
@solazs Did you ever get to the bottom of this?
Nope, we blacklisted the affected build agents from the problematic build. Did not receive complaints since.
I ran into something very similar recently and it came down to the user context that the build agent was running in. This case https://github.com/dotnet/sdk/issues/16165 describes the pit I fell into quite nicely.
This is how the dotnet tool becomes unavailable after dotnet tool restore
in my Azure Devops pipeline:
TL;DR: The tool specific file under <userprofile>\.dotnet\toolResolverCache\1
refers to files or folders that no longer exists.
buildsvc
.UseDotNet
. This changes where the nuget packages/dotnet tool assemblies are stored to the working folder of the agent.dotnet tool restore
. Since there is no corresponding tool described in the c:\users\buildsvc\.dotnet\toolResolverCache\1
folder, the tool is downloaded to the agent working folder (c:\w\1\.nuget\
) and the download location is added to the PathToExecutable
node in the tool specification file.dotnet tool restore
. Now the corresponding tool is described in c:\users\buildsvc\.dotnet\toolResolverCache\1
, so the tool is not downloaded again. The path to the assemblies is pointing to a now empty folder in the agent working directory. Still, the output of the restore command is that the restore was successful.it's proving difficult to find the freshest, most relevant github issue for this bug, but we've seen an enormous uptick in issues with dotnet tool restore
in the past week and especially 48 hours that I believe follow the pattern seen here (or possibly in the various referenced issues):
00:00:58
Installing cake.tool
00:01:02
Tool 'cake.tool' (version '4.0.0') was restored. Available commands: dotnet-cake
00:01:02
Tool 'nuke.globaltool' (version '7.0.6') was restored. Available commands: nuke
00:01:02
Restore was successful.
00:01:03
Running build script...
00:01:03
& dotnet cake "C:\BuildAgent\work\b2fcdbe1759cb61d\Build\build.cake" <args>
00:01:03
Run "dotnet tool restore" to make the "dotnet-cake" command available.
00:01:06
Error message: Run "dotnet tool restore" to make the "dotnet-cake" command available.
we've been running this in CI on ephemeral build agents for months without issue, and only recently began experiencing an uptick in failures like this with no changes at all on the ephemeral build agents. We've explored a variety of updates to the build agents' base VM image to address this that have seemingly only made things worse (i.e. this issue more frequent).
1) is this the best issue in the dotnet repo for tracking restore problems? 2) any guesses why this would fail intermittently? 3) are there any robust work-arounds?
We have an interesting issue on one of our build agents regarding dotnet ef cli tool. The error only occurs on one of our agents, however I cannot figure out what's off.
Running
dotnet tool restore
in my project directory outputsHowever when trying to run dotnet-ef I get an error:
Further technical details
dotnet-ef version: 3.1.4 dotnet --version: 3.1.300 Target framework: .NET Core 3.1.4 Operating system: Ubuntu 18.04.1 LTS
I'll attach a verbose output from dotnet tool restore separately.