dotnet / aspnetcore

ASP.NET Core is a cross-platform .NET framework for building modern cloud-based web applications on Windows, Mac, or Linux.
https://asp.net
MIT License

ANCM crashes with shadow copy enabled and the new deployment deleted a directory #48233

Closed: someguy20336 closed this issue 10 months ago

someguy20336 commented 1 year ago

Is there an existing issue for this?

Describe the bug

I am using the shadow copy feature in .NET 7. After we deployed a change that deleted a directory under our wwwroot folder, the ANCM crashed with the error below. The entire site was down with a 500 error until I added the empty directory back.

Application '/LM/W3SVC/6/ROOT' with physical root 'C:\Site\' failed to load coreclr. Exception message:
Unexpected exception: directory_iterator::directory_iterator: The system cannot find the path specified.: "C:\Site\wwwroot\lib"

However, I was able to eliminate the issue altogether for subsequent deployments that might delete directories by using the largely undocumented cleanShadowCopyDirectory setting found in #28357.

So it seems as if the ANCM is iterating on the shadow copied directory somewhere, but not checking if the folder actually exists in the newly deployed application.

I am not a C++ developer nor have I had a chance to actually run this code from source, but I have a hunch it is something to do with Environment::CheckUpToDate, which appears to be iterating directories without checking for existence. If I am reading it correctly:
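To illustrate the failure mode described above (this is not the ANCM source, only a minimal C++17 sketch): the throwing constructor of std::filesystem::directory_iterator raises a filesystem_error when the path does not exist, which matches the "directory_iterator::directory_iterator" text in the log. The std::error_code overloads report the problem instead of throwing. The helper name CountEntriesSafely is hypothetical.

```cpp
#include <cstddef>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not ANCM code): counts the entries directly under
// `dir`, returning 0 instead of throwing when the directory is missing.
std::size_t CountEntriesSafely(const fs::path& dir)
{
    std::error_code ec;

    // The error_code overload reports a missing path through `ec`
    // rather than throwing std::filesystem::filesystem_error.
    fs::directory_iterator it(dir, ec);

    std::size_t count = 0;
    const fs::directory_iterator end;
    while (!ec && it != end)
    {
        ++count;
        it.increment(ec); // non-throwing advance
    }
    return count;
}
```

With this shape, a directory that was deleted by the new deployment yields an empty result rather than an unhandled exception during startup.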

Expected Behavior

If a directory is deleted in the deployed app, shadow copy should handle that and not crash.

Steps To Reproduce

I can try to put together a sample repo if needed, but it seems to be as simple as that.

Exceptions (if any)

directory_iterator::directory_iterator: The system cannot find the path specified.: <path to deleted directory>

.NET Version

2.2.200 (but this is on a server that only needs the runtime)

Anything else?

(Note: this is our IIS Server and we don't keep the SDK up to date, we just install the runtime. I don't feel like the SDK really has an impact here anyway - but let me know if I am misunderstanding that.)

.NET Core SDK (reflecting any global.json):
  Version: 2.2.300
  Commit: 73efd5bd87

Runtime Environment:
  OS Name: Windows
  OS Version: 10.0.17763
  OS Platform: Windows
  RID: win10-x64
  Base Path: C:\Program Files\dotnet\sdk\2.2.300\

Host:
  Version: 7.0.2
  Architecture: x64
  Commit: d037e070eb

.NET SDKs installed:
  2.2.300 [C:\Program Files\dotnet\sdk]

.NET runtimes installed:
  Microsoft.AspNetCore.All 2.2.5 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.2.5 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 3.1.14 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 5.0.5 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 6.0.9 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 7.0.2 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 2.2.5 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 3.1.14 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 5.0.5 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 6.0.9 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 7.0.2 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]

Other architectures found: x86 [C:\Program Files (x86)\dotnet] registered at [HKLM\SOFTWARE\dotnet\Setup\InstalledVersions\x86\InstallLocation]

Environment variables: Not set

global.json file: Not found

Learn more: https://aka.ms/dotnet/info

Download .NET: https://aka.ms/dotnet/download

amcasey commented 1 year ago

Do you want to take a look at this one, @mgravell? It's probably not related to #48296, but it's in the same vicinity.

imranbaloch commented 1 year ago

Thanks for the hint @someguy20336, I was scratching my head over why my site suddenly stopped working when using Shadow Copy Deployment. It is so annoying.

DanDiplo commented 1 year ago

@someguy20336 What value are you setting cleanShadowCopyDirectory to that prevents this error?

someguy20336 commented 1 year ago

Should just be true - you can see it used in #28357

Edit: ok yea they show false in that PR, but I am using true to do the clean

someguy20336 commented 1 year ago

Coming back in here for 2 things:

EDIT: removing the first one. Turns out to have been a permissions issue, though a little weird. We (allegedly) added the app pool user to a group that should have had full access to the shadow folder, but the site was hard crashing on startup. I could not personally verify this user was in the group, but I was able to add the app pool user directly to the shadow copy folder and it worked. So chalking it up to a fluke, but maybe I am overlooking some other permission issue. Don't really care at this point.

Second - any update on a possible fix plan?

BrennanConroy commented 11 months ago

You were 100% right about where the issue was. I assume it was a typo that we swapped current directory on every recursion 😆

PR open to fix the issue.