dotnet / SqlClient

Microsoft.Data.SqlClient provides database connectivity to SQL Server for .NET applications.
MIT License
851 stars 285 forks source link

SNI.dll file locked when ASP.NET (Framework 4.8) App is running in IIS #354

Closed jakenuts closed 3 months ago

jakenuts commented 4 years ago

Describe the bug

The SNI.dll file is loaded/locked by IIS when using a Microsft.Data.SqlClient connection in a ASP.NET 4.8 Framework website but is not unloaded when the app is shutdown. Subsequent attempts to build or deploy the site are blocked by the error Unable to delete file "bin\x86\SNI.dll". Access to the path '...\bin\x86\SNI.dll' is denied. Every other dll in the bin folder can be deleted even while the web app is still running save for this one.

To reproduce

1.) Create an ASP.NET MVC website on the 4.8 Framework 2.) Add the Microsoft.Data.SqlClient package 3.) In the HomeController.Index method open a SqlConnection and execute an arbitrary command

using (var connection = new SqlConnection("<connection string here>"))
{
    connection.Open();
    new SqlCommand("SELECT 1", connection).ExecuteNonQuery();
}

4.) Cause the site to shutdown by modifying the web.config or stopping the debugger. 5.) The SNI.dll file remains locked and cannot be replaced or modified without shutting down IIS or the application pool associated with the website

Expected behavior

The SNI.dll should be unlocked (like the rest of the files in the bin folder) so that a site can be updated through deployment or build.

Further technical details

Microsoft.Data.SqlClient version: v1.1.0 .NET target: Framework 4.8 SQL Server version: SQL Azure Operating system: Windows 10 Professional / Windows Server 2016 Datacenter (Azure VM)

jakenuts commented 4 years ago

The SNI.dll library appears to be loaded in the SNINativeMethodWrapper static initializer but the handle is not kept or freed anywhere else in the code.

 IntPtr pDll = LoadLibrary(localFolder + subfolder + SNI);
 if (pDll == IntPtr.Zero)
 {
    throw new System.ComponentModel.Win32Exception("Failed to load " + localFolder + subfolder + SNI, 
   new System.ComponentModel.Win32Exception(Marshal.GetLastWin32Error()));
}
cheenamalhotra commented 4 years ago

@jakenuts

Could you also verify and confirm if same behaviour is observed with System.Data.SqlClient?

jakenuts commented 4 years ago

I tried removing the SqlClient packages, changing the using statement to using System.Data.SqlClient which creates a SqlConnection from the System.Data 4.0.0 framework assembly and no SNI.dll is copied to the bin folder so no issues there.

I tried installing the System.Data.SqlClient 4.8 package and it does not appear to rely on a SNI.dll for this 4.8 project and all the files in the bin folder can be deleted while the web app is running.

I then replaced that package with Microsoft.Data.SqlClient but kept the using statement as "using System.Data.SqlClient". This copied the SNI.dll to the bin/x86 folder but it is not loaded or locked when executing the command.

Changing the using statement to "using Microsoft.Data.SqlClient" and running the app caused the SNI.dll to be loaded and it is locked until I terminate the IIS process.

jakenuts commented 4 years ago

I tried some alternative setups where the data call DoData() was moved to a .NET Standard 2.0 library as that's the one my project uses. It demonstrated the "could not load file or assembly" error that shows up alot in Google searches, performed as expected when both project referenced System.Data.SqlClient 4.8, but moving to them both the Microsoft.Data.SqlClient replicates the file locking issue above.

[Setup 1]

ASP.NET NET 4.8 Project

.NET Standard 2.0 Library

No SNI dll in bin Calling DoData() from the webapp into the library generates

System.IO.FileNotFoundException: 'Could not load file or assembly 'System.Data.SqlClient, Version=4.6.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a' or one of its dependencies. The system cannot find the file specified.'

[Setup 2]

ASP.NET NET 4.8 Project

.NET Standard 2.0 Library

No SNI dll in bin Calling DoData() from the webapp into the library succeeds, all files can be deleted

[Setup 3]

ASP.NET NET 4.8 Project

.NET Standard 2.0 Library

No SNI dll in bin Calling DoData() from the webapp fails with the 'could not load' exception

[Alternate Setup 4]

ASP.NET NET 4.8 Project

.NET Standard 2.0 Library

No SNI dll in bin Calling DoData() from the webapp fails with the 'could not load' exception

[Alternate Setup 4]

ASP.NET NET 4.8 Project

.NET Standard 2.0 Library

SNI dll copied to bin Calling DoData() succeeds, SNI.dll is locked until IIS process killed

jakenuts commented 4 years ago

Looking at the System.Data.SqlClient repository, there are some references to the necessity of holding onto and freeing the handle from SNI.dll (can't dig up where they load it). It doesn't appear that the handle loaded in Microsoft.Data.SqlClient is kept or later released which might explain why it sticks around.

From LocalDBAPI.Common.cs in System.Data.SqlClient

    //This is copy of handle that SNI maintains, so we are responsible for freeing it - therefore there we are not using SafeHandle
    private static IntPtr s_userInstanceDLLHandle = IntPtr.Zero;

   internal static void ReleaseDLLHandles()
   {
      s_userInstanceDLLHandle = IntPtr.Zero;
      s_localDBFormatMessage = null;
   }
cheenamalhotra commented 4 years ago

Thanks for the details @jakenuts

I just opened up System.Data.SqlClient .NET framework library source code and I didn't see any call to "LoadLibrary" there (since our code is imported from there), here's the only way it's loaded for every API:

[DllImport("SNI.dll", CallingConvention = CallingConvention.Cdecl, EntryPoint = "SNIAddProviderWrapper")]
internal static extern uint SNIAddProvider(SNIHandle pConn, ProviderEnum ProvNum, [In] ref uint pInfo);

Which is also present in our case in SNINativeMethodWrapper.cs

I'm trying to understand why we added LoadLibrary call in there, which seems to be primary difference and cause of this behaviour.

cheenamalhotra commented 4 years ago

Hi @jakenuts

[edit] Turns out it's necessary to load DLLs in Microsoft.Data.SqlClient as SNI.dll is external from NuGet package, unlike System.Data.SqlClient, I'll continue to investigate for possible solution, if any.

cheenamalhotra commented 4 years ago

Hi @jakenuts

I tried creating ASP.NET application to try reproduce issue but I cannot reproduce the problem using steps you mentioned above. I get file in use errors for "Roslyn" instead which I cannot delete, everything else I'm able to clean normally.

Application I tried: WebApplication1.zip

Can you provide me a simple application which you're able to reproduce with if I missed something as I'm new to ASP.NET MVC architecture.

jakenuts commented 4 years ago

Sorry, I should have included that the Roslyn folder cannot be deleted while the app is running. Here is a minimal ASP.NET 4.8 project that demonstrates the issue (at least on my machine). You'll need to run it on IIS or IIS Express to reproduce the locking. I'm also connecting to Azure SQL in my connection string (stupidly posted that earlier, thus the deletion) but have replaced that with the localhost one from yours. Later I'll take a look at the two projects and see if there are any relevant differences.

LockingWebApp.zip

jakenuts commented 4 years ago

Here's another sample app that is even more 'minimal'. It uses owin and creates a connection to a non-existent LocalDb which is enough to use the SNI.dll imports. Once you open the project, restore the packages and run the app it should demonstrate the problem. Replacing the using statement for Microsoft.Data.SqlClient with System.Data.SqlClient resolves the issue.

LockingWebApp.zip

cheenamalhotra commented 4 years ago

Hi @jakenuts

I tried your samples but still cannot reproduce any problem. I apparently have localdb installed, but I also tested with random server name in connection string just to trigger exception case. I don't see any locking problems with SNI.dll.

Just to confirm the steps:

  1. Open application in VS and run with IIS
  2. Stop debugging app
  3. Delete bin and obj folders manually
    • I can delete totally fine!

My Environment details are:

Did I miss something?

P.S. I'm also checking with my team internally if someone can reproduce in their environment.

jakenuts commented 4 years ago

Thanks! I tried loading the project on a different machine and for whatever reason it differs in that "stop debugging" or closing the browser window also closes IISExpress which then releases the DLL. On my machine IISExpress stays active even after I've stopped debugging.

When I tried with "Start without debugging" on the new machine, IISExpress stays active and when you try to rebuild, or if you delete all the rest of the files the SNI.dll remains locked. This replicates what is happening in production where full IIS remains running and once the dll is loaded it cannot be moved or updated until you kill the application pool.

jakenuts commented 4 years ago

I just tried creating an Azure App Service for the sample and published it there. Unfortunately it's way better at detecting changes than my VM deployment so it doesn't try to update the SNI.dll as they match so deployment succeeded without an issue. I was able to go into the console there and confirm that all the files in the bin directory could be deleted save for the SNI.dll though.

cheenamalhotra commented 4 years ago

Hi @jakenuts

Yes I can reproduce the behavior now. The behavior with "Start without Debugging" happens because application is running on IIS in a detached mode and is Active on IIS Server from "bin" directory. What you mean to do is rebuild and redeploy on top of Active application's "bin" directory. SNI.dll is locked in the process since it's a loaded library, from where you pointed above. Since the application is "Active", the DLL will stay loaded there.

If you stop the site, that would free the lock and SNI.dll can be deleted.

image

This seems to be expected behavior with Microsoft.Data.SqlClient.

With System.Data.SqlClient, you would not see this behavior because SNI code is inbuilt in "System.Data.dll" hence there was no external loading done. The Native SNI code in there is shared between all ADO.NET providers. This is not the case with Microsoft.Data.SqlClient as SNI.dll needs to be loaded from an external NuGet package "Microsoft.Data.SqlClient.SNI" in current process.

You can say that SqlClient (for .NET Framework) is incomplete without this referenced "SNI.dll". The reason why it's referenced externally from Microsoft.Data.SqlClient is that SNI native implementation continues to be a closed source, and is not open sourced yet.

cheenamalhotra commented 4 years ago

I was wondering if SNI.dll stayed loaded even after App was Shutdown/Stopped, as mentioned in the first comment, that doesn't seem to be the case.

jakenuts commented 4 years ago

It is a bit of a standout though as you can update/delete every file in a website (save for the roslyn folder) during deployment save for this one. Even after adding an App_Offline.htm file which is part of the standard deployment process the DLL remains locked so the only way to deploy is to stop the site entirely while the files are copied over. It's not a terrible workaround but it does mean that requests will come back with 500(?) responses instead of the App_Offline content which is slightly more polite.

Is it possible that the rest of the bin files are shadow copied somewhere such that they don't appear locked even when they loaded into the IIS process?

cheenamalhotra commented 4 years ago

I'm not an IIS expert, but there seem to be workarounds, with "Touch" tasks, you can try. I got some info from Developer Community Post.

Ideally, as discussed on Stack Overflow, the best approach is:

cheenamalhotra commented 4 years ago

I will close the issue as the behaviour is "By Design".

Thanks!

jakenuts commented 4 years ago

It seems impervious to the App_Offline method, but I can live with the "restart application pool" workaround as I'm hoping to eventually migrate away from the 4.8 legacy bits anyway. I really appreciate your help, thanks, have a great holiday!

vishnu4 commented 4 years ago

I'm surprised this issue is locked. while it might be 'as it worked before' with the previous version of sqlclient, it's not really ideal behavior, right? recycling the app pool isn't always reliable, and additionally I hit this issue frequently on my development box on just compiling.

cheenamalhotra commented 4 years ago

Hi @vishnu4

Firstly, this happens only with apps active or running in IIS, and you must stop the app safely in order to clean the bin directory and recompile.

Secondly, Microsoft.Data.SqlClient (.NET Framework) needs to load native library Microsoft.Data.SqlClient.SNI (SNI.dll) and since it is not part of Full Framework and is an external compoenent it needs to be loaded explicitly to work. It is not unloaded due to the fact that driver cannot function at all without SNI.dll, and as I mentioned above their separation was done for security and legal reasons.

This design is not the same as System.Data.SqlClient, so even if you may think 'it worked before for S.D.S', technically, their designs are very different from each other.

Microsoft.Data.SqlClient (.Net Core) on the other hand also references SNI.dll from runtime native libraries but they're part of .NET Core SDK so you'll not encounter issues there. And since moving forward .NET Core is the primary focus and .NET Framework is not longer active, we recommend users to port over to .NET Core if this issue becomes a big hurdle in development.

vishnu4 commented 4 years ago

I get that it happens when apps are active or running in IIS. i'm not debating that. I understand that the designs are very different as well, i'm not debating that either. I'm just saying my use cases are not edge or unique, and even though i don't have a solution to this issue, i don't think it should be closed or ignored.

The 2 places i see this issue are:

I understand i'm in .net framework, and that is a purgatory that i should try to get myself out of, but the move to use microsoft.data.sqlclient was part of that move for me, but it's actually making the move harder because of these issues. I have a large monolith .net framework app and i am gradually moving projects towards .net standard. This gradual move is what is causing my issue apparently.

So the solution is that i can't move gradually? I have to move everything to .net core immediately, all in one go without doing things in steps? That just doesn't seem realistic, which is the only point i'm trying to make here. This issue is making development more difficult, and the only solution seems to be to move everything BACK to .net framework so i can use System.Data.SQLClient instead, which prevents me from ever really moving to .net core.

.Net core is the primary focus. Great, totally down with that. Shouldn't that mean that aiding users from migrating from .Net Framework to .Net core should be important then?

cheenamalhotra commented 4 years ago

Reopening as we're considering a solution to allow unload of sni.dll - WIP

agevorg commented 4 years ago

Hi @cheenamalhotra , Do you have any ETA for this to be solved? We are waiting this issue to be solved to update our 2.1 version to 3.1.

chumacn commented 4 years ago

is this issue be solved? we are waiting for this. to update to 3.1

chumacn commented 4 years ago

for develop debug condition ,temporary solution:

before you rebuild project,execute the bat,then sni.dll will be unlocked

bat content: c:\windows\system32\inetsrv\AppCmd.exe stop apppool /apppool.name:"your pool name" ping -n 5 127.1>nul c:\windows\system32\inetsrv\AppCmd.exe start apppool /apppool.name:"your pool name"

cheenamalhotra commented 4 years ago

Hi @agevorg @chumacn

We investigated the same and currently the driver design does not support implementing unload behavior since the DLL is loaded in a static class instance. A great deal of changes are needed in the driver to support this, and that hasn't been prioritized yet.

Recommended solution with this design is to undeploy and re-deploy your app in IIS to perform safe unload of SNI.dll.

chumacn commented 4 years ago

Hi @agevorg @chumacn

We investigated the same and currently the driver design does not support implementing unload behavior since the DLL is loaded in a static class instance. A great deal of changes are needed in the driver to support this, and that hasn't been prioritized yet.

Recommended solution with this design is to undeploy and re-deploy your app in IIS to perform safe unload of SNI.dll.

so when update to fix?previous version work well。but the latest version do lock the dll

jimbromley-avanade commented 4 years ago

Hi @cheenamalhotra; Just chiming in that this issue is blocking others as well. As large projects are not always dynamic and able to shift as quickly there is an expectation of status quo from the behavior and deployment process. Please consider this another vote to raise the priority for this issue.

Thanks!

steinmr commented 4 years ago

Hi @cheenamalhotra, just chiming in with our experiences. Our case is slightly different, and harder to work around. We cannot currently move to full .NET Core for this application (it is big) and we cannot use the System.Data.SqlClient.dll either because we want to use EFCore.

We load libraries into separate appdomains, and later unload them to upgrade them without restarting the entire process. But since SNI.dll is locked, we are unable to update those libraries after the appdomain is unloaded. We are using shadow copy, but SNI.dll is not shadow-copied.

I have create a reproduction repo here: https://github.com/steinmr/SNILockedRepro

cheenamalhotra commented 4 years ago

Hi @steinmr @jimbromley-avanade

I understand your concerns. We have been investigating on making this experience better but the problem is the fact that this is Native/Unmanaged DLL that is loaded in the IIS process.

We also managed to remove the LoadLibrary() call (changes will be merged with PR #570) but because this is a Native DLL, once loaded by the application, it stays loaded.

We have limited experience with IIS but how do you normally work with Native DLLs and unload them from process without unloading them from managed code? Have you used solutions involving common directory like C:\Windows\System32\inetsrv outside your bin folder such that your local bin can be safely deleted?

I'm wondering because we're now moving away from strong path LoadLibrary(), and would be able to load DLL if it's loaded in the process from any directory, maybe this could bring a potential solution?

Maybe you can give PR #570 a try with shadow copy if that implies same?

steinmr commented 4 years ago

@cheenamalhotra Than you for your assistance. I tried the PR, but now I can not delete Microsoft.Data.SqlClient.SNI.x64.dll

cheenamalhotra commented 4 years ago

@steinmr

Yes the problem is going to stay if you use it as it is because DLLs are still going to be copied with the driver. As I said earlier, what's changed is we do not LoadLibrary() from strong path in the new design.

Have you tried to move Microsoft.Data.SqlClient.SNI.x64.dll to a different location that's also loaded in IIS, rather than keeping it in the same bin folder?

steinmr commented 4 years ago

@cheenamalhotra Hi, thank you for your continued help, really appreciated.

Just to be clear, I'm not using IIS, this is a simple console app.

We do not currently have any special handling for native DLLs, as far as I know all DLLs are either managed or part of the full .NET framework.

But with the following modifications to my example repo, I was able to "shadow copy" the native DLL, and thus it was loaded from the shadow-copied folder, instead of the bin folder. This means that it fixes my problem. This requires PR #570 But this feels pretty clunky, and I agree with @vishnu4 that such issues will be blockers for moving over to netcore, which is the end goal for all of us I think.

private static AppDomain CreateAppDomain(FileInfo file)
{
    var appDomainSetup = new AppDomainSetup { ShadowCopyFiles = "true" };
    var appDomain = AppDomain.CreateDomain(file.Name, null, appDomainSetup);
    appDomain.AssemblyLoad += AppDomain_AssemblyLoad;
    return appDomain;
}

private static void AppDomain_AssemblyLoad(object sender, AssemblyLoadEventArgs args)
{
    const string lib = "Microsoft.Data.SqlClient";
    if(args.LoadedAssembly.ManifestModule.Name == $"{lib}.dll")
    {
        foreach (var file in new string[] { $"{lib}.SNI.x64.dll", $"{lib}.SNI.x86.dll" })
        {
            var targetFile = Path.Combine(Path.GetDirectoryName(args.LoadedAssembly.Location), file);
            if(!File.Exists(targetFile)) // Shadow copy native DLL
                File.Copy(Path.Combine(AppDomain.CurrentDomain.BaseDirectory, file), targetFile);
        }
    }
}
cheenamalhotra commented 4 years ago

Hi @jakenuts

That sounds great!

Just to clear your doubts, in .NET Core you will not have these issues as there we have runtime native DLLs which are loaded in process from NuGet Package cache folders and there is no transitive copy performed. You can give it a try in example apps and let us know if you face any issues.

These DLLs and issues associated with it are only applicable to .NET Framework and it is this transition phase that I believe is a little difficult to configure but with improvements we are coming with in #570 we hope to make the transition process smoother.

steinmr commented 4 years ago

Thanks @cheenamalhotra

Any plans on adding this to 1.1 in addition to 2.0?

cheenamalhotra commented 4 years ago

Hi @steinmr

This change required good amount of design changes in both SNI packaging and driver consumption and isn't a patch.. so no, backporting wouldn't be possible. This shall be available starting v2.0.

dmberezovskyi commented 4 years ago

Hello @cheenamalhotra, I faced the same issue on IIS. I updated the Microsft.Data.SqlClient and Microsft.Data.SqlClient.SNI to the 2.0.0 version but the Microsoft.Data.SqlClient.SNI.x64 is still locked by the IIS. Do you have any thoughts on why it happened?

ssg commented 4 years ago

This breaks our build pipeline because our unit tests encounter the same problem. We are trying to progressively migrate to .NET Core and migrating to EF Core was one of the planned milestones for that. We have no way to migrate to EF Core without this issue fixed. Is there any workaround?

cheenamalhotra commented 4 years ago

@berezovskii @ssg

The above comment(https://github.com/dotnet/SqlClient/issues/354#issuecomment-632312605) is a workaround to perform shadow copy of DLLs to a different folder and load DLLs from there. That way an App redeployment wouldn't be obstructed with loaded DLLs.

FreshPrinceMayo commented 4 years ago

Hi, faced this issue also.

If anyone is using TFS to deploy their applications I updated IIS Web App Deploy task as a workaround until a better solution is found.

-usechecksum

image

rbiq commented 4 years ago

Hi, is this issue resolved? I'm still receiving this error on ASPNET 4.8 using Microsoft.Data.SqlClient 2.0.1.

cheenamalhotra commented 4 years ago

@rbiq

The only resolution to this issue is to not have an external Native dependency in use at all, which is an ultimate long term goal for us but extremely unlikely in near sight future. Native SNI Dlls are very stable/performant in every way and it's extremely hard to replace them. Even if we start today, it will take a very long time of testing/improvements in order to build trust in customer applications before it's officially replaced, and this effort is not planned yet for .NET Framework. The Managed implementation we have for .NET Core driver is also under constant improvements/testing, but it has it's own challenges to overcome to be comparable with native implementation.

As of possible mitigations in current state, they are detailed in above discussions.

rbiq commented 4 years ago

@cheenamalhotra I guess I can continue to kill w3wp.exe in development environment to overcome the issue. However what i'm really trying to understand is the impact of this in production. Currently we're deploying manually to our product environment which entails pasting the bin folder manually. Will this cause any unforeseen issues in there? Also i believe this was a dependency on Microsoft.Data.SqlClient when I installed it from nuget in a .netstandard 2.0. This netstandard library is referenced in the aspnet 4.8 web app project. So if I replace this lib with good old System.Data.SqlClient will this issue go away? I think EF Core installed in the netstandard should still work with System.Data.SqlClient, amiright?

cheenamalhotra commented 4 years ago

I don't see any issues with manual deployment if it's designed this way. If you want it automated, you'd need some downtime to stop the application - deploy and restart, which is also possible solution. If you downgrade driver to System.Data.SqlClient, it would solve the bin directory issue because the driver is not loaded from there, but from framework itself. But you'll also have to also downgrade EF Core version and I wouldn't recommend doing that just for deployment ease.

The best advisable solution is to load SNI dlls from any another directory other than bin directory by shadow copying them to that directory. That will not block you to redeploy application as native DLLs will not be loaded from that location. The driver loads them to the bin directory as that's the only directory driver is eligible to access, but you can control that and place it to any other location that you don't expect to block you. Do remember to replace the SNI dlls when you upgrade version of Microsoft.Data.SqlClient (if SNI reference also updates), to make sure you're getting all updates from native side too.

rbiq commented 4 years ago

Understood. thanks for the response. I'm currently developing on NET48 classic Webforms aspx and deploying on IIS. To my understanding the AppDomain is more or less managed by IIS and so I'm not sure where I would apply the above steps to launch an appdomain and shadow copy the file. Can you please shed some light on it? thanks for all your help.

matrimsaric commented 4 years ago

This is going to make development irritating for a large team for the next few months :( Thankfully we are just switching to Standard as an intermediate step between moving to Core at some point next year so at least will be able to say 'it's only for a few months hopefully'

I find the best way to clear the lock is kill the w3wp.exe for the relevant user in TaskManager>>Details. Shutting the IIS site or stopping the relevant app pool does not prevent the issue (in our case we build into IIS from VS2019 so are often 'changing' code and re-building so will hit this a lot)

It might be possible to automate this to 'hide' the problem. So I will try to look at that next.

cheenamalhotra commented 4 years ago

@rbiq

It is not our expert area to advise on .NET WebForms and IIS application design. I'd recommend seeking advice on design ideas from experts in IIS application area. You may find some helpful tips from the discussion above.

matrimsaric commented 4 years ago

The cleanest way to drop the process lock was to force IIS reset in one of the project builds. Therefore the devs can essentially ignore.

iisreset /stop iisreset /restart Stops the error occurring and allows the build to occur. Might have secondary advantages in ensuring IIS doesn't have any hangups with previous operations as well.
matrimsaric commented 4 years ago

To avoid our devs having to manually kill the process we have added an iisreset command to one of the project builds which will nicely avoid the problem.

<PropertyGroup>
  <PreBuildEvent>iisreset /stop</PreBuildEvent>
</PropertyGroup>
  <PropertyGroup>
    <PostBuildEvent>
iisreset /restart
    </PostBuildEvent>
  </PropertyGroup>