dotnet / aspnetcore

ASP.NET Core is a cross-platform .NET framework for building modern cloud-based web applications on Windows, Mac, or Linux.
https://asp.net
MIT License
35.26k stars 9.96k forks source link

Application not recovering when Startup throws exception #9202

Closed rogerfar closed 5 years ago

rogerfar commented 5 years ago

We have some test servers that reboot themselves every night. The web server reboots but it up before the SQL server is up. The .NET Core 2.2 application gets automatically started with the "Application Initialization" feature in IIS.

When the application starts automatically it runs some initialization in the Startup.cs Configure method. But because the database is unreachable an exception is thrown:

Application startup exception: System.Data.SqlClient.SqlException (0x80131904): A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server) ---> System.ComponentModel.Win32Exception (1231): The network location cannot be reached. For information about network troubleshooting, see Windows Help
   at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultipleObjectsTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)
   at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal& connection)
   at System.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection)
   at System.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource`1 retry, DbConnectionOptions userOptions)
   at System.Data.ProviderBase.DbConnectionClosed.TryOpenConnection(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource`1 retry, DbConnectionOptions userOptions)
   at System.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource`1 retry)
   at System.Data.SqlClient.SqlConnection.Open()
   at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenDbConnection(Boolean errorsExpected)
   at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.Open(Boolean errorsExpected)
   at Microsoft.EntityFrameworkCore.SqlServer.Storage.Internal.SqlServerDatabaseCreator.<>c__DisplayClass18_0.<Exists>b__0(DateTime giveUp)
   at Microsoft.EntityFrameworkCore.ExecutionStrategyExtensions.<>c__DisplayClass12_0`2.<Execute>b__0(DbContext c, TState s)
   at Microsoft.EntityFrameworkCore.SqlServer.Storage.Internal.SqlServerExecutionStrategy.Execute[TState,TResult](TState state, Func`3 operation, Func`3 verifySucceeded)
   at Microsoft.EntityFrameworkCore.ExecutionStrategyExtensions.Execute[TState,TResult](IExecutionStrategy strategy, Func`2 operation, Func`2 verifySucceeded, TState state)
   at Microsoft.EntityFrameworkCore.ExecutionStrategyExtensions.Execute[TState,TResult](IExecutionStrategy strategy, TState state, Func`2 operation)
   at Microsoft.EntityFrameworkCore.SqlServer.Storage.Internal.SqlServerDatabaseCreator.Exists(Boolean retryOnNotExists)
   at Microsoft.EntityFrameworkCore.SqlServer.Storage.Internal.SqlServerDatabaseCreator.Exists()
   at Microsoft.EntityFrameworkCore.Migrations.HistoryRepository.Exists()
   at Microsoft.EntityFrameworkCore.Migrations.Internal.Migrator.Migrate(String targetMigration)
   at Microsoft.EntityFrameworkCore.RelationalDatabaseFacadeExtensions.Migrate(DatabaseFacade databaseFacade)
   at App.Internal.Data.DataContext.Configure(IApplicationBuilder app) in C:\TeamCity\buildAgent\work\1057292a3d53e562\App.Internal\Data\DataContext.cs:line 36
   at App.Web.Startup.Configure(IApplicationBuilder app, IHostingEnvironment env, ICodeExecutor codeExecutor, IScheduler scheduler, ILogger logger, IOptions`1 config) in C:\TeamCity\buildAgent\work\1057292a3d53e562\App.Web\Startup.cs:line 142
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.AspNetCore.Hosting.ConventionBasedStartup.Configure(IApplicationBuilder app)
   at Microsoft.AspNetCore.Mvc.Internal.MiddlewareFilterBuilderStartupFilter.<>c__DisplayClass0_0.<Configure>g__MiddlewareFilterBuilder|0(IApplicationBuilder builder)
   at Microsoft.AspNetCore.Server.IISIntegration.IISSetupFilter.<>c__DisplayClass4_0.<Configure>b__0(IApplicationBuilder app)
   at Microsoft.AspNetCore.HostFilteringStartupFilter.<>c__DisplayClass0_0.<Configure>b__0(IApplicationBuilder app)
   at Microsoft.AspNetCore.Hosting.Internal.AutoRequestServicesStartupFilter.<>c__DisplayClass0_0.<Configure>b__0(IApplicationBuilder builder)
   at Microsoft.AspNetCore.Hosting.Internal.WebHost.BuildApplication()
ClientConnectionId:00000000-0000-0000-0000-000000000000
Error Number:1231,State:0,Class:20
Hosting environment: Production
Content root path: D:\Sites\App-Test
Now listening on: http://127.0.0.1:36894
Application started. Press Ctrl+C to shut down.

When this happens the application stays unreachable, forever, until you manually restart the site + app pool.

Every time you try to access the app you get:

An error occurred while starting the application.
.NET Core 4.6.27414.05 X64 v4.0.0.0    |   Microsoft.AspNetCore.Hosting version 2.2.0-rtm-35687    |    Microsoft Windows 10.0.14393    |   Need help?

But the log files don't record anything, nor the event viewer.

To Reproduce

Steps to reproduce the behavior:

  1. Using this version of ASP.NET Core 2.2
  2. Create a web application and have it throw an exception in the Configure method, but only the very first time the app starts up (maybe check if a file exists, if it doesn't create it and throw an exception).
  3. Deploy the app to IIS and start it
  4. Browse the first time to the site, an exception will be logged
  5. The consecutive visits won't throw any errors but the site will still be down

Expected behavior

When the application fails to initialize it should retry to initialize every time it's requested.

analogrelay commented 5 years ago

Is the error page still rendered? We intentionally avoid re-running Startup in this case until the app is restarted. If you want us to continue attempting to run startup, you can disable capturing startup errors by calling .CaptureStartupErrors(false) on your WebHostBuilder.

rogerfar commented 5 years ago

I used .CaptureStartupErrors(false) and this morning the site was up this morning, so that seems to have solved it.

The error page was just the standard error: An error occurred while starting the application. .NET Core 4.6.27414.05 X64 v4.0.0.0 | Microsoft.AspNetCore.Hosting version 2.2.0-rtm-35687 | Microsoft Windows 10.0.14393 | Need help?

I realized that my WebHostBuilder is wrapped in a try catch:

            try
            {
                Log.Information("Starting host");

                WebHost.CreateDefaultBuilder(args)
                       .UseStartup<Startup>()
                       .ConfigureAppConfiguration((hostingContext, config) =>
                       {
                           config.SetBasePath(Directory.GetCurrentDirectory());
                           config.AddJsonFile("appsettings.json", false, false);
                       })
                       .CaptureStartupErrors(false)
                       .UseIISIntegration()
                       .UseSerilog()
                       .Build()
                       .Run();
            }
            catch (Exception ex)
            {
                Log.Fatal(ex, "Host terminated unexpectedly");
            }
            finally
            {
                Log.CloseAndFlush();
            }

Where Log is the Serilog logger, but is supposed to log the database, which is unavailable..

How does CaptureStartupErrors behave with the Application Initialization feature in IIS? I'm just wondering what the attempt to start interval would be.

analogrelay commented 5 years ago

CaptureStartupErrors stops the process from crashing when there's an Exception thrown during the execution of the methods in your Startup method, and instead renders a simple error page with the error message (in development mode).

As a result, when CaptureStartupErrors is on, the flow is something like this:

  1. A request comes in from a client
  2. IIS starts the app
  3. Startup throws
  4. ASP.NET Core catches the exception
  5. ASP.NET Core starts a simple server to render a 500 error page with the error
  6. IIS gets the 500

If the client makes another request, it goes through to the server ASP.NET Core set up to render the error page, so IIS doesn't think anything is wrong.

When CaptureStartupErrors is off, the flow is like this:

  1. A request comes in from a client
  2. IIS starts the app
  3. Startup throws
  4. ASP.NET Core does not catch the exception, so the process crashes
  5. IIS returns a 500

Now, if the client makes another request, the underlying ASP.NET Core app isn't running so IIS starts it up again and keeps hitting the error.

Application Initialization (as far as I know, I'm not an IIS expert ;)) just configures IIS to make a request back to itself (at the path you specify) to "prime" your application before letting clients access the app. So, when CaptureStartupErrors is on, the IIS Application Initialization feature seems to get in a loop constantly thinking the app is failing to start (because the captured startup error just keeps causing a 500).

If you're using Application Initialization, I'd suggest one of two paths forward:

  1. Disable CaptureStartupErrors. As long as you're monitoring logs and stack traces from your application, you'll get the startup errors through that so you don't really need this feature anyway

  2. Add retry logic for expected or transient failures during startup (like database access)

I'm going to close this issue because there isn't anything we plan to fix here, but feel free to ask further questions if you need clarification!

rogerfar commented 5 years ago

Thanks @anurse for the write up!

Just as note for who sees this in the future, I'm using Application Initialization` due to hangfire tasks that run in the background.

I'm going to add a self-check mechanism that attempts restarting the app.

Thanks!

brendonparker commented 4 years ago

This was helpful. However, even after I added .CaptureStartupErrors(false) my app was still getting permanently stuck with a HTTP Error 500.30 - ANCM In-Process Start Failure page.

In my case there are some global config files that our app references that aren't there in the image when its starting up, but are put in place further on in the server provisioning process.

My solution was just to recycle the app pool once the script runs that puts those file in place:

Restart-WebAppPool APIPool -ErrorAction SilentlyContinue

That seems to have done the trick.