serilog-mssql / serilog-sinks-mssqlserver

A Serilog sink that writes events to Microsoft SQL Server and Azure SQL
Apache License 2.0
278 stars 148 forks source link

Timeout interaction with clean shutdown #465

Closed Timovzl closed 1 year ago

Timovzl commented 1 year ago

The timeout feature offers a wonderful trade-off between performance and the risk of losing data in the case of failures. Ideally, we should never lose data in the absence of failures, and I'm wondering if this is already the case.

Let's say some log entries are queued to be saved, and the timeout is running with 3 seconds remaining. Suddenly, clean shutdown is initiated, and all other components of the system take 1 second before allowing the application to terminate. The log timeout would have 2 seconds remaining at that point.

Do the log entries get saved or not?

If the log entries are not getting saved, then I believe we can improve on this!

A quick workaround is to register an IHostedService that asynchronously delays for [whatever your maximum timeout value is] on StopAsync. This works, but will generally delay more than is necessary.

Ideally, the package would solve this out-of-the-box, by registering its own IHostedService. That service could use the same trick, but it could have knowledge of the remaining timeout and whether any log entries are pending at all. This way, it could wait only as long as is beneficial.

Be careful with such an implementation, however: After StopAsync is called for the described IHostedService, it may be called for other IHostedService instances, and there may also be request handlers that are still completing their work. Any of these may result in new log entries being created. A solution I would recommend to catch this edge case, is to disable batched logging as soon as shutdown is initiated. After disabling the feature, we can wait for (or immediately save) any queued log entries that were waiting for an ongoing timeout.

ckadluba commented 1 year ago

If you use Serilog correctly no log entry gets lost in case of a normal process exit or if it receives a TERM signal. You just have to make sure to call Log.CloseAndFlush() on process exit and everything should be fine (https://github.com/serilog-mssql/serilog-sinks-mssqlserver#always-call-logcloseandflush).

Should you have noticed any scenario where log entries would be lost, please post some details ideally with a small sample program so we can analyze the case and determine if there is a problem in your code or a bug in Serilog.

Best regards, Christian

Timovzl commented 1 year ago

@ckadluba Thanks for the info!

I am asking specifically in the case of modern .NET Core/unified applications, where the logical application is run in an IHost. Here, it seems highly counterintuitive to manually call a shutdown command for a class library, such as Log.CloseAndFlush() for Serilog. Is this something we should call manually? I hope Serilog takes care of this.

(Class libraries normally handle their own shutdown logic, by registering an IHostedService.)

jonorossi commented 1 year ago

This is a Serilog design decision, not the SQL Server sink specifically, see https://github.com/serilog/serilog/wiki/Lifecycle-of-Loggers.

Serilog is used in all sorts of .NET applications including command line or applications without an IHost. It is often also used before the bootstrapping of an application so that the host can log, for example Serilog.Extensions.Logging.

Timovzl commented 1 year ago

I have requested clarification on this situation in the Serilog repo.