nreco / logging

Generic file logger for .NET Core (FileLoggerProvider) with minimal dependencies
MIT License
296 stars 60 forks source link

App deadlock on FileLoggerProvider.WriteEntry #57

Closed JiriZidek closed 1 year ago

JiriZidek commented 1 year ago

We use the file logger in ASP.NET Core .net7 app the simple way by adding in Startup loggingBuilder.AddFile(loggingSection);

We have now face random (hours...week) situation when app gets "stuck". By taking a dump we found that all threads wait in this place - entryQueue.Add(message); in FileLoggerProvider

        internal void WriteEntry(string message)
        {
            if (!entryQueue.IsAddingCompleted)
            {
                try
                {
                    entryQueue.Add(message);
                }
                catch (InvalidOperationException)
                {
                }
            }
        }

image (the 2and 5 threads wait there as well...) But there is no clear indication "why" ? Of course the only way to get over this is to kill app and restart.

We noticed that the FileLoggerProvider is using a blocking collection

private readonly BlockingCollection<string> entryQueue = new BlockingCollection<string>(1024); 

And noticed that there is no "defensive measure" so once anything goes south here (or the loop simply terminates...

        private void ProcessQueue()
        {
            foreach (string item in entryQueue.GetConsumingEnumerable())
            {
                fWriter.WriteMessage(item, entryQueue.Count == 0);
            }
        }

We very likely end up with filled blocking queue, nobody taking items, app stuck....

We noticed that in methods WriteMessage(string message, bool flush) CheckForNewLogFile()

There are no try-catch so it is a matter of time, when something bad happens...

Do you consider it relevant ? Would you mind to add some corrections by some PR/Fork from us? Or you find it so easy to make it by yourself ?

We'd: a) Allow to configure an optional timeout, and changeentryQueue.Add(message);to entryQueue.TryAdd(message,timeout); b) Add some try-catch around manipulation with LogFileWriter

For us losing some part of log is much acceptable then having app deadlocked.

BR Jiri

VitaliyMF commented 1 year ago

We very likely end up with filled blocking queue, nobody taking items, app stuck....

I agree that most likely the reason is full queue because of exception that occurs in ProcessQueue method. It is important to keep app working even if something goes wrong with file logging, so it is definitely should be try-catch here -- if there are no error handler, at least FileLoggerProvider should not block WriteEntry and simply de-queue items (without any processing since write to file fails). Also, catch block can try to use HandleFileError handler if present - in that case app's code can at least handle file write error somehow (and maybe propose a new log file name).

Would you mind to add some corrections by some PR/Fork from us? Or you find it so easy to make it by yourself ?

The root of the problem seems clear, I'll try to propose a bugfix quickly by myself.

We'd: a) Allow to configure an optional timeout, and changeentryQueue.Add(message);to entryQueue.TryAdd(message,timeout); b) Add some try-catch around manipulation with LogFileWriter

I think (a) is not needed if ProcessQueue will guarantee dequeue even if exception occurs in fWriter.WriteMessage.

JiriZidek commented 1 year ago

Great ! Looking forward for update of nuget.

VitaliyMF commented 1 year ago

@JiriZidek I was able to reproduce a deadlock in the unit test, and provided a bugfix - now ProcessQueue behaves in this way:

1) if WriteMessage fails and app's code doesn't propose a new (alternative) file log name via HandleFileError - file logger simply stops processing of log entries, app continue working 2) if HandleFileError proposes a new file name, it tries to use it, and if it also fails file logger also stops processing of log entries (HandleFileError will not be called recursively)

With these changes an unit test cannot reproduce a deadlock anymore.

This is shipped in 1.1.7, you may already try it.

JiriZidek commented 1 year ago

: HandleFileError proposes a new file name

Is it supposed to be full path or just a file name and the path is reused from configuration ?

VitaliyMF commented 1 year ago

Is it supposed to be full path or just a file name and the path is reused from configuration ?

This should be a new filename. In the case of relative path (or just filename) app's working directory is used to resolve a full path.

JiriZidek commented 1 year ago

Just info: The modifications you made are too weak - we decided to reimplement the file handling part in more defensive manner... if you'll be interested, we can make it a fork.

Full crash... within IIS:

Exit on error: System.IO.IOException: The process cannot access the file 'D:\Logs\ReactClient_old.log' because it is being used by another process. at Microsoft.Win32.SafeHandles.SafeFileHandle.CreateFile(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options) at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, Nullable1 unixCreateMode) at System.IO.Strategies.OSFileStreamStrategy..ctor(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, Nullable1 unixCreateMode) at System.IO.Strategies.FileStreamHelpers.ChooseStrategyCore(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, Nullable1 unixCreateMode) at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access) **at NReco.Logging.File.FileLoggerProvider.FileWriter.createLogFileStream(Boolean append) at NReco.Logging.File.FileLoggerProvider.FileWriter.OpenFile(Boolean append) at NReco.Logging.File.FileLoggerProvider.FileWriter..ctor(FileLoggerProvider fileLogPrv) at NReco.Logging.File.FileLoggerProvider..ctor(String fileName, FileLoggerOptions options)** at Microsoft.Extensions.Logging.FileLoggerExtensions.CreateFromConfiguration(IConfiguration configuration, Action1 configure) at Microsoft.Extensions.Logging.FileLoggerExtensions.AddFile(ILoggingBuilder builder, IConfiguration configuration, Action1 configure) at ReactClient.Startup.<ConfigureServices>b__6_1(ILoggingBuilder loggingBuilder) in /home/myagent/_work/1/s/app/ReactClient/Startup.cs:line 67 at Microsoft.Extensions.DependencyInjection.LoggingServiceCollectionExtensions.AddLogging(IServiceCollection services, Action1 configure) at ReactClient.Startup.ConfigureServices(IServiceCollection services) in /home/myagent/_work/1/s/app/ReactClient/Startup.cs:line 65 at System.RuntimeMethodHandle.InvokeMethod(Object target, Void* arguments, Signature sig, Boolean isConstructor) at System.Reflection.MethodInvoker.Invoke(Object obj, IntPtr args, BindingFlags invokeAttr) at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture) at Microsoft.AspNetCore.Hosting.ConfigureServicesBuilder.InvokeCore(Object instance, IServiceCollection services) at Microsoft.AspNetCore.Hosting.GenericWebHostBuilder.UseStartup(Type startupType, HostBuilderContext context, IServiceCollection services, Object instance) at Microsoft.AspNetCore.Hosting.GenericWebHostBuilder.<>c__DisplayClass14_0.b__0(HostBuilderContext context, IServiceCollection services) at Microsoft.Extensions.Hosting.HostBuilder.InitializeServiceProvider() at Microsoft.Extensions.Hosting.HostBuilder.Build() at ReactClient.Program.Main(String[] args) in /home/myagent/_work/1/s/app/ReactClient/Program.cs:line 17

VitaliyMF commented 1 year ago

Just info: The modifications you made are too weak - we decided to reimplement the file handling part in more defensive manner...

Please explain what do you mean with 'too weak' and we can discuss what else should be modified to make FileLogger more stable in case of file-related errors.

Regarding (updated)

Full crash... within IIS:

FileWriter.OpenFile (it is present in the stack trace) code is:

            void OpenFile(bool append) {
                try {
                    createLogFileStream(append);
                } catch (Exception ex) {
                    if (FileLogPrv.HandleFileError!=null) {
                        var fileErr = new FileError(LogFileName, ex);
                        FileLogPrv.HandleFileError(fileErr);
                        if (fileErr.NewLogFileName!=null) {
                            UseNewLogFile(fileErr.NewLogFileName);
                        }
                    } else {
                        throw; // do not handle by default to preserve backward compatibility
                    }
                }
            }

It uses "HandleFileError" when it is defined, however at the time of FileLoggerProvider creation (when log file is opened for the first time) this handler is not initialized yet. To fix this it would be enough to postpone log file open and open it only after initialization.

VitaliyMF commented 1 year ago

@JiriZidek After reviewing code I found that my suggestion was wrong and it looks like file open exception is raised in your code because HandleFileError is not defined at all. In that case this is normal behavior - what else file logger should do if it cannot open log file AND app's custom file errors handler is not defined?..

If this is not the case please provide more details about the issue because it's hard to assume what is wrong simply by the stacktrace.

JiriZidek commented 1 year ago

We'll send you fork info after it is done.

VitaliyMF commented 1 year ago

@JiriZidek it is still unclear what is wrong with the current code. I wasn't able to reproduce any file I/O-related issues that may affect the app when HandleFileError is specified.

My policy in accepting PRs is "Occam's razor": this should be minimal changes to fix something or add a new capability.

JiriZidek commented 1 year ago

Code is worth thousand words... we'll let you know.

AoooR42 commented 11 months ago

Hello @JiriZidek , is there anything new? Were you able to submit your new code? I'm still having the same problem with the latest version of the module. In the meantime I'm using the fileLoggerOpts.HandleFileError solution, but I'm not a fan of it because it creates several files. I imagine there must be a real solution that lets you write to a file that's already open elsewhere, since Visual Studio Code, for example, is capable of opening and modifying a .log file that's already open in another application.

VitaliyMF commented 11 months ago

@AoooR42

I imagine there must be a real solution that lets you write to a file that's already open elsewhere, since Visual Studio Code, for example, is capable of opening and modifying a .log file that's already open in another application.

I'm afraid you have wrong expectations here, because what you mentioned (opening the same file for write from different processes) is about OS file locks and sharing. This is conceptual: you simply cannot write consistently into the same file in parallel (and this is not related to NReco.File.Logging specific at all).

As an author of NReco.Logging.File I don't see any conceptual issues with the existing implementation, all file-related errors may be handled (via "HandleFileError") according to the logic of your application.

AoooR42 commented 11 months ago

I'm sorry, but I don't understand why. You said "This is conceptual: you simply cannot write consistently into the same file in parallel", sorry, but I don't see why it is impossible. If a process writes to a file, it seems logic that other processes can be alerted that there is new content in the file, so if those other processes want also to write to the file, then will write next to the new content (and then every process will again be alerted that there is new content in the file.) It seems to work like this for every software that can open and write to a file : you can use multiple software in parallel, and each of them will "know" the update made from other softwares. I tried to understand your code, it seems the opening of the log file is here: LogFileStream = new FileStream(LogFileName, FileMode.OpenOrCreate, FileAccess.Write); (https://github.com/nreco/logging/blob/4b85805f211d33c9af33a2f8690d3212ddbebaf3/src/NReco.Logging.File/FileLoggerProvider.cs#L219C90-L219C90) It seems that FileStream can be initialized with FileShare.Read or FileShare.ReadWrite, maybe it can solve the issue ? Sources: https://stackoverflow.com/a/63367423/16233618 https://learn.microsoft.com/fr-fr/dotnet/api/system.io.filestream?view=net-8.0#constructors If you can't write in a file used by another process, maybe there is a more global problem with your implementation that the @JiriZidek 's code can solve? 🤔 It seems unlikely that it's absolutely impossible in C# to write to a file that's already open elsewhere 🤔

VitaliyMF commented 11 months ago

You said "This is conceptual: you simply cannot write consistently into the same file in parallel", sorry, but I don't see why it is impossible.

Let's assume that you have 2 processes that write logs to the same file. Writing to a file is not transactional; when app #1 writes a log message to a file, and another app #2 do the same exactly at that time, you easily can get a garbage in your log file.

If you think that specifying FileShare.ReadWrite can be useful for you it is not a problem to add one more option to control FileShare mode -- please check that this really works fine for your purposes.

It seems unlikely that it's absolutely impossible in C# to write to a file that's already open elsewhere

Without special synchronization mechanism writing to the same file from several processes simultaneously is a bad idea. When your .NET apps keeps a log file opened, you can open this file (for "Read"), no problems with that. However, if you want to write to this log file (that is already in use), this has a little sense from my point of view.

JiriZidek commented 11 months ago

Hello @JiriZidek , is there anything new? Were you able to submit your new code?

YES. Curently we use this code. Note we reduced the platforms to .net7 only since we currently need only this. That's why I did not try to do PR yet. Now we test it on few production deployments around 3 weeks and see no problems yet. NReco.Logging.File.zip

JiriZidek commented 11 months ago

If you can't write in a file used by another process, maybe there is a more global problem with your implementation that the @JiriZidek 's code can solve? 🤔 It seems unlikely that it's absolutely impossible in C# to write to a file that's already open elsewhere 🤔

I'm afraid we do not need to WRITE to SAME file from multiple processes. We need to READ the file while its being written by the app, and our code satisifies this.