luigiberrettini / NLog.Targets.Syslog

A Syslog server target for NLog

Syslog logging just randomly stops #311

Open karlra opened 1 year ago

karlra commented 1 year ago

We have a high-traffic system that uses this library to log to syslog.

However, at random times it just... stops. Nothing more gets logged to syslog, even though I know for a fact that the logging methods are still being called. Can you give any guidance on what the problem might be and/or how to debug it?

snakefoot commented 1 year ago

I think you need to capture a minidump from the application and open it in Visual Studio.

Then look at the thread callstacks (for example with the Visual Studio Parallel Stacks window) and determine why logging is getting stuck.

Once you have pinpointed the thread(s) causing the "stop", you can paste their callstacks here for further investigation.

Have you tried upgrading to the latest version?

karlra commented 1 year ago

I'll try that the next time it happens, but it's not easy because our systems have a lot of traffic and a lot of active threads. When I say "stop", I don't mean that the application itself hangs: it keeps processing requests, and there is no growth in memory usage that would indicate queuing. Can you give me some kind of starting point for what I should be looking for in the minidump?

I am using these settings:

target.Enforcement.Throttling.Limit = 10000;
target.Enforcement.ReplaceInvalidCharacters = true;
target.Enforcement.Throttling.Strategy = NLog.Targets.Syslog.Settings.ThrottlingStrategy.Discard;
target.MessageCreation.Facility = NLog.Targets.Syslog.Settings.Facility.Local1;
target.MessageCreation.Rfc = NLog.Targets.Syslog.Settings.RfcNumber.Rfc5424;
target.MessageCreation.Rfc5424.DisableBom = true;

The behaviour suggests that it starts discarding messages, but what could be causing that? I don't think there is anything wrong with our syslog, because other sources (other apps on the same machine) keep writing successfully.

I'll try upgrading to the most recent version in the meantime.
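
As a rough illustration of the suspicion above, the sketch below shows how a bounded queue with a discard policy can drop messages silently once its limit is reached, without blocking callers or growing memory. The `DiscardingQueue` class and its members are hypothetical and are not taken from NLog.Targets.Syslog; how the library actually implements `ThrottlingStrategy.Discard` is an assumption here.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical illustration only; NOT the library's actual implementation.
public sealed class DiscardingQueue
{
    private readonly ConcurrentQueue<string> pending = new ConcurrentQueue<string>();
    private readonly int limit;
    private long discarded;

    public DiscardingQueue(int limit)
    {
        this.limit = limit;
    }

    // Number of messages dropped so far because the queue was full.
    public long Discarded => Interlocked.Read(ref discarded);

    // Number of messages currently waiting to be sent.
    public int Count => pending.Count;

    // Returns false when the message is dropped because the queue is at its limit.
    // The check is approximate under concurrency, which is acceptable for a sketch.
    public bool TryEnqueue(string message)
    {
        if (pending.Count >= limit)
        {
            Interlocked.Increment(ref discarded);
            return false;
        }

        pending.Enqueue(message);
        return true;
    }

    // Called by the (hypothetical) sender loop to take the next message.
    public bool TryDequeue(out string message) => pending.TryDequeue(out message);
}
```

In this sketch, if the consumer stops calling `TryDequeue` (for example because a send never completes), every later `TryEnqueue` returns false. From the outside that would look like logging stopping while the application keeps running and memory stays flat.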

karlra commented 1 year ago

It happened again and I dumped the process. There are no stuck threads, but there are a lot of NLog objects and I don't know why. Also, there are over 10,000 NLog objects in a dictionary on the LOH.

snakefoot commented 1 year ago

If you have a queue of pending NLog LogEvents that have not yet been written, then they will of course stay in memory.

The NLog MruCache is used for caching known object-types and their object-property-names.

You are using the old NLog v4.5.4, which doesn't include this MruCache initial-capacity fix: https://github.com/NLog/NLog/pull/4021

I can only recommend that you upgrade to NLog v4.7.15 (or, even better, NLog v5).

RyanGaudion commented 4 months ago

@snakefoot this has happened multiple times for us too, most recently with NLog 5.2.8 and NLog.Targets.Syslog 7.0.0.

Just as @karlra described, logs stop being sent via syslog but are still written to other targets such as log files.

It seems to occur when a very large log entry is written (normally when serializing a very large object into the log). As per #313, we've tried truncating our logs, but that doesn't seem to work either.

snakefoot commented 4 months ago

@RyanGaudion I know the NLog core, but I don't know much about the Syslog target or your environment. Without details about where threads are stuck in your application, along with memory usage details, it is hard to guess the cause of your problems.
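
One further way to gather details, sketched here as a configuration fragment: NLog's internal logger can be enabled at application startup so that NLog's own diagnostics are written somewhere independent of syslog. The file path is a hypothetical placeholder, and whether the Syslog target reports its send failures through the internal logger is an assumption, not something confirmed in this thread.

```csharp
using NLog;
using NLog.Common;

// Place at application startup, before the first logger is created.
// Debug level is verbose; raise it to Warn once the problem is understood.
InternalLogger.LogLevel = LogLevel.Debug;

// Hypothetical path; choose a location the process can write to.
InternalLogger.LogFile = @"C:\temp\nlog-internal.log";
```

If syslog output stops again, the tail of that file may show whether the target logged an exception or warning around the time of the last delivered message.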