MicrosoftDocs / sysinternals

Content for sysinternals.com
http://sysinternals.com
Creative Commons Attribution 4.0 International

Sysmon throwing many SysmonError drop events with EventID of "QUEUE" #813

Open branchnetconsulting opened 4 months ago

branchnetconsulting commented 4 months ago

Across many client sites and a variety of Windows versions where we have Sysmon 15.14 running, we are seeing diverse Sysmon error events (255) with this description: Events dropped from driver queue... This has been happening on dozens of Windows systems, both end-user and server systems, including at least Windows 11 and Server 2019, and perhaps others as well. We have been using Sysmon broadly for years, and prior to upgrading to 15.14 we do not recall ever seeing "Events dropped" complaints from Sysmon. Individual systems frequently produce multi-hour sustained bursts of these error events exceeding 100 per second, and we are concerned that the issue may be degrading system performance. During such a sustained burst of errors, the affected Windows system is seen to produce other, normal Sysmon events at normal low-volume levels.

We use a lightly customized version of the "balanced" config from the Sysmon Modular project.
Any advice on how to diagnose or treat this issue further would be sincerely appreciated.

Kevin Branch

wzr commented 2 months ago

Same here, on v15.12. It seems to happen mostly on idle machines with no user logged in. A high number of the hosts with this error are recently installed laptops that stay plugged into the network, without any human interaction, until the users pick them up.

foxmsft commented 2 months ago

v15.1 is the first version that prints that message. That error is an FYI; before that version, the drops were silently ignored.

wzr commented 2 months ago

The immediate challenge is that it is very chatty. We get upwards of 20M events/day for machines with the issue (~20-30 out of ~6600). In some cases up to 200M events/day. We can adjust the WEC XPath to ignore all EventID 255 in the subscription, but that does create a blind spot for actual Sysmon issues, and it still adds a lot of backpressure in the Sysmon Eventlog.
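One way to narrow the blind spot described above is to suppress only the queue-drop records instead of every EventID 255. The following is a minimal sketch, not a tested subscription: it assumes the Sysmon error event exposes an `ID` data field whose value is `QUEUE` (as the issue title suggests), that the channel is `Microsoft-Windows-Sysmon/Operational`, and that the subscription accepts a `Suppress` element in its query XML. The function name `build_subscription_query` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed channel name for Sysmon events.
CHANNEL = "Microsoft-Windows-Sysmon/Operational"

# Assumption: error events (255) carry a Data field named "ID" set to "QUEUE"
# for the "Events dropped from driver queue" messages.
SUPPRESS_XPATH = "*[System[(EventID=255)] and EventData[Data[@Name='ID']='QUEUE']]"


def build_subscription_query(channel: str = CHANNEL) -> str:
    """Return query XML that selects everything except queue-drop errors."""
    query_list = ET.Element("QueryList")
    query = ET.SubElement(query_list, "Query", {"Id": "0", "Path": channel})
    # Collect all events from the channel...
    ET.SubElement(query, "Select", {"Path": channel}).text = "*"
    # ...but drop the 255/QUEUE records, keeping other 255 errors visible.
    ET.SubElement(query, "Suppress", {"Path": channel}).text = SUPPRESS_XPATH
    return ET.tostring(query_list, encoding="unicode")


if __name__ == "__main__":
    print(build_subscription_query())
```

This only reduces what is forwarded; the events are still written to the local Sysmon event log, so it does not address the backpressure concern.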

foxmsft commented 2 months ago

My initial assumption was that the default event queue size would be "enough for any number of events". I'm not that concerned about it being chatty, as that can be tuned.

I'm more interested in why it's dropping the events, and avoiding it altogether. Can you please send me an email at hotmail? I want to follow up.

branchnetconsulting commented 2 months ago

Our concern is that the massive volume we are seeing of these errors on any given host does not appear to correlate with an abnormally high volume of common Sysmon events on the same host. It's almost like once triggered, Sysmon reporting of this warning message starts thrashing even when the system is not producing enough events to reasonably stress the event queue. Either that or somehow the queue is getting into a broken state where even low levels of incoming messages still get blocked. I would be happy to give further feedback on this. My email is kevin@branchnetconsulting.com.

wzr commented 2 months ago

In my case, it seems that ProcessAccess and RegistryEvent are the busy queues (I assume). These are the numbers across a sample of 100 computers over just 24 hours:

| Description | Count |
| --- | --- |
| Events dropped from driver queue: ProcessAccess:1 | 837351817 |
| Events dropped from driver queue: ProcessAccess:2 | 225073628 |
| Events dropped from driver queue: ProcessAccess:3 | 50324386 |
| Events dropped from driver queue: ProcessAccess:4 | 17270094 |
| Events dropped from driver queue: RegistryEvent:1 | 15456089 |
| Events dropped from driver queue: ProcessAccess:5 | 8366108 |
| Events dropped from driver queue: ProcessAccess:1 RegistryEvent:1 | 7699987 |
| Events dropped from driver queue: ProcessAccess:6 | 5328869 |
| Events dropped from driver queue: ProcessAccess:7 | 3862218 |
| Events dropped from driver queue: ProcessAccess:8 | 2918265 |
| Events dropped from driver queue: ProcessAccess:9 | 2355140 |
| Events dropped from driver queue: ProcessAccess:2 RegistryEvent:1 | 2071065 |
| Events dropped from driver queue: ProcessAccess:10 | 1968974 |
| Events dropped from driver queue: ProcessAccess:11 | 1529965 |
| Events dropped from driver queue: ProcessAccess:12 | 1249597 |
| Events dropped from driver queue: ProcessAccess:13 | 1033853 |
| Events dropped from driver queue: RegistryEvent:2 | 957377 |
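To compare event types directly, the per-description counts above can be folded into a total per type. The sketch below assumes that the number after each type name (for example the `2` in `ProcessAccess:2`) is how many events of that type a single error message reports as dropped; the thread does not confirm this. The helper name `dropped_per_type` is hypothetical, and only a few rows from the list above are included for brevity.

```python
import re
from collections import Counter

# Matches "TypeName:N" pairs such as "ProcessAccess:2" inside a description.
PAIR = re.compile(r"(\w+):(\d+)")

# A subset of the rows above: (description, number of error events seen).
ROWS = [
    ("Events dropped from driver queue: ProcessAccess:1", 837351817),
    ("Events dropped from driver queue: ProcessAccess:2", 225073628),
    ("Events dropped from driver queue: RegistryEvent:1", 15456089),
    ("Events dropped from driver queue: ProcessAccess:1 RegistryEvent:1", 7699987),
]


def dropped_per_type(rows):
    """Sum (per-message drop count x number of messages) for each event type."""
    totals = Counter()
    for description, occurrences in rows:
        for event_type, dropped in PAIR.findall(description):
            # Assumption: N in "Type:N" is the drops reported by one message.
            totals[event_type] += int(dropped) * occurrences
    return totals


if __name__ == "__main__":
    for event_type, total in dropped_per_type(ROWS).most_common():
        print(f"{event_type}: {total}")
```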

[...]

wzr commented 2 months ago

@foxmsft I did send an email to your GitHub username at hotmail. Did it ever reach you?

> My initial assumption was that the default event queue size would be "enough for any number of events". I'm not that concerned about it being chatty, as that can be tuned.
>
> I'm more interested in why it's dropping the events, and avoiding it altogether. Can you please send me an email at hotmail? I want to follow up.