ConSol-Monitoring / snclient

SNClient+ - Cross platform monitoring agent
MIT License

SNClient+ v0.13 (Build: 34e213b) - wmi query failed: wmi: CallMethod ConnectServer failed: Exception occurred. (Not enough storage is available to process this command.) #65

Closed andreaslisondehn closed 8 months ago

andreaslisondehn commented 9 months ago

When executing the checks check_eventlog and check_pagefile, the error message appears after some time:

UNKNOWN - wmi query failed: wmi: CallMethod ConnectServer failed: Exception occurred. (Not enough storage is available to process this command. )

the checks are executed as follows:

check_eventlog -a "filter=source='disk' and level='error'"
check_eventlog -a "filter=source='Resource-Exhaustion-Detector' and id='2004'"
check_pagefile -a "filter=name='total'" "warn=free<5%" "crit=free<5%"

SNClient uses roughly 1.2 GB of memory.

SNClient runs on a Windows Server 2016 Standard with 4 GB of memory.

After a few hours SNClient crashes (how soon depends on the available memory).

sni commented 9 months ago

Could you try the latest nightly? I worked on those two checks recently. Besides that, I'll have a look at the memory usage. It might be related to the WMI queries.

andreaslisondehn commented 8 months ago

Thanks. I've installed SNClient+ v0.13.0079 (Build: 15e6b30). The agent no longer crashes, but after a few minutes the checks return the following errors:

Service checks like (check_service "filter=name in ('BvSshServer')") get the following error: Failed to fetch service list: Not enough storage is available to complete this operation

Event log checks like (check_eventlog "filter=source='disk' and level='error'") and pagefile check (check_pagefile "filter=name='total'" "warn=free<5%" "crit=free<5%") get the following error: UNKNOWN - wmi query failed: wmi: ole.CoInitialize failed: Incorrect function.

check_memory "filter=type='physical'" "warn=free<1%" "crit=free<1%" gets the following error: CHECK_NRPE: Receive header underflow - only 0 bytes received (4 expected).

The log file may provide further information: snclient.log

btw: the memory usage of ~1.2 GB is unchanged

sni commented 8 months ago

That's odd. I started a 2016 server VM and the agent uses about 30-40 MB of RAM (which increases over time, but that's another story). Is this virtual size or resident size? The virtual memory usage might indeed be higher, but that should not affect normal operations.

andreaslisondehn commented 8 months ago

the server runs in a VM

Approx. 10 minutes after starting the SNClient, it uses ~1.2 GB of memory (according to the Windows Task Manager) and the WMI checks (check_eventlog, check_pagefile) fail with:

UNKNOWN - wmi query failed: wmi: CallMethod ConnectServer failed: Exception occurred. (Not enough storage is available to process this command. )

So I installed the latest version (amd64 - SNClient+ v0.13.0080 (Build: d7efae4)). After ~30 minutes the checks were still running without a problem, but the memory consumption grew to over 2.1 GB(!)

Update: by specifying the scan range in check_eventlog, the memory usage is kept within limits (~270 MB after 40 minutes) and the checks do not return an error. I will observe the behavior until tomorrow.

andreaslisondehn commented 8 months ago

After running for around 8.5 hours, the client crashes with the following error messages:

[2023-12-12 22:20:45.934][Error][pid:4584][snclient:1035] system error: %!w(errors.errorString=&{proc: fork/exec argus\managed\Resources_bin\check_logfiles.exe: The paging file is too small for this operation to complete.})
[2023-12-12 22:26:52.225][Error][pid:4584][snclient:1035] system error: %!w(errors.errorString=&{proc: fork/exec C:\Windows\system32\cmd.exe: The paging file is too small for this operation to complete.})
[2023-12-12 22:27:44.733][Error][pid:4584][snclient:1035] system error: %!w(errors.errorString=&{proc: fork/exec argus\managed\Resources_bin\check_logfiles.exe: The paging file is too small for this operation to complete.})
[2023-12-12 22:30:48.639][Error][pid:4584][snclient:1035] system error: %!w(errors.errorString=&{proc: fork/exec argus\managed\Resources_bin\check_logfiles.exe: The paging file is too small for this operation to complete.})
[2023-12-12 22:30:58.256][Error][pid:4584][snclient:1035] system error: %!w(*errors.errorString=&{proc: fork/exec C:\Windows\system32\cmd.exe: The paging file is too small for this operation to complete.})

next try with v0.14

After a few minutes the following errors appear in the log file:

[2023-12-13 08:12:47.683][Info][pid:3092][listener:226] starting nrpe listener on :5666
[2023-12-13 08:15:16.608][Warn][pid:3092][check_eventlog_windows:108] eventlog query failed, file: Windows PowerShell: could not fetch events from subscription: The handle is invalid.
[2023-12-13 08:45:15.672][Warn][pid:3092][check_eventlog_windows:108] eventlog query failed, file: System: could not fetch events from subscription: The handle is invalid.
[2023-12-13 10:02:17.186][Warn][pid:3092][check_eventlog_windows:108] eventlog query failed, file: Windows PowerShell: could not fetch events from subscription: The handle is invalid.
[2023-12-13 10:03:14.878][Warn][pid:3092][check_eventlog_windows:108] eventlog query failed, file: Application: could not fetch events from subscription: The handle is invalid.

and after about 2 hours the memory consumption grew to ~2GB(!)

[Image: Memory]
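
For anyone triaging a similar snclient.log, the excerpts above appear to follow a bracketed layout of timestamp, level, pid, and source:line, followed by the message. This layout is inferred only from the lines quoted in this thread, not from any documented format. Below is a minimal Python sketch that splits such text into entries, even when several entries run together on one line, and counts them per level. The names `ENTRY`, `parse_entries`, and `count_by_level` are hypothetical helpers and are not part of SNClient+.

```python
import re
from collections import Counter

# Assumed layout, inferred from the excerpts in this thread:
# [timestamp][Level][pid:N][source:line] message
ENTRY = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]"
    r"\[(?P<level>\w+)\]"
    r"\[pid:(?P<pid>\d+)\]"
    r"\[(?P<source>[^\]:]+):(?P<line>\d+)\]\s*"
    # The message runs until the next bracketed timestamp or the end of the text.
    r"(?P<msg>.*?)(?=\s*\[\d{4}-\d{2}-\d{2} |\Z)",
    re.DOTALL,
)


def parse_entries(text):
    """Return one dict per log entry, even if several entries share a line."""
    entries = []
    for match in ENTRY.finditer(text):
        entry = match.groupdict()
        entry["pid"] = int(entry["pid"])
        entry["line"] = int(entry["line"])
        entry["msg"] = entry["msg"].strip()
        entries.append(entry)
    return entries


def count_by_level(entries):
    """Count entries per log level (Info, Warn, Error, ...)."""
    return Counter(entry["level"] for entry in entries)


# Two entries copied from the v0.14 excerpt, run together on a single line.
sample = (
    "[2023-12-13 08:12:47.683][Info][pid:3092][listener:226] starting nrpe listener on :5666 "
    "[2023-12-13 08:15:16.608][Warn][pid:3092][check_eventlog_windows:108] eventlog query failed, "
    "file: Windows PowerShell: could not fetch events from subscription: The handle is invalid."
)

entries = parse_entries(sample)
for entry in entries:
    print(entry["ts"], entry["level"], entry["source"], entry["msg"])
```

Grouping the parsed entries by `source` or by message text makes it easier to see which check starts failing first as memory grows.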

sni commented 8 months ago

Could you try the latest release? There had been memory issues when doing WMI or eventlog queries.

andreaslisondehn commented 8 months ago

Hi,

Version v0.15 (Build: a3b9de8) has been running since 2023-12-18 09:17 without any problems, and the memory consumption problem no longer occurs. The client currently uses approx. 40-50 MB. Thank you very much.