influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License

Expose trigger status for Windows services with on demand start #4408

Open lindhor opened 6 years ago

lindhor commented 6 years ago

Feature Request

Windows services may be configured with a trigger, so that Windows stops them when they are inactive and restarts them on demand. These services are shown in Windows with (Trigger Start) after the startup mode. One such example is the DNS Client service (Dnscache).

Proposal:

In the Telegraf plugin win_services it would be good to add trigger info as either

- a tag named triggered on the existing startup_mode metric (preferred), or
- a new metric triggered with value 0 (no trigger) or 1 (has trigger), in addition to state and startup_mode, carrying the same tags as startup_mode (display_name and service_name).

See https://mikefrobbins.com/2015/12/24/use-powershell-to-determine-services-with-a-starttype-of-automatic-or-manual-with-trigger-start/ and https://docs.microsoft.com/en-us/windows/desktop/services/service-trigger-events

Current behavior:

Currently only the state and startup_mode metrics are returned; neither shows whether the service is configured with a trigger.

Desired behavior:

A way to see if a service is configured as Manual (Trigger Start) or Automatic (Trigger Start) and distinguish that from "normal" Manual and Automatic startup mode.

Use case:

Combining the startup_mode and state metrics makes it possible to alert when any service that should be started is not actually started, without having to list each monitored service. This works well for services with startup mode Automatic. It also works for services with startup mode Automatic (Trigger Start), as long as Windows has not stopped them to await an on-demand start. Once it has, monitoring these services gives false alarms, since they are in state stopped with startup mode automatic. It would be good to be able to filter these kinds of services out (and possibly monitor them in other ways).
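The alerting rule described in this use case can be sketched in Go. This is only an illustration: the `serviceSample` type, its `Triggered` field, and the service names are hypothetical and not part of win_services. The numeric values assume that state and startup_mode mirror the Windows constants (stopped = 1, running = 4, automatic = 2, manual = 3).

```go
package main

import "fmt"

// Assumed numeric values, mirroring the Windows service constants.
const (
	stateStopped   = 1 // SERVICE_STOPPED
	stateRunning   = 4 // SERVICE_RUNNING
	startAutomatic = 2 // SERVICE_AUTO_START
	startManual    = 3 // SERVICE_DEMAND_START
)

// serviceSample is a hypothetical, simplified view of one service's metrics.
type serviceSample struct {
	ServiceName string
	State       int
	StartupMode int
	Triggered   bool // hypothetical: the information this issue asks for
}

// shouldAlert reports whether a service that ought to be running is stopped.
// Trigger-start services are skipped because Windows may stop them on purpose.
func shouldAlert(s serviceSample) bool {
	return s.StartupMode == startAutomatic && s.State == stateStopped && !s.Triggered
}

func main() {
	// Illustrative samples only; the names are made up.
	samples := []serviceSample{
		{ServiceName: "example_auto", State: stateStopped, StartupMode: startAutomatic},
		{ServiceName: "example_trigger", State: stateStopped, StartupMode: startAutomatic, Triggered: true},
		{ServiceName: "example_running", State: stateRunning, StartupMode: startAutomatic},
		{ServiceName: "example_manual", State: stateStopped, StartupMode: startManual},
	}
	for _, s := range samples {
		fmt.Printf("%s alert=%t\n", s.ServiceName, shouldAlert(s))
	}
}
```

Without the `Triggered` field, the second sample would be indistinguishable from the first and would raise the false alarm described above.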

powersj commented 2 years ago

Hi,

Sorry no one has gotten back to you.

Is this something you are still interested in seeing?

Thanks!

lindhor commented 2 years ago

Thanks for picking this up! I still think it would be useful for the same reasons I initially described.

reimda commented 2 years ago

It looks like the automatic status of a service is available to query in the module Telegraf uses (x/sys/windows; see the StartAutomatic constant). I expect this won't take a lot of Telegraf code to implement.

powersj commented 3 months ago

Hi,

I have been looking into this, and I do not see anything in the Go Windows API used by the Windows Service plugin that exposes trigger information. The current start type tells you whether the service is boot, system, manual (StartManual), automatic (StartAutomatic), or disabled (StartDisabled), but as you point out, this does not tell you whether it is a trigger start.

The result from the service manager does not seem to provide anything related to trigger information either.

lindhor commented 3 months ago

It seems like it is possible to get trigger info via the Win32 API: https://learn.microsoft.com/en-us/windows/win32/api/winsvc/nf-winsvc-queryserviceconfig2a

A corresponding Python discussion provides more background: https://stackoverflow.com/questions/46916726/python-win32service-getting-triggered-startup-information-for-service

It seems like the Go module x/sys/windows should have some knowledge about this: https://cs.opensource.google/go/x/sys/+/refs/tags/v0.20.0:windows/service.go;l=249

I don’t know much Go, though, so I'm not sure.
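As a rough sketch of what reading that information could involve: the Win32 documentation describes SERVICE_TRIGGER_INFO as starting with a DWORD count of triggers (cTriggers). Assuming the raw buffer has already been obtained on Windows (presumably from QueryServiceConfig2 with the SERVICE_CONFIG_TRIGGER_INFO level, which is not shown or verified here), the count can be decoded in plain Go. The `triggerCount` helper is hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// triggerCount decodes the leading cTriggers DWORD from a raw
// SERVICE_TRIGGER_INFO buffer. It assumes a little-endian layout and does not
// follow the pointer fields that come after the count.
func triggerCount(buf []byte) (uint32, error) {
	if len(buf) < 4 {
		return 0, fmt.Errorf("buffer too short for cTriggers: %d bytes", len(buf))
	}
	return binary.LittleEndian.Uint32(buf[:4]), nil
}

func main() {
	// Fabricated buffer for illustration: cTriggers = 1, followed by padding
	// where the pointer fields would be.
	buf := make([]byte, 24)
	binary.LittleEndian.PutUint32(buf, 1)

	n, err := triggerCount(buf)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("triggers=%d triggered=%t\n", n, n > 0)
}
```

A service would then count as trigger start when the decoded count is greater than zero. That is enough for the proposed 0/1 value, without inspecting the individual triggers.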

powersj commented 3 months ago

Here is the actual call of queryServiceConfig, which is unexported.

We would need to evaluate whether to pull this in tree and make the unsafe call ourselves. To complicate things further, the plugin uses interfaces to aid in testing, which prevents us from directly accessing the service handle that needs to be passed to that function.