influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License

Add --watch-config option to Windows Service #14696

Closed: simonwhybrow-cbre closed this issue 7 months ago

simonwhybrow-cbre commented 8 months ago

Use Case

I would like to install the Telegraf agent as a Windows service with the --watch-config option applied, so that the service picks up changes when its config files change.

Expected behavior

The Telegraf service should watch for config changes and apply them as they are made, just as it does when run directly on the command line.

Actual behavior

The --watch-config option works as expected on the command line, but when installing as a Windows Service the option is not applied as expected. Nothing is logged to show that the config watcher has started.

Additional info

No response

powersj commented 8 months ago

"Nothing is logged to show that the config watcher has started."

There is a known issue with logging not showing up when running as a Windows service. Are you seeing that watching the config also fails, or just the absence of messages?

Can you please provide whatever logs you do have, and demonstrate updating a config file?

Thanks

simonwhybrow-cbre commented 8 months ago

I can install the service with the --watch-config flag set to either notify or poll, as below:

telegraf.exe --service install --watch-config poll --config \\emea\data\Shares\UK\IT\EMEA_WindowsInfra_Telegraf_Configs\arcgis\telegraf.conf --config-directory \\emea\data\Shares\UK\IT\EMEA_WindowsInfra_Telegraf_Configs\core --service-name telegraf

The service starts successfully, but as you state nothing is logged to say that the watcher is running.

I then change the config file specified under --config to monitor an additional Windows service.

The original config file snippet for win_services looks like this:

  [[inputs.win_services]]
    ## Names of the services to monitor. Leave empty to monitor all the available services on the host
    service_names = ["ArcGIS Notebook Server"]

I then update it to the following:

  [[inputs.win_services]]
    ## Names of the services to monitor. Leave empty to monitor all the available services on the host
    service_names = ["ArcGIS Notebook Server", "LanmanServer"]

Nothing is logged to show that Telegraf has picked up this change, and no additional metrics are collected for the new service. When checking the server's metrics page in Prometheus, it still shows only the metrics for the original Windows service that was being monitored:

# TYPE win_net_Packets_Sent_persec untyped
win_net_Packets_Sent_persec{host="GBRDCAGSP005",instance="vmxnet3 Ethernet Adapter",objectname="Network Interface"} 2.5812981834709166
# HELP win_services_startup_mode Telegraf collected metric
# TYPE win_services_startup_mode untyped
win_services_startup_mode{display_name="ArcGIS Notebook Server",host="GBRDCAGSP005",service_name="ArcGIS Notebook Server"} 2
# HELP win_services_state Telegraf collected metric
# TYPE win_services_state untyped
win_services_state{display_name="ArcGIS Notebook Server",host="GBRDCAGSP005",service_name="ArcGIS Notebook Server"} 4
# HELP win_swap_Percent_Usage Telegraf collected metric
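
Editor's note: the manual check described above (looking for the new service on the metrics page) can be scripted. The following is a minimal sketch, assuming the metrics page text has already been fetched as a string. The helper names monitored_services and missing_services are hypothetical and are not part of Telegraf or Prometheus; the sample lines are copied from the output quoted above.

```python
import re

# Sample lines copied from the metrics page output quoted in this comment.
SAMPLE = '''\
# HELP win_services_state Telegraf collected metric
# TYPE win_services_state untyped
win_services_state{display_name="ArcGIS Notebook Server",host="GBRDCAGSP005",service_name="ArcGIS Notebook Server"} 4
'''

# Matches label pairs such as service_name="LanmanServer".
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def monitored_services(text, metric="win_services_state"):
    """Return the set of service_name label values reported for `metric`."""
    services = set()
    for line in text.splitlines():
        # Skip comments (# HELP / # TYPE) and samples of other metrics.
        if not line.startswith(metric + "{"):
            continue
        labels = dict(LABEL_RE.findall(line))
        if "service_name" in labels:
            services.add(labels["service_name"])
    return services


def missing_services(text, expected):
    """Return the expected services that have no sample in `text`, sorted."""
    return sorted(set(expected) - monitored_services(text))


if __name__ == "__main__":
    # Hypothetical usage: compare the page against the updated service_names list.
    print(missing_services(SAMPLE, ["ArcGIS Notebook Server", "LanmanServer"]))
```

If the config change had been applied, the list of missing services would be empty once the next collection interval has passed.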

powersj commented 8 months ago

I think this has something to do with the way the service handles CLI options differently. I did not realize this until I was looking at https://github.com/influxdata/telegraf/issues/14144, which is the issue about logging. I need to look deeper, but it is entirely possible that the service path does not even take watching files into account.
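
Editor's note: for readers unfamiliar with what a poll-style config watcher does in general, here is a minimal sketch of the idea. It is an illustration only and is not Telegraf's implementation (Telegraf is written in Go); the class name PollingWatcher and its methods are hypothetical. It assumes a change can be detected by comparing each file's modification time and size between polls.

```python
import os


class PollingWatcher:
    """Detect changes to a fixed set of files by comparing stat results."""

    def __init__(self, paths):
        self.paths = list(paths)
        # Record the initial state so the first poll reports no change.
        self._seen = {p: self._stamp(p) for p in self.paths}

    @staticmethod
    def _stamp(path):
        """Return (mtime_ns, size) for the file, or None if it cannot be read."""
        try:
            st = os.stat(path)
        except OSError:
            return None
        return (st.st_mtime_ns, st.st_size)

    def changed(self):
        """Return the paths whose stamp differs from the previous poll."""
        changed = []
        for p in self.paths:
            stamp = self._stamp(p)
            if stamp != self._seen[p]:
                self._seen[p] = stamp
                changed.append(p)
        return changed
```

A caller would invoke changed() on a timer and reload its configuration whenever the returned list is non-empty. The point relevant to this issue is that such a loop only runs if the code path that starts the process actually creates the watcher.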

simonwhybrow-cbre commented 7 months ago

Hi @powersj, is there any update on this issue or a potential fix? Thanks

powersj commented 7 months ago

I have no update.

powersj commented 7 months ago

@simonwhybrow-cbre,

In 20-30 minutes there will be artifacts attached to https://github.com/influxdata/telegraf/pull/15040. Can you please download a Windows one and give it a try to see if the --watch-config parameter correctly gets passed on?

Thanks

simonwhybrow-cbre commented 7 months ago

@powersj Awesome, will keep an eye out. Thanks

simonwhybrow-cbre commented 7 months ago

@powersj Tested and that is working as expected. Changes are being picked up in config files when they are changed. Thank you for getting a fix in place.

powersj commented 7 months ago

Thank you very much for your patience and for confirming!