collectd / collectd

The system statistics collection daemon. Please send Pull Requests here!
http://collectd.org

Collectd SNMP CSV file Update Delay #4089

Closed bacchus21 closed 5 months ago

bacchus21 commented 1 year ago

This is a follow-up to my previous issue, https://github.com/collectd/collectd/issues/4088

Expected behavior

Settings in collectd.conf:
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
<Host "XXXXGW1">
  Address "10.XX.XX.1"
  Version 2
  Community "XYZ111111"
  Collect "cisco_cpu" "ifmib_if_octets32" "ifmib_if_octets64" "ifmib_if_packets32" "ifmib_if_packets64" "ifmib_if_errors32" "ifmib_if_errors64" "uptime" "cisco_memory_used" "std_traffic" "allocated_disk" "HW_SW_cpu_Used0" "HW_SW_cpu_Used1"
  Interval 10
  Timeout 10
  Retries 1

////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

Actual behavior

I found that the problem is a delay in how often the CSV file is updated.

/////////////////////////////////////////////////////////////////////////////////////////
[root@collectdsvr001 snmp]# cat if_octets-GigabitEthernet0_0_1-2023-02-09
epoch,ifInOctets,ifOutOctets
1675932575.670,4060126252,2049472838
Thursday, 09 February 2023 **17:49:35.670** +09:00

1675932590.553,4060897476,2049669860
Thursday, 09 February 2023 **17:49:50.553** +09:00
///////////////////////////////////////////////////////////////////////////////////////

The timestamps inside the CSV advance in steps of 10-15 seconds, so samples are being collected at roughly the configured interval. The problem is that the CSV file itself is only updated about every 3-4 minutes, and I am not sure why.
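To put a number on the sampling interval, the gap between consecutive rows can be computed from the `epoch` column. The following is a minimal sketch, assuming the file starts with the `epoch,ifInOctets,ifOutOctets` header shown above; `sample_gaps` is a hypothetical helper name and not part of collectd.

```python
import csv
import io

# Header plus the two data rows copied from the CSV excerpt above.
SAMPLE = """epoch,ifInOctets,ifOutOctets
1675932575.670,4060126252,2049472838
1675932590.553,4060897476,2049669860
"""


def sample_gaps(text):
    """Return the seconds elapsed between consecutive rows of a collectd CSV file."""
    reader = csv.DictReader(io.StringIO(text))
    epochs = [float(row["epoch"]) for row in reader]
    # Pair each timestamp with its successor and take the difference.
    return [round(later - earlier, 3) for earlier, later in zip(epochs, epochs[1:])]


print(sample_gaps(SAMPLE))  # [14.883]
```

For the two rows above the gap is 14.883 seconds, which is above the configured `Interval 10`.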

[root@collectdsvr001 snmp]# clock
Thu 09 Feb 2023 05:34:10 PM JST  -0.776807 seconds
[root@collectdsvr001 snmp]#
[root@collectdsvr001 snmp]# ls -al if_octets-GigabitEthernet0_0_1-2023-02-09
-rw-r--r-- 1 root root 18013 Feb  9 **17:32** if_octets-GigabitEthernet0_0_1-2023-02-09
[root@collectdsvr001 snmp]#
[root@collectdsvr001 snmp]# ls -al if_octets-GigabitEthernet0_0_1-2023-02-09
-rw-r--r-- 1 root root 18013 Feb  9 **17:32** if_octets-GigabitEthernet0_0_1-2023-02-09
[root@collectdsvr001 snmp]#
[root@collectdsvr001 snmp]# clock
Thu 09 Feb 2023 05:34:38 PM JST  -0.778793 seconds
[root@collectdsvr001 snmp]#
[root@collectdsvr001 snmp]# clock
Thu 09 Feb 2023 **05:37:22** PM JST  -0.929790 seconds
[root@collectdsvr001 snmp]#
[root@collectdsvr001 snmp]# ls -al if_octets-GigabitEthernet0_0_1-2023-02-09
-rw-r--r-- 1 root root 18087 Feb  9 **17:36** if_octets-GigabitEthernet0_0_1-2023-02-09
[root@collectdsvr001 snmp]#

Does anyone know why this is happening, or how to reduce the CSV file update delay? Any comments or advice would be highly appreciated.

Steps to reproduce

bacchus21 commented 1 year ago

Does anyone know why this message appears?

Feb 26 19:05:23 collectdsvr001 collectd[25195]: plugin_read_thread: read-function of the `snmp-XXXXGW1' plugin took 17.866 seconds, which is above its read interval (10.000 seconds). You might want to adjust the `Interval' or `ReadThreads' settings.

@pyr, @octo, @tokkee, @mrunge, @jkohen, @sunkuranganath, @kwiatrox, @bkjg: please advise. Thank you in advance.

eero-t commented 1 year ago

From: https://github.com/collectd/collectd/blob/main/src/collectd-snmp.pod

Because querying a host via SNMP may produce a timeout the "complex reads" polling method is used. The ReadThreads parameter in the main configuration influences the number of parallel polling jobs which can be undertaken. If you expect timeouts or some polling to take a long time, you should increase this parameter. Note that other plugins also use the same threads.

Some of your hosts are slow to respond.

(For example, if the network connection is not working, there could be a 30 s timeout.)

octo commented 5 months ago

@eero-t's answer is probably correct. If a host takes more than 10 s to query, you have to decrease the query frequency, i.e. raise `Interval` for that host. If slow targets starve the thread pool, add more threads via `ReadThreads`.
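As a rough sketch of what that advice could look like in collectd.conf: the values `ReadThreads 10` and `Interval 30` are illustrative assumptions, not recommendations from this thread, and would need tuning against the read times actually reported in the log. The `Collect` list is shortened here, and the matching `<Data>` blocks are assumed to be defined elsewhere in the configuration.

```
# Global option: size of the read-thread pool, shared by all plugins
# (collectd's default is 5).
ReadThreads 10

<Plugin snmp>
  <Host "XXXXGW1">
    Address "10.XX.XX.1"
    Version 2
    Community "XYZ111111"
    Collect "cisco_cpu" "uptime"
    # Poll less often than the slowest observed read (17.866 s in the log above).
    Interval 30
    Timeout 10
    Retries 1
  </Host>
</Plugin>
```

Raising `Interval` only for the slow host keeps faster hosts on their shorter polling cycle, while `ReadThreads` affects every plugin because the pool is shared.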