Cacti / cacti

Cacti ™
http://www.cacti.net
GNU General Public License v2.0

data sources not in poller_item #3428

Closed sysres-dev closed 4 years ago

sysres-dev commented 4 years ago

Describe the bug

The poller cache is not filled with the data sources for a given data query, so the RRD files are not created and no data is polled. The related data sources also appear to have their |query_*| variables unresolved in the data source title, while the same variables resolve fine for the associated graph. From what I have seen in the debug log, Cacti seems to insert those data sources into the poller_item table and then delete them right afterwards:

2020/04/02 10:36:44 - DBCALL DEVEL: SQL Exec: "UPDATE poller_item SET present=0 WHERE local_data_id IN (1104558, 1104559, 1104560, 1104561, 1104562, 1104563, 1104564, 1104565, 1104566, 1104567, 1104568, 1104569, 1104570, 1104571, 1104572, 1104573, 1104574, 1104575, 1104576, 1104577, 1104578, 1104579, 1104580, 1104581, 1104582, 1104583, 1104584, 1104585, 1104586, 1104587, 1104588, 1104589, 1104590, 1104591, 1104592, 1104593, 1104594, 1104595, 1104596, 1104597, 1104598, 1104599, 1104600, 1104601, 1104602, 1104603, 1104604, 1104605)"
2020/04/02 10:36:44 - DBCALL DEVEL: SQL Exec: "DELETE FROM poller_item WHERE present=0 AND local_data_id IN (1104558, 1104559, 1104560, 1104561, 1104562, 1104563, 1104564, 1104565, 1104566, 1104567, 1104568, 1104569, 1104570, 1104571, 1104572, 1104573, 1104574, 1104575, 1104576, 1104577, 1104578, 1104579, 1104580, 1104581, 1104582, 1104583, 1104584, 1104585, 1104586, 1104587, 1104588, 1104589, 1104590, 1104591, 1104592, 1104593, 1104594, 1104595, 1104596, 1104597, 1104598, 1104599, 1104600, 1104601, 1104602, 1104603, 1104604, 1104605)"
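
The two statements in the log form a mark-then-delete pair: the rows are flagged with present=0, and whatever is still flagged is then deleted. The sketch below replays that pair against an in-memory SQLite table as a stand-in for MariaDB. The table is reduced to the two columns visible in the log, and the step that sets present back to 1 between the two statements is an assumption made for illustration, not something taken from the Cacti source.

```python
import sqlite3

def surviving_ids(ids, refreshed):
    """Replay the UPDATE/DELETE pair from the log and return the ids left in poller_item.

    `refreshed` is the (hypothetical) list of ids re-flagged present=1
    between the two statements.
    """
    db = sqlite3.connect(":memory:")
    # Reduced, illustrative schema: only the columns that appear in the log.
    db.execute("CREATE TABLE poller_item (local_data_id INTEGER, present INTEGER DEFAULT 1)")
    db.executemany("INSERT INTO poller_item (local_data_id) VALUES (?)", [(i,) for i in ids])

    marks = ",".join("?" * len(ids))
    db.execute(f"UPDATE poller_item SET present=0 WHERE local_data_id IN ({marks})", ids)
    # Assumed intermediate step: items that are rebuilt successfully get flagged again.
    db.executemany("UPDATE poller_item SET present=1 WHERE local_data_id=?", [(i,) for i in refreshed])
    db.execute(f"DELETE FROM poller_item WHERE present=0 AND local_data_id IN ({marks})", ids)

    rows = db.execute("SELECT local_data_id FROM poller_item ORDER BY local_data_id").fetchall()
    db.close()
    return [r[0] for r in rows]

ids = list(range(1104558, 1104606))  # the id range shown in the log
print(len(surviving_ids(ids, refreshed=[])))       # nothing re-flagged
print(len(surviving_ids(ids, refreshed=ids[:3])))  # three items re-flagged
```

If nothing re-flags the rows in between, every row in the list is deleted, which would match the empty poller cache described here.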

To Reproduce

I'm not sure how to reproduce it exactly, but from what I have observed, it could come from the data template, the data sources, or the way they are generated and inserted into the poller_item table.

I have zipped and attached to the issue the XML files needed to import the data query, data template and graph template into the database, as well as the data query XML file: xmls.zip

I was also able to reproduce the problem on another instance of Cacti with the same version (1.1.38) and operating system (Red Hat 7.6), where I observed the same behavior.

Expected behavior

Data source titles should display the cached SNMP data, and polling should execute.

Screenshots

Broken data source name: image
Fine graph name: image

Additional context

A few things I have tried to work around the problem:

The devices we are querying with this template are both Huawei CloudEngine switches (CE8861, CE6863).

TheWitness commented 4 years ago

This is normal if there are bad indexes found during the Re-index process. Do you have indexes that come and go and get disabled? Please advise. We have recently found that some indexes come and go and that we don't necessarily block those devices from collecting data.

We are looking at a business case to re-introduce old behavior that was making it difficult to track down bad data sources for cases where this behavior would indicate something wrong.

sysres-dev commented 4 years ago

Hello @TheWitness, thank you for your answer. Indexes seem to be correctly detected by the data query, and from what I have seen, the related items on the "create graph for this device" page remain ticked and grayed out after a graph is created for a given graph template. That looks like the usual behavior to me, but I may have misunderstood the request; feel free to ask for further information if needed.

I don't know if it is related, but in the data query XML file I'm using oid_index_parse with OID/REGEXP.

Here is the reindex verbose log :

Total: 0.000000, Delta: 0.000000, Running data query [97].
Total: 0.000000, Delta: 0.000000, Found type = '3' [SNMP Query].
Total: 0.010000, Delta: 0.000000, Found data query XML file at '/var/www/cacti-aao/resource/snmp_queries/huawei_sfp.xml'
Total: 0.010000, Delta: 0.000000, XML file parsed ok.
Total: 0.010000, Delta: 0.000000, <oid_num_indexes> missing in XML file, 'Index Count Changed' emulated by counting oid_index entries
Total: 0.350000, Delta: 0.350000, Executing SNMP walk for list of indexes @ '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4' Index Count: 16
Total: 0.350000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850560' value: 'INJBL2230505'
Total: 0.350000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850561' value: ''
Total: 0.350000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850562' value: 'INJBH2330304'
Total: 0.350000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850563' value: ''
Total: 0.350000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850564' value: ''
Total: 0.350000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850565' value: ''
Total: 0.350000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850566' value: ''
Total: 0.350000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850567' value: ''
Total: 0.350000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850568' value: ''
Total: 0.350000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850569' value: ''
Total: 0.350000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850570' value: ''
Total: 0.350000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850571' value: ''
Total: 0.350000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850572' value: ''
Total: 0.350000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850573' value: ''
Total: 0.350000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850574' value: ''
Total: 0.350000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850575' value: 'INJBH2330595'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850560' results: '16850560'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850561' results: '16850561'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850562' results: '16850562'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850563' results: '16850563'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850564' results: '16850564'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850565' results: '16850565'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850566' results: '16850566'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850567' results: '16850567'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850568' results: '16850568'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850569' results: '16850569'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850570' results: '16850570'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850571' results: '16850571'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850572' results: '16850572'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850573' results: '16850573'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850574' results: '16850574'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850575' results: '16850575'
Total: 0.350000, Delta: 0.000000, Located input field 'hwEntityOpticalVendorSn' [get]
Total: 0.420000, Delta: 0.070000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850560' [value='INJBL2230505']
Total: 0.430000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850561' [value='']
Total: 0.480000, Delta: 0.050000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850562' [value='INJBH2330304']
Total: 0.490000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850563' [value='']
Total: 0.500000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850564' [value='']
Total: 0.510000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850565' [value='']
Total: 0.520000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850566' [value='']
Total: 0.530000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850567' [value='']
Total: 0.540000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850568' [value='']
Total: 0.550000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850569' [value='']
Total: 0.560000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850570' [value='']
Total: 0.570000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850571' [value='']
Total: 0.580000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850572' [value='']
Total: 0.590000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850573' [value='']
Total: 0.600000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850574' [value='']
Total: 0.630000, Delta: 0.030000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850575' [value='INJBH2330595']
Total: 0.630000, Delta: 0.000000, Located input field 'hwEntityBomEnDesc' [get]
Total: 0.630000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.2.1.2.16850560' [value='40GE-1310nm-LC-2000(9um/125um SMF)']
Total: 0.640000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.2.1.2.16850561' [value='']
Total: 0.650000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.2.1.2.16850562' [value='40GE-1310nm-LC-2000(9um/125um SMF)']
Total: 0.660000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.2.1.2.16850563' [value='']
Total: 0.670000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.2.1.2.16850564' [value='']
Total: 0.670000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.2.1.2.16850565' [value='']
Total: 0.680000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.2.1.2.16850566' [value='']
Total: 0.690000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.2.1.2.16850567' [value='']
Total: 0.700000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.2.1.2.16850568' [value='']
Total: 0.710000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.2.1.2.16850569' [value='']
Total: 0.710000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.2.1.2.16850570' [value='']
Total: 0.720000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.2.1.2.16850571' [value='']
Total: 0.730000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.2.1.2.16850572' [value='']
Total: 0.740000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.2.1.2.16850573' [value='']
Total: 0.750000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.2.1.2.16850574' [value='']
Total: 0.750000, Delta: 0.010000, Executing SNMP get for data @ '.1.3.6.1.4.1.2011.5.25.31.1.1.2.1.2.16850575' [value='40GE-1310nm-LC-2000(9um/125um SMF)']
Total: 0.770000, Delta: 0.020000, Update data query sort cache complete
Total: 0.770000, Delta: 0.000000, Updated data query index ordering
Total: 0.780000, Delta: 0.010000, Update re-index cache complete
Total: 0.790000, Delta: 0.010000, Update graph data query cache complete
Total: 0.790000, Delta: 0.000000, Update data source data query cache complete
Total: 0.790000, Delta: 0.000000, Update data query cache complete
Total: 0.790000, Delta: 0.000000, Update poller cache from query complete
Total: 0.790000, Delta: 0.000000, Automation execute data query complete
Total: 0.790000, Delta: 0.000000, Plugin hooks complete
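
In the log above, 13 of the 16 index OIDs return an empty value; only 16850560, 16850562 and 16850575 carry a serial number. A small script along these lines can pull that split out of any verbose query output. It is a sketch that assumes the "Index found at OID" line format shown above; the function name is made up for the example.

```python
import re

# Matches the "Index found" lines of the verbose re-index output.
INDEX_LINE = re.compile(r"Index found at OID: '(?P<oid>[^']+)' value: '(?P<value>[^']*)'")

def split_by_value(log_text):
    """Return (oids_with_value, oids_with_blank_value) from verbose query output."""
    filled, blank = [], []
    for match in INDEX_LINE.finditer(log_text):
        (filled if match.group("value") else blank).append(match.group("oid"))
    return filled, blank

# Three lines copied from the log above.
sample = """\
Total: 0.350000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850560' value: 'INJBL2230505'
Total: 0.350000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850561' value: ''
Total: 0.350000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850562' value: 'INJBH2330304'
"""
filled, blank = split_by_value(sample)
print(len(filled), "with a value,", len(blank), "blank")
```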

TheWitness commented 4 years ago

This is generally a case where, when editing the Data Query, you have not joined the XML fields to the correct Data Source names. See the image below.

image

This is otherwise known as a training issue in most cases.

sysres-dev commented 4 years ago

I agree it's a common mistake to forget it, but it's not the case here. image

You can also check whether the configuration is correct by importing cacti_data_queryhuawei-_sfp-issue.xml from the xmls.zip file.

TheWitness commented 4 years ago

Okay, the next step is to look at the data_local entries. First find the snmp_query_id and host_id of the query and graph. Then run this query:

SELECT * FROM data_local WHERE host_id = ? AND snmp_query_id = ?;

Are there any entries with a blank snmp_index?

sysres-dev commented 4 years ago

I have manually created 3 graphs on this device, all depending on data query 97. When I run this query, I get:

MariaDB [cactiaao]> SELECT * FROM data_local WHERE host_id = 66361 AND snmp_query_id = 97;
Empty set (0.01 sec)

When I run it without specifying a query id:

MariaDB [cactiaao]> SELECT * FROM data_local WHERE host_id = 66361;
+---------+------------------+---------+---------------+------------+
| id      | data_template_id | host_id | snmp_query_id | snmp_index |
+---------+------------------+---------+---------------+------------+
| 1104654 |             1276 |   66361 |             0 |            |
| 1104655 |             1276 |   66361 |             0 |            |
| 1104656 |             1276 |   66361 |             0 |            |
+---------+------------------+---------+---------------+------------+
3 rows in set (0.00 sec)

It seems the data sources are not mapped with their data query.
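
The same check can be scripted. The sketch below loads the three rows from the output above into an in-memory SQLite table (a stand-in for the real MariaDB database, with only the columns shown) and lists the data sources on a host that have no data query or no index attached. The helper name is hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Illustrative copy of the columns shown in the MariaDB output.
db.execute("""CREATE TABLE data_local (
    id INTEGER PRIMARY KEY, data_template_id INTEGER, host_id INTEGER,
    snmp_query_id INTEGER, snmp_index TEXT)""")
# The three rows returned for host 66361.
db.executemany("INSERT INTO data_local VALUES (?, ?, ?, ?, ?)", [
    (1104654, 1276, 66361, 0, ""),
    (1104655, 1276, 66361, 0, ""),
    (1104656, 1276, 66361, 0, ""),
])

def unmapped_data_sources(conn, host_id):
    """Ids of data sources on a host with no data query or no index attached."""
    rows = conn.execute(
        "SELECT id FROM data_local "
        "WHERE host_id = ? AND (snmp_query_id = 0 OR snmp_index = '') ORDER BY id",
        (host_id,))
    return [r[0] for r in rows]

print(unmapped_data_sources(db, 66361))
```

With the rows above, all three data sources are reported, since each has snmp_query_id 0 and an empty snmp_index.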

sysres-dev commented 4 years ago

Hello there, do you need more information or testing?

TheWitness commented 4 years ago

From what the Data Query is coming back with, there are a number of elements in the Data Query that come back with a blank index when you run "Verbose Query". What this translates to is a poorly written Data Query. There was a patch in 1.2.11 that fixed some issues with the regexes. Can you confirm that this is your release? If it is, you should look at your regex and keep adjusting it until you get valid indexes for all the members of the SNMP table.

sysres-dev commented 4 years ago

My Cacti version is 1.1.38; I couldn't find the issue you are describing in 1.2.11.

The data query indexes are parsed and extracted properly.

The data query uses these lines:

<oid_index>.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4</oid_index>
<oid_index_parse>OID/REGEXP:^.*\.2011\..*\.(.*)$</oid_index_parse>

The index is successfully extracted from the OID, according to the index_parse lines:

Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850560' results: '16850560'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850561' results: '16850561'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850562' results: '16850562'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850563' results: '16850563'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850564' results: '16850564'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850565' results: '16850565'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850566' results: '16850566'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850567' results: '16850567'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850568' results: '16850568'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850569' results: '16850569'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850570' results: '16850570'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850571' results: '16850571'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850572' results: '16850572'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850573' results: '16850573'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850574' results: '16850574'
Total: 0.350000, Delta: 0.000000, index_parse at OID: '.1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4.16850575' results: '16850575'
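
The pattern can also be checked outside Cacti. This is a minimal sketch using Python's re module rather than the PHP regex engine Cacti runs, on the assumption that a pattern this simple behaves the same way in both; parse_index is a name made up for the example.

```python
import re

# Pattern copied from the <oid_index_parse> line, without the "OID/REGEXP:" prefix.
INDEX_PARSE = re.compile(r"^.*\.2011\..*\.(.*)$")

def parse_index(oid):
    """Return the captured index for an OID, or None when the pattern does not match."""
    match = INDEX_PARSE.match(oid)
    return match.group(1) if match else None

base = ".1.3.6.1.4.1.2011.5.25.31.1.1.3.1.4"  # <oid_index> from the data query
for suffix in (16850560, 16850561, 16850575):
    print(parse_index(f"{base}.{suffix}"))
```

The capture group takes everything after the last dot, so the result does not depend on whether the value returned at that OID is blank.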

Since the index is extracted from the OID, the blank values in the first part shouldn't matter, unless you can confirm the problem comes from there.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.