
Boost - confuses indexes #5771

Open gadzet21 opened 2 weeks ago

gadzet21 commented 2 weeks ago

Hi, after updating Cacti from 1.2.17 to 1.2.27, I noticed strange jumps in the charts, and only when a given chart is opened. Interestingly, it does not happen everywhere (not with every template).

To check whether the updated Cacti itself was error-free, I installed a fresh copy of Cacti 1.2.27 on a separate server (Debian): Spine, boost enabled, standard plugins imported, localhost added. The error is the same and can be seen in the Main Poller - Collector Items graph. If no one is logged in to Cacti, data is collected correctly, as can be seen below: (screenshot)

But if I log in and view the graph, the indexes change places! (screenshot)

And the headers: (screenshot)

The chart itself: (screenshot)

Logs:

2024-06-13 09:29:02 - SPINE: Poller[Main Poller] PID[138801] PT[140017500419776] Device[Localhost] HT[1] DS[Main Poller - Collector Items] Graphs[Main Poller - Collector Items] TT[0.21] SS[0] SERVER: /usr/share/cacti/site/scripts/ss_cpoller.php ss_cpoller 'get' 'getSNMP' '1', output: 0
2024-06-13 09:29:02 - SPINE: Poller[Main Poller] PID[138801] PT[140017500419776] Device[Localhost] HT[1] DS[Main Poller - Collector Items] Graphs[Main Poller - Collector Items] TT[0.19] SS[0] SERVER: /usr/share/cacti/site/scripts/ss_cpoller.php ss_cpoller 'get' 'getScriptServer' '1', output: 16
2024-06-13 09:29:02 - SPINE: Poller[Main Poller] PID[138801] PT[140017500419776] Device[Localhost] HT[1] DS[Main Poller - Collector Items] Graphs[Main Poller - Collector Items] TT[0.45] SS[0] SERVER: /usr/share/cacti/site/scripts/ss_cpoller.php ss_cpoller 'get' 'getScript' '1', output: 3
...
2024-06-13 10:03:38 - BOOST The RRDpath template is getScript:getScriptServer:getSNMP
2024-06-13 10:03:38 - BOOST The RRDpath is /usr/share/cacti/site/rra/127_0_0_1_getsnmp_23.rrd
2024-06-13 10:03:38 - BOOST Local Data ID: 23, Sorted Results: 9
2024-06-13 10:03:38 - BOOST Local Data ID: 23, Archive Results: 0, Boost Results: 9
2024-06-13 10:03:38 - BOOST The RRDpath template is uactive:usleeping
2024-06-13 10:03:38 - BOOST The RRDpath is /usr/share/cacti/site/rra/127_0_0_1_active_10.rrd
2024-06-13 10:03:38 - BOOST The RRDpath template is script:script_server:snmp
2024-06-13 10:03:38 - BOOST The RRDpath is /usr/share/cacti/site/rra/127_0_0_1_snmp_4.rrd
2024-06-13 10:03:38 - BOOST The RRDpath template is active:garbage:sleeping
2024-06-13 10:03:38 - BOOST The RRDpath is /usr/share/cacti/site/rra/127_0_0_1_active_9.rrd
2024-06-13 10:03:38 - BOOST Local Data ID: 10, Sorted Results: 4
2024-06-13 10:03:38 - BOOST Local Data ID: 10, Archive Results: 0, Boost Results: 4
2024-06-13 10:03:38 - BOOST Local Data ID: 4, Sorted Results: 3
2024-06-13 10:03:38 - BOOST Local Data ID: 4, Archive Results: 0, Boost Results: 3
2024-06-13 10:03:38 - BOOST Local Data ID: 9, Sorted Results: 3
2024-06-13 10:03:38 - BOOST Local Data ID: 9, Archive Results: 0, Boost Results: 3
2024-06-13 10:03:38 - BOOST The RRDpath template is hosts:time
2024-06-13 10:03:38 - BOOST The RRDpath is /usr/share/cacti/site/rra/127_0_0_1_hosts_5.rrd
2024-06-13 10:03:38 - BOOST Local Data ID: 5, Sorted Results: 3
2024-06-13 10:03:38 - BOOST Local Data ID: 5, Archive Results: 0, Boost Results: 3
2024-06-13 10:03:38 - BOOST The RRDpath template is invalid:valid
2024-06-13 10:03:38 - BOOST The RRDpath is /usr/share/cacti/site/rra/127_0_0_1_active_8.rrd
2024-06-13 10:03:38 - BOOST Local Data ID: 8, Sorted Results: 3
2024-06-13 10:03:38 - BOOST Local Data ID: 8, Archive Results: 0, Boost Results: 3
2024-06-13 10:03:38 - BOOST The RRDpath template is Processes:Threads
2024-06-13 10:03:38 - BOOST The RRDpath is /usr/share/cacti/site/rra/127_0_0_1_rrdsprocessed_6.rrd
2024-06-13 10:03:38 - BOOST Local Data ID: 6, Sorted Results: 3
2024-06-13 10:03:38 - BOOST Local Data ID: 6, Archive Results: 0, Boost Results: 3

I suspect that boost, when fetching data from the database, assigns the indexes incorrectly.

Database dump of the poller_output_boost table: (screenshot)

Spine log:

2024-06-13 11:57:25 - SPINE: Poller[1] PID[144376] PT[140352935687872] Device[1] HT[1] DS[23] TT[0.71] SS[0] SERVER: /usr/share/cacti/site/scripts/ss_cpoller.php ss_cpoller 'get' 'getScript' '1', output: 3
2024-06-13 11:57:25 - SPINE: Poller[1] PID[144376] PT[140352935687872] Device[1] HT[1] DS[23] TT[0.26] SS[0] SERVER: /usr/share/cacti/site/scripts/ss_cpoller.php ss_cpoller 'get' 'getScriptServer' '1', output: 16
2024-06-13 11:57:25 - SPINE: Poller[1] PID[144376] PT[140352935687872] Device[1] HT[1] DS[23] TT[0.21] SS[0] SERVER: /usr/share/cacti/site/scripts/ss_cpoller.php ss_cpoller 'get' 'getSNMP' '1', output: 0

php /usr/share/cacti/site/poller_boost.php --verbose --debug --force
DEBUG: Checking if Boost is ready to run.
DEBUG: Last Runtime was 2024-06-13 09:47:33 (1718272053).
DEBUG: Next Runtime is 2024-06-13 11:47:33 (1718279253).
DEBUG: Records Found:214, Max Threshold:1000000.
DEBUG: Time to Run Boost, Force Run is true!
DEBUG: Parallel Process Setup Begins.
DEBUG: Data Sources:17, Concurrent Processes:1
DEBUG: Parallel Process Setup Complete. Ready to spawn children.
DEBUG: About to launch 1 processes.
DEBUG: Launching Boost Process Number 1 Total[7.0214]
DEBUG: About to Spawn a Remote Process [CMD: /usr/bin/php, ARGS: /usr/share/cacti/site/poller_boost.php --child=1 --debug]
2024-06-13 10:05:04 - SYSTEM BOOST STATS: Time:9.03 RRDUpdates:214
2024-06-13 10:05:04 - SYSTEM BOOST DETAIL STATS: RRDUpdates:214, TotalTime:9.03, get_records:0, results_cycle:0, rrd_filename_and_template:0, rrd_lastupdate:0, rrdupdate:0, delete:0
DSSTATS: About to launch 1 processes.
DSSTATS: Launching DSStats Process Number 1 for Type bmaster Total[9.0382]
DEBUG: About to Spawn a Remote Process [CMD: /usr/bin/php, ARGS: /usr/share/cacti/site/poller_dsstats.php --type=bchild --child=1 --debug]
DSSTATS: Processing daily for Thread 0
PHP Warning: A non-numeric value encountered in /usr/share/cacti/site/lib/dsstats.php on line 179
PHP Warning: A non-numeric value encountered in /usr/share/cacti/site/lib/dsstats.php on line 180
PHP Warning: A non-numeric value encountered in /usr/share/cacti/site/lib/dsstats.php on line 181
DSSTATS: DAILY
2024-06-13 10:05:06 - SYSTEM DSSTATS STATS: Time:11.08 Type:DAILY Threads:1 RRDfiles:38 DSSes:50 RRDUser:0.00 RRDSystem:0.00 RRDReal:0.00

Regards, Jacek

bmfmancini commented 2 weeks ago

Interesting, I have not noticed any issues, but I will start digging into this in the AM.


bmfmancini commented 2 weeks ago

I have not been able to reproduce this.

TheWitness commented 2 weeks ago

It could be that ifName is not the primary key and a reindex was done, causing a mix-up. You might want to reindex.

gadzet21 commented 1 week ago

I installed another Cacti instance on another server. I started with version 1.2.19 to check at which version the above-mentioned problem appears. After upgrading from 1.2.22 to 1.2.23, the following occurred:

(screenshot)

I'm currently checking what change could have caused this. @bmfmancini were you running with "Enable On-demand RRD Updating" turned on? @TheWitness I did a reindex, deleted the graph, and created it again.

bmfmancini commented 1 week ago

I have remote collectors, so boost is enabled by default.

Is this happening with all graph types?

gadzet21 commented 1 week ago

No. In production, after updating Cacti, the problem occurs in only 2 templates.

As for Poller Items: does the data in your database match what is presented in the chart? For me, when boost is running, the data is presented incorrectly (not in this order):

POLLER: Poller[Main Poller] PID[160779] Device[localhost] DS[Main Poller - Collector Items] Graphs[Main Poller - Collector Items] TT[0.19] SERVER: /usr/share/cacti/scripts/ss_cpoller.php ss_cpoller 'get' 'getScriptServer' '1', output: 16
POLLER: Poller[Main Poller] PID[160779] Device[localhost] DS[Main Poller - Collector Items] Graphs[Main Poller - Collector Items] TT[0.25] SERVER: /usr/share/cacti/scripts/ss_cpoller.php ss_cpoller 'get' 'getScript' '1', output: 3
POLLER: Poller[Main Poller] PID[160779] Device[localhost] DS[Main Poller - Collector Items] Graphs[Main Poller - Collector Items] TT[0.7] SERVER: /usr/share/cacti/scripts/ss_cpoller.php ss_cpoller 'get' 'getSNMP' '1', output: 0

Graph: (screenshot)

If I turn off "On-demand RRD Updating": (screenshot)

TheWitness commented 1 week ago

Are there any errors in the Cacti log? What is output in those POLLER entries you showed above is exactly how it is stored in the database. If you are saying that the POLLER output above is what is wrong, then I'm left scratching my head.

TheWitness commented 1 week ago

Take the Cacti Stats device and enable debug for it, then wait for a bunch of entries to start appearing in the Cacti log. See if it ever echoes the wrong information to the Cacti log.

gadzet21 commented 1 week ago

Data is collected correctly in the database (poller_output_boost).

I placed an additional logging line in the boost_rrdtool_function_update function in the lib/boost.php file:

if ($valid_entry) {
    if ($rrd_update_template != '') {
        rrdtool_execute("update $rrd_path $update_options --template $rrd_update_template $rrd_update_values", false, RRDTOOL_OUTPUT_STDOUT, $rrdtool_pipe, 'BOOST');
        cacti_log("update $rrd_path $update_options $rrd_update_template $rrd_update_values", false, 'BOOST', POLLER_VERBOSITY_MEDIUM); // ADDED: log the exact template and values handed to rrdtool
    } else {
        cacti_log("update $rrd_path $update_options $rrd_update_template $rrd_update_values", false, 'BOOST', POLLER_VERBOSITY_MEDIUM);
        rrdtool_execute("update $rrd_path $update_options $rrd_update_values", false, RRDTOOL_OUTPUT_STDOUT, $rrdtool_pipe, 'BOOST');
    }
    return 'OK';
}

If boost itself updates the data automatically, after the thresholds set in the configuration are exceeded, the result in the logs is as follows:

2024-06-20 09:29:10 - BOOST update /usr/share/cacti/rra/localhost_getsnmp_23.rrd --skip-past-updates getScript:getScriptServer:getSNMP 1718866741:1:9:0 1718866801:1:9:0 1718866862:1:9:0 1718866921:1:9:0 1718866982:1:9:0 1718867042:1:9:0 1718867101:1:9:0 1718867162:1:9:0 1718867221:1:9:0 1718867281:1:9:0 1718867342:1:9:0 1718867401:1:9:0 1718867462:1:9:0 1718867522:1:9:0 1718867581:1:9:0 1718867642:1:9:0 1718867701:1:9:0 1718867761:1:9:0 1718867822:1:9:0 1718867881:1:9:0 1718867942:1:9:0 1718868002:1:9:0 1718868061:1:9:0 1718868122:1:9:0 1718868181:1:9:0 1718868241:1:9:0 1718868302:1:9:0 1718868361:1:9:0 1718868422:1:9:0 1718868481:1:9:0 1718868541:1:9:0

But if I go to graph_view.php, where the graph is located, the log shows:

2024-06-20 09:44:28 - BOOST update /usr/share/cacti/rra/localhost_getsnmp_23.rrd --skip-past-updates getScript:getScriptServer:getSNMP 1718869441:0:1:9

As you can see, the values themselves have been sorted and no longer match the names. I wonder why this doesn't happen everywhere... I will pursue the matter further.
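
To make the mismatch concrete: rrdtool's --template option pairs values with data source names purely by position, so if the template string and the value string end up sorted differently, each series silently receives another series' data. A minimal illustration of that pairing, using the two log lines above (my own sketch, not Cacti code):

<?php
// rrdtool "update ... --template a:b:c t:v1:v2:v3" assigns v1 to a, v2 to b,
// v3 to c, purely by position. Template order is taken from the log above;
// the value order is what the on-demand run produced.
$template = explode(':', 'getScript:getScriptServer:getSNMP');
$values   = explode(':', '0:1:9');

foreach ($template as $i => $ds) {
    echo "$ds <= {$values[$i]}" . PHP_EOL;
}
// Prints: getScript <= 0, getScriptServer <= 1, getSNMP <= 9

Under this reading, the automatic run's 1:9:0 is the correct pairing for getScript:getScriptServer:getSNMP, while the on-demand run's 0:1:9 writes getSNMP's value into getScript, and so on.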

TheWitness commented 1 week ago

If you turn off boost, does the graph plot correctly?

TheWitness commented 1 week ago

I just read that post. So it's the on-demand graphing between boost updates that scrambles the results. That's good to know; I think we can fix it.

gadzet21 commented 1 week ago

I think I have a lead at this point.

In lib/boost.php, the boost_process_poller_output function sorts the values and names with:

$sorted = boost_array_orderby($results, 'timestamp', SORT_ASC, 'rrd_name', SORT_ASC);

However, the names are later fetched again with:

$path_template = boost_get_rrd_filename_and_template($local_data_id);

and they are no longer sorted the same way as in the $sorted variable. So we have a discrepancy between the values of $outbuf and $rrd_tmpl.

gadzet21 commented 1 week ago

I already know the details :) The boost_array_orderby function sorts with PHP's array_multisort, while boost_get_rrd_filename_and_template sorts with a MySQL ORDER BY ... ASC. The difference is in how letter case is handled: array_multisort is case sensitive (it compares by ASCII byte value), whereas MySQL's ORDER BY ASC is case insensitive under the default collation.
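
A standalone demonstration of the difference (a sketch, not Cacti code; usort with strcasecmp is only an approximation of how a MySQL case-insensitive collation compares strings):

<?php
$names = array('getScript', 'getScriptServer', 'getSNMP');

// PHP side: array_multisort compares raw bytes, and 'N' (0x4E) sorts
// before 'c' (0x63), so getSNMP comes first.
$php_order = $names;
array_multisort($php_order, SORT_ASC, SORT_STRING);
print_r($php_order); // getSNMP, getScript, getScriptServer

// MySQL side: ORDER BY rrd_name ASC under a *_ci collation ignores case.
$mysql_order = $names;
usort($mysql_order, 'strcasecmp');
print_r($mysql_order); // getScript, getScriptServer, getSNMP

The first order is what ends up in $outbuf; the second is what the --template string uses, which matches the scrambled update seen in the log above.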

Change proposal: in the boost_array_orderby function, instead of:

$tmp[$key] = $row[$field];

change to:

$tmp[$key] = strtolower($row[$field]);
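
In context, the change would sit inside the loop in boost_array_orderby that builds the sort-key columns later passed to array_multisort; roughly like this (only the assignment itself is from the actual source, the surrounding loop shape is assumed):

foreach ($data as $key => $row) {
    // Fold case before sorting so array_multisort orders rows the same way
    // as MySQL's case-insensitive ORDER BY; only the sort key is lowercased,
    // the stored rows themselves are untouched.
    $tmp[$key] = strtolower($row[$field]);
}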

PS: it may also be worth paying attention, in the script, to how the collation (the string comparison method) is set in the database.
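
For completeness, the fix could also go the other way: force a binary (case-sensitive) collation in the ORDER BY so that MySQL matches array_multisort. A hypothetical, untested sketch (the real query lives in boost_get_rrd_filename_and_template; the table and column names are from the standard Cacti schema, and a utf8mb4 schema is assumed):

// Forcing a binary collation makes MySQL compare by byte value, like PHP.
$names = db_fetch_assoc_prepared(
    'SELECT data_source_name
     FROM data_template_rrd
     WHERE local_data_id = ?
     ORDER BY data_source_name COLLATE utf8mb4_bin',
    array($local_data_id)
);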