Cacti / cacti

Cacti ™
http://www.cacti.net
GNU General Public License v2.0

When replicating data during installation/upgrade, system may appear to hang #3816

Open eschoeller opened 4 years ago

eschoeller commented 4 years ago

I have 5 distributed pollers. At the 77% mark in the install process, the installer starts updating rows on the remote databases. The installer GUI gives no indication of this, even though cacti.log is still churning out information and an strace of the background installer process is going nuts. It might be a good idea to give the user a bit more feedback via the web GUI at this point in the installer, so we don't think it's totally hung.

netniV commented 4 years ago

I would agree, though I didn't really touch the remote update stuff. Do you know what is actually occurring? If it is the normal DB upgrade stuff, there should still be feedback going on.

TheWitness commented 4 years ago

I think it's doing a full sync. We should consider this.

eschoeller commented 4 years ago

Yeah that’s what it looked like to me from the logs. A lot of the “due to no rows found” type messages as I recall. I could scan back in the log and give you all the exact messages, if that’s helpful.

Eric.

On Sep 24, 2020, at 12:38 PM, TheWitness notifications@github.com wrote:

 I think it's doing a full sync. We should consider this.

eschoeller commented 4 years ago

Here are the relevant log entries:

2020/09/21 22:44:05 - INSTALL: always: Spawning background process: /usr/local/php/bin/php '/cacti/cacti-1.2.14-prod/install/background.php' 1600749845.2585
2020/09/21 22:44:05 - INSTALL: Checking arguments
2020/09/21 22:44:05 - INSTALL: always: Setting PHP Option max_execution_time = 0
2020/09/21 22:44:05 - INSTALL: always: Setting PHP Option memory_limit = -1
2020/09/21 22:44:06 - INSTALL: always: Starting UPGRADE Process for v1.2.14
2020/09/21 22:44:06 - INSTALL: always: No tables where found or selected for conversion
2020/09/21 22:44:06 - INSTALL: always: Switched from  to /tmp/cduGVq6KL
2020/09/21 22:44:06 - INSTALL: always: NOTE: Using temporary file for db cache: /tmp/cduGVq6KL
2020/09/21 22:44:06 - INSTALL: always: Upgrading from v1.2.9 (DB 1.2.14 (DB: 1.2.9)) to v1.2.11
2020/09/21 22:44:07 - INSTALL: always: Upgrading from v1.2.11 (DB 1.2.11) to v1.2.14
2020/09/21 22:44:10 - INSTALL: always: No templates were selected for import
2020/09/21 22:44:10 - INSTALL: always: Finished UPGRADE Process for v1.2.14
2020/09/21 22:44:10 - CMDPHP NOTE: Table automation_graph_rule_items Replicated to Remote Poller 2 With 181 Rows Updated
2020/09/21 22:44:10 - CMDPHP NOTE: Table automation_graph_rules Replicated to Remote Poller 2 With 91 Rows Updated
2020/09/21 22:44:10 - CMDPHP NOTE: Table automation_match_rule_items Replicated to Remote Poller 2 With 162 Rows Updated
2020/09/21 22:44:10 - CMDPHP NOTE: Table automation_snmp Not Replicated to Remote Poller 2 Due to No Rows Found
2020/09/21 22:44:10 - CMDPHP NOTE: Table automation_snmp_items Not Replicated to Remote Poller 2 Due to No Rows Found
2020/09/21 22:44:10 - CMDPHP NOTE: Table automation_templates Not Replicated to Remote Poller 2 Due to No Rows Found
2020/09/21 22:44:10 - CMDPHP NOTE: Table automation_tree_rule_items Replicated to Remote Poller 2 With 5 Rows Updated
2020/09/21 22:44:10 - CMDPHP NOTE: Table automation_tree_rules Replicated to Remote Poller 2 With 3 Rows Updated
2020/09/21 22:44:10 - CMDPHP NOTE: Table data_input Replicated to Remote Poller 2 With 99 Rows Updated
2020/09/21 22:44:10 - CMDPHP NOTE: Table host_template Replicated to Remote Poller 2 With 66 Rows Updated
2020/09/21 22:44:10 - CMDPHP NOTE: Table host_template_graph Replicated to Remote Poller 2 With 716 Rows Updated
2020/09/21 22:44:10 - CMDPHP NOTE: Table host_template_snmp_query Replicated to Remote Poller 2 With 185 Rows Updated
2020/09/21 22:44:10 - CMDPHP NOTE: Table snmp_query Replicated to Remote Poller 2 With 89 Rows Updated
2020/09/21 22:44:11 - CMDPHP NOTE: Table data_input_fields Replicated to Remote Poller 2 With 858 Rows Updated
2020/09/21 22:44:11 - CMDPHP NOTE: Table poller Replicated to Remote Poller 2 With 6 Rows Updated
2020/09/21 22:44:11 - CMDPHP NOTE: Table version Replicated to Remote Poller 2 With 1 Rows Updated
2020/09/21 22:44:11 - CMDPHP NOTE: Table user_auth Replicated to Remote Poller 2 With 10 Rows Updated

Then finally:

2020/09/21 22:47:27 - INSTALL-SYNC: always: Remote Data Collector with name 'thorn-a poller' and id 2 completed Full Sync.
2020/09/21 22:47:27 - INSTALL-SYNC: always: Remote Data Collector with name 'thorn-b poller' and id 3 completed Full Sync.
2020/09/21 22:47:27 - INSTALL-SYNC: always: Remote Data Collector with name 'thorn-c poller' and id 4 completed Full Sync.
2020/09/21 22:47:27 - INSTALL-SYNC: always: Remote Data Collector with name 'thorn-d poller' and id 5 completed Full Sync.
2020/09/21 22:47:27 - INSTALL-SYNC: always: Remote Data Collector with name 'thorn-e poller' and id 6 completed Full Sync.
2020/09/21 22:47:27 - INSTALL: always: Installation was started at 2020-09-22 04:44:05, completed at 2020-09-22 04:47:27

So, yeah, it was a full sync.
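
To put a number on how much data a sync like this moves, the replication lines can be tallied per poller. This is only a rough sketch in Python (not part of Cacti): the regular expression is modelled on the "Replicated to Remote Poller" lines in the excerpt above, and the function name `rows_per_poller` is made up for illustration.

```python
import re
from collections import defaultdict

# Pattern modelled on the "Replicated ... Rows Updated" lines in the excerpt above.
REPLICATED = re.compile(
    r"Table (\S+) Replicated to Remote Poller (\d+) With (\d+) Rows Updated"
)


def rows_per_poller(lines):
    """Sum replicated rows per remote poller id from cacti.log lines.

    Hypothetical helper for illustration; it is not a Cacti function.
    """
    totals = defaultdict(int)
    for line in lines:
        match = REPLICATED.search(line)
        if match:  # "Not Replicated ... Due to No Rows Found" lines are skipped
            totals[int(match.group(2))] += int(match.group(3))
    return dict(totals)
```

Feeding it the lines of cacti.log (for example from `open(path)`) would give one row total per poller id, which is the kind of figure the GUI could show while the sync runs.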

TheWitness commented 3 years ago

@netniV, do you have any time to work on this one?

netniV commented 3 years ago

This will need some additional parameters so that logging is done with the installer in mind, not just the standard poller sync.

Grabbing remote log info may be harder. But we could always post back via the remote agent to indicate various statuses.

Some more discussion should take place on this before we start implementing any changes. Given the nature of the changes, I would prefer to get them into develop so we can properly test things out.

We can then consider backporting if the final solution isn't that troublesome.

TheWitness commented 3 years ago

That makes sense. I'll take it off the 1.2.16 list then.

TheWitness commented 3 years ago

Added enhancement tag so that the stalebot does not mess with it.

TheWitness commented 3 years ago

There are two things we are missing from the progress indicator:
1) Number of tables to migrate
2) Number of pollers to sync

We should add that to the step percentage calculation.

Additionally, we should have a bias based upon:
1) Number of rows in all the tables to be migrated
2) Number of poller items for each poller to be replicated

My $0.02 on this topic.
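
A minimal sketch of that kind of weighting, in Python rather than Cacti's PHP, might look like the following. Everything here is an assumption for illustration: `SyncTask`, `weighted_percent`, and `floor_weight` are hypothetical names, and the row counts would have to come from the actual tables and poller items.

```python
from dataclasses import dataclass


@dataclass
class SyncTask:
    """One unit of installer work (a table or a poller); hypothetical type."""
    label: str
    rows: int  # row count used as the weight (bias) for this task


def weighted_percent(tasks, completed, floor_weight=1):
    """Return overall progress (0-100) weighted by row count.

    `completed` is how many tasks from the front of `tasks` are finished.
    `floor_weight` keeps empty tables from contributing zero progress.
    """
    weights = [max(task.rows, floor_weight) for task in tasks]
    total = sum(weights)
    if total == 0:  # nothing to do, so report the step as finished
        return 100.0
    done = sum(weights[:completed])
    return round(100.0 * done / total, 1)
```

With three tasks of 716, 858, and 1 rows, finishing the first one reports 45.5% rather than the 33.3% a plain task count would give, so large tables no longer make the bar look stuck.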

netniV commented 3 years ago

There may be need for a sub-status... to display secondary percentages. That way, they can be calculated independently of the overall steps.
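
As a rough illustration of the idea (in Python, not Cacti's PHP), a status payload could carry the overall step percentage and a separate sub-status side by side. The `InstallProgress` class and its field names are hypothetical and not taken from the installer.

```python
from dataclasses import dataclass


@dataclass
class InstallProgress:
    """Hypothetical status payload for the installer GUI."""
    step_percent: float   # overall installer step progress, e.g. 77
    sub_label: str = ""   # what the current step is doing
    sub_done: int = 0     # units finished within the current step
    sub_total: int = 0    # units expected within the current step

    def sub_percent(self):
        """Secondary percentage, computed independently of step_percent."""
        if self.sub_total <= 0:
            return None  # no sub-status to show
        return round(100.0 * self.sub_done / self.sub_total, 1)

    def render(self):
        """Format both levels as a single status line."""
        line = f"{self.step_percent:.0f}%"
        sub = self.sub_percent()
        if sub is not None:
            line += f" ({self.sub_label}: {self.sub_done}/{self.sub_total}, {sub}%)"
        return line
```

For example, `InstallProgress(77, "Syncing remote pollers", 2, 5).render()` gives `77% (Syncing remote pollers: 2/5, 40.0%)`, so the main bar can sit at 77% while the sub-status keeps moving.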

TheWitness commented 2 years ago

@netniV there are two of these now...