Cacti / cacti

Cacti ™
http://www.cacti.net
GNU General Public License v2.0
1.65k stars 406 forks source link

Graph Template Items may have duplicated entries #4363

Closed bernisys closed 2 years ago

bernisys commented 3 years ago

Describe the bug

As i understand it, a template declaration inside the DB is identified by the hashes being set and local_graph_template_item_id=0 and local_graph_id=0. While a real graph contains no hash and has local_graph_template_item_id pointed to the row that contains the template declaration from above and local_graph_id pointing to the graph declaration, as well a task_item_id for the elements of a graph, which are connected to a queried item.

In our DB i have found some strange duplicate entries, which relate to different local IDs than the original template. Here's an example .. watch how for example in row id=10488739 there is a duplicate of the previous row, but the local_graph_template_item_id LGTII points suddenly to a different "master ID". Instead of the correct entry 477305 it uses now 10488701, which belonged to a graph but is not even there any more, which can in turn probably cause trouble when updating a graph template and the changes are not propagated properly into all graphs due to this shift in relation.

And there are many more of those duped rows with TII=0 for the same template, with always the same pattern. I still have to check our other templates, there were a few more which showed this pattern.

MariaDB [cacti]> select id, hash, local_graph_template_item_id LGTII, local_graph_id, task_item_id TII, color_id, graph_type_id type, cdef_id, text_format, sequence from graph_templates_item where graph_template_id=123 and id>10488660 and id<10488770;
+----------+------+----------+----------------+---------+----------+------+---------+-------------------------------------------------+----------+
| id       | hash | LGTII    | local_graph_id | TII     | color_id | type | cdef_id | text_format                                     | sequence |
+----------+------+----------+----------------+---------+----------+------+---------+-------------------------------------------------+----------+
| 10488661 |      |   477313 |         562436 | 1137564 |        0 |    9 |       0 | Max:                                            |        8 |
| 10488662 |      |   477314 |         562436 | 1137563 |        1 |    4 |       0 | Interface |query_ifDescr| - Type |query_ifType| |        9 |
| 10488663 |      |   477315 |         562436 | 1137563 |        0 |    1 |       0 | Link ID: |query_ifAlias|                        |       10 |
| 10488664 |      |   477316 |         562436 | 1137564 |        0 |    1 |       0 | Speed: |query_ifHighSpeed| Mbps                 |       11 |
| 10488665 |      |   488319 |         562436 | 1137564 |      442 |    7 |      30 | no data                                         |       12 |
| 10488682 |      |   477305 |         562438 | 1137567 |       22 |    7 |       2 | Inbound                                         |        1 |
| 10488684 |      |   477306 |         562438 | 1137567 |        0 |    9 |       0 | Cur:                                            |        2 |
| 10488686 |      |   477307 |         562438 | 1137567 |        0 |    9 |       0 | Avg:                                            |        3 |
| 10488688 |      |   477308 |         562438 | 1137567 |        0 |    9 |       0 | Max:                                            |        4 |
| 10488689 |      |   477312 |         562438 | 1137568 |        0 |    9 |       0 | Avg:                                            |        7 |
| 10488691 |      |   477313 |         562438 | 1137568 |        0 |    9 |       0 | Max:                                            |        8 |
| 10488693 |      |   477314 |         562438 | 1137567 |        1 |    4 |       0 | Interface |query_ifDescr| - Type |query_ifType| |        9 |
| 10488695 |      |   477315 |         562438 | 1137567 |        0 |    1 |       0 | Link ID: |query_ifAlias|                        |       10 |
| 10488696 |      |   477316 |         562438 | 1137568 |        0 |    1 |       0 | Speed: |query_ifHighSpeed| Mbps                 |       11 |
| 10488698 |      |   488319 |         562438 | 1137568 |      442 |    7 |      30 | no data                                         |       12 |
| 10488737 |      |   477305 |         562441 | 1137574 |       22 |    7 |       2 | Inbound                                         |        1 |
| 10488739 |      | 10488701 |         562441 |       0 |       22 |    7 |       2 | Inbound                                         |        1 |
| 10488741 |      |   477306 |         562441 | 1137574 |        0 |    9 |       0 | Cur:                                            |        2 |
| 10488743 |      | 10488704 |         562441 |       0 |        0 |    9 |       0 | Cur:                                            |        2 |
| 10488745 |      |   477307 |         562441 | 1137574 |        0 |    9 |       0 | Avg:                                            |        3 |
| 10488747 |      | 10488706 |         562441 |       0 |        0 |    9 |       0 | Avg:                                            |        3 |
| 10488749 |      |   477308 |         562441 | 1137574 |        0 |    9 |       0 | Max:                                            |        4 |
| 10488751 |      | 10488708 |         562441 |       0 |        0 |    9 |       0 | Max:                                            |        4 |
| 10488753 |      |   477310 |         562441 | 1137576 |       21 |    5 |       0 | Outbound                                        |        5 |
| 10488755 |      | 10488711 |         562441 |       0 |       21 |    5 |       0 | Outbound                                        |        5 |
| 10488757 |      |   477311 |         562441 | 1137576 |        0 |    9 |       0 | Cur:                                            |        6 |
| 10488759 |      | 10488713 |         562441 |       0 |        0 |    9 |       0 | Cur:                                            |        6 |
| 10488761 |      |   477312 |         562441 | 1137576 |        0 |    9 |       0 | Avg:                                            |        7 |
| 10488763 |      | 10488715 |         562441 |       0 |        0 |    9 |       0 | Avg:                                            |        7 |
| 10488764 |      |   477305 |         562442 | 1137577 |       22 |    7 |       2 | Inbound                                         |        1 |
| 10488766 |      |   477313 |         562441 | 1137576 |        0 |    9 |       0 | Max:                                            |        8 |
| 10488767 |      | 10488701 |         562442 |       0 |       22 |    7 |       2 | Inbound                                         |        1 |
| 10488769 |      | 10488718 |         562441 |       0 |        0 |    9 |       0 | Max:                                            |        8 |
+----------+------+----------+----------------+---------+----------+------+---------+-------------------------------------------------+----------+

And there was another problem with an apparently deleted graph of the same template, which made the template itself contain all duplicate items, because the above mentioned IDs were set to zero, hence cacti identified those entries as part of a template. I fixed that already simply by removing those problematic entries.

Fixing should be halfway doable by scanning the DB for entries with identical sequence numbers for the same local_graph_id and deleting the duplicates where the task ID is zero ... i hope at least that there is no other relation which needs to be updated in other tables. If you can you help me out on that, then i can try to create a script which scans for such entries and provide it on my github repo, so that others can use it to check their DB and correct the entries if necessary.

What i have also seen is that the entry with id=1 in the table graph_templates_graph was overwritten with some other graph template's data, this looks like a related problem - not sure if that was fixed in the meantime, you have more insight than me probably. Would be nice if someone could enlighten me here as well.

I feel that this could this be an aftermath related to the old template edit bug that was fixed in 1.2.17 - #4237

Thanks!

bernisys commented 3 years ago

We still get graphs containing duplicated items, so i did some further investigation ...

As an example, this is how the graphs look like: image

You see the legend-items all doubled, also the graphs are plotted twice. so I checked the DB once more:

MariaDB [cacti]> select local_graph_id, count(1) from graph_templates_item where graph_template_id=38 group by local_graph_id limit 10;
+----------------+----------+
| local_graph_id | count(1) |
+----------------+----------+
|              0 |       25 |  <- that's the template declaration - with local_graph_id == 0
|            230 |       25 | <-- these are all fine, 25 items in the graph, like in the template
|            231 |       25 |
|            232 |       25 |
|            233 |       25 |
|            234 |       25 |
|            235 |       25 |
|            236 |       25 |
|            237 |       25 |
|            238 |       25 |
+----------------+----------+

Then i searched for devices with different graph item count and found this:

MariaDB [cacti]> select local_graph_id, items  from (select local_graph_id, count(1) as items from graph_templates_item where graph_template_id=38 group by local_graph_id) as result where items != 25 limit 10;
+----------------+-------+
| local_graph_id | items |
+----------------+-------+
|          65757 |    22 |  <-- too small item count
|         184054 |     3 |  <-- way too small
|         200848 |    14 |  <-- *sigh*
|         448170 |    13 |  <-- what happens here?!
|         547412 |    50 |  <-- aaand duplicates
|         547420 |    50 |
|         547422 |    50 |
|         547430 |    50 |
|         547432 |    50 |
|         547441 |    50 |
+----------------+-------+

65757 is a graph my colleague remembers, he did a device-cleanup recently (i.e. removing old graphs and re-discover the device)

547412 and following are the graphs from devices he re-discovered recently during the cleanup

All the other like 250k traffic graphs are fully functional and look okay:

MariaDB [cacti]> select items, count(1)  from (select local_graph_id, count(1) as items from graph_templates_item where graph_template_id=38 group by local_graph_id) as result group by items;
+-------+----------+
| items | count(1) |
+-------+----------+
|     3 |        1 |
|    13 |        1 |
|    14 |        1 |
|    22 |        2 |
|    25 |   245375 |    <- major amount of graphs is okay
|    48 |        2 |
|    49 |        1 |
|    50 |     4424 |
+-------+----------+

Checking the graphs-table:

MariaDB [cacti]> select id, local_graph_template_graph_id as LGTGI, local_graph_id, graph_template_id as GTI, title_cache  from graph_templates_graph where graph_template_id = 38 and local_graph_id in (65757, 184054, 200848, 448170, 547412);
+--------+-------+----------------+-----+------------------------------------------------------------------------------+
| id     | LGTGI | local_graph_id | GTI | title_cache                                                                  |
+--------+-------+----------------+-----+------------------------------------------------------------------------------+
|  65895 |   186 |          65757 |  38 | xxxxxxxxxxx10-14 - Interface - Ethernet100/1/7 Traffic (xxxxxxxx NIC1_eth1)  |
| 184189 |   186 |         184054 |  38 | xxxxxxxxxxxx-7 - Interface - Ethernet115/1/10 Traffic (xxxxxxxx A_vmnic7)    |
| 200983 |   186 |         200848 |  38 | xxxxxxxxxxxx1-o7 - Interface - Ethernet111/1/3 Traffic (xxxxxxxx NIC2_prod)   |
| 448349 |   186 |         448170 |  38 | xxxxxxtest5ksb - Interface - Ethernet109/1/14 Traffic (xxxxxxxx (B&R) |
| 547612 |   186 |         547412 |  38 | xxxxxxxxxxxx10-13 - Interface - Ethernet104/1/3 Traffic (xxxxxxxx NIC1_eth1)  |
+--------+-------+----------------+-----+------------------------------------------------------------------------------+
5 rows in set (0.00 sec)

So the graphs look valid, but nothing is displayed when i point my browser to the IDs with less than 25 items ... and the 547412 one contains all duplicates - like in the image on top of this post.

Any ideas?

gj00354347 commented 3 years ago

Hi Developers ,

This ticket is opened by my colleague .

yes we are doing a clean up operation in our cacti . there are many graphs with missing legend like |query_ifAlias| , |query_ifHighSpeed| or rrd file is not being updated , may be because the db connection was broken at some point . so we though of finding such graphs and recreating a new one and merging the data of old graphs (ie doing mv old_rrd new_rrd) . This is being accomplished by reindexing through cli and then merging the old_rrd with new_rrd with a simple mv command after finding out similar graph title. This cacti has some inbuilt mechanism which helps it to find out whether the data is missing or |query_ifAlias| or |query_ifHighSpeed| is missing then it creates new graph . After the new graph is created we find duplicate graph title and merge old_rrd with new_rrd . PFA snap

Cacti-duplicate-legend

But what we are observing now is that whenever cacti does reindexing and we merge the data , we find the legends or gprint all are duplicated like , curr val , max val ,avg val or be it legend like |query_ifHishSpeed| or |query_ifAlias| and that too in a very haphazard manner which makes the values in grpint almost illegible. And this happens only when there is already a graph is existing and its data is merged with new one because of the above mentioned anomaly .But if the reindex creates graph for interfaces whose graphs are not there then it is all ok like gprint and legends are all ok .

so we suspect may be there is some kind of bug in reindex (because what we do is like mv old_rrd new_rrd and I am pretty sure it does not contain gprint or legends) . Can you please help us like which part of reindex we need to debug ??

Best Regards, Gopal

netniV commented 3 years ago

Use the graph debug options to see the RRDtool command being used, this is often a big hint as to what is occurring.

gj00354347 commented 3 years ago

Hi ,

I tried that it says all OK . But even in rrd ttol debug option you will see all items getting duplicated and what surprises me is that in the template we do not have this. PFB rrd tool debug ouput (the device name and query_ifalias have been changed)

/bin/rrdtool graph - \ --imgformat=PNG \ --start='-86400' \ --end='-300' \ --pango-markup \ --title='TEST - Interface - Ethernet110/1/13 Traffic (alias of that interface)' \ --vertical-label='bits per second' \ --slope-mode \ --base=1000 \ --height=200 \ --width=700 \ --rigid \ --alt-autoscale-max \ --lower-limit='0' \ COMMENT:"From 2021-08-18 17\:25\:21 To 2021-08-19 17\:20\:21\c" \ COMMENT:" \n" \ --color BACK#F3F3F3 \ --color CANVAS#FDFDFD \ --color SHADEA#CBCBCB \ --color SHADEB#999999 \ --color FONT#000000 \ --color AXIS#2C4D43 \ --color ARROW#2C4D43 \ --color FRAME#2C4D43 \ --border 1 --font TITLE:11:'Arial' \ --font AXIS:8:'Arial' \ --font LEGEND:8:'Courier' \ --font UNIT:8:'Arial' \ --font WATERMARK:6:'Arial' \ --slope-mode \ --watermark 'Vodafone OSS-Tools Cacti® prod' \ DEF:a='/opt/SP/cacti/cacti/rra/564/555813.rrd':'traffic_in':AVERAGE \ DEF:b='/opt/SP/cacti/cacti/rra/564/555813.rrd':'traffic_in':MAX \ DEF:c='/opt/SP/cacti/cacti/rra/564/555813.rrd':'traffic_out':AVERAGE \ DEF:d='/opt/SP/cacti/cacti/rra/564/555813.rrd':'traffic_out':MAX \ CDEF:cdefc='a,8,' \ CDEF:cdefd='a,8,' \ CDEF:cdefe='a,8,' \ CDEF:cdeff='b,8,' \ CDEF:cdefg='a,8,' \ CDEF:cdefi='a,1000,/,1250,/' \ CDEF:cdefj='b,1000,/,1250,/' \ CDEF:cdefcd='c,8,' \ CDEF:cdefzzcf='d,8,*' \ CDEF:cdefcj='d,1000,/,1250,/' \ CDEF:cdefec='a,UN,INF,0,IF' \ CDEF:cdefed='a,UN,INF,0,IF' \ COMMENT:' Current / Average / Maximum / Total\n' \ COMMENT:' Current / Average / Maximum / Total\n' \ AREA:cdefc#00CF007F:'Inbound ' \ AREA:cdefd#00CF007F:'Inbound ' \ LINE1:cdefe#41A317FF: \ LINE1:cdeff#41A317FF: \ GPRINT:cdefg:LAST:'%8.2lf %s' \ GPRINT:cdeff:LAST:'%8.2lf %s' \ GPRINT:cdefi:LAST:'%8.1lf %%' \ GPRINT:cdefj:LAST:'%8.1lf %%' \ GPRINT:cdefg:AVERAGE:'/%8.2lf %s' \ GPRINT:cdeff:AVERAGE:'/%8.2lf %s' \ GPRINT:cdefg:MAX:'/%8.2lf %s' \ GPRINT:cdeff:MAX:'/%8.2lf %s' \ GPRINT:cdefi:MAX:'%8.1lf %%' \ GPRINT:cdefj:MAX:'%8.1lf %%' \ COMMENT:'/ 1.72 GB' \ COMMENT:'/ 544.76 MB' \ HRULE:0#D4A017FF:'95th\:':dashes=8,6 \ HRULE:4886.56#D4A017FF:'95th\:':dashes=8,6 \ COMMENT:'0\n' \ COMMENT:'0\n' \ AREA:cdefe#002A974C:'Outbound' \ AREA:cdefcd#002A974C:'Outbound' \ LINE1:cdefe#002A97FF: \ LINE1:cdefzzcf#002A97FF: \ GPRINT:cdefg:LAST:'%8.2lf %s' \ GPRINT:cdefzzcf:LAST:'%8.2lf %s' \ GPRINT:cdefi:LAST:'%8.1lf %%' \ GPRINT:cdefcj:LAST:'%8.1lf %%' \ GPRINT:cdefg:AVERAGE:'/%8.2lf %s' \ GPRINT:cdefzzcf:AVERAGE:'/%8.2lf %s' \ GPRINT:cdefg:MAX:'/%8.2lf %s' \ GPRINT:cdefzzcf:MAX:'/%8.2lf %s' \ GPRINT:cdefi:MAX:'%8.1lf %%' \ GPRINT:cdefcj:MAX:'%8.1lf %%' \ COMMENT:'/ 1.72 GB' \ COMMENT:'/ 42.67 MB' \ HRULE:0#C04000FF:'95th\:':dashes=8,6 \ HRULE:5315.7#C04000FF:'95th\:':dashes=8,6 \ COMMENT:'0\n' \ COMMENT:'0.01\n' \ AREA:cdefec#ffffc0FF:'no data \n' \ AREA:cdefed#ffffc0FF:'no data \n' \ TEXTALIGN:right \ TEXTALIGN:right \ COMMENT:'Speed (Mbit/s)\: 1000\n' \ COMMENT:'Speed (Mbit/s)\: 1000\n' \ COMMENT:'Interface Alias\: Interface Alias\n' \ COMMENT:'Interface Alias\: Interface Alias\n' RRDtool Command lengths = 3037 charaters. RRDtool Says: OK

Does the above output gives you any clue ??

Best Regards, Gopal

gj00354347 commented 3 years ago

Hi Team ,

is the above info OK ?? Or do I need to provide any other info ?? actually this has stopped clean up in our environment .

Best Regards, Gopal

netniV commented 3 years ago

Can you post a picture of your graph template definitions? If you are on 1.2.18, this will show the CDEF and GRPINT selections in the table view. For earlier versions, you'd have to go into each item.

gj00354347 commented 3 years ago

here you go

image image

let me know if anything else required .

Best Regards, Gopal

netniV commented 3 years ago

When you are editing the graph, does that show the same definition?

gj00354347 commented 3 years ago

yes I Just tried doing that , Management ->Graph management -> Edit Graph template (from Top right) and it shows the same definition of 25 items which i have published before . I provide a snap of what I see

image image

netniV commented 3 years ago

Have you tried to re-apply the template to the graph?

gj00354347 commented 3 years ago

HI

I tried re applying graph template but no luck . It's still the same , yes i had put cacti in debug mode from settings so that i can capture something helpful . Below is the log

2021-09-02 18:36:19 - WEBUI Obtaining 'Graph Template' cache 2021-09-02 18:36:19 - WEBUI Fetch InitRows:-1, Hash:e60e1f7cf2266bbdc935657a2e9ef52b 2021-09-02 18:36:19 - WEBUI Found Valid 'Graph Template' priming cache 2021-09-02 18:36:58 - WEBUI Obtaining 'Graph Template' cache 2021-09-02 18:36:58 - WEBUI Fetch InitRows:-1, Hash:e60e1f7cf2266bbdc935657a2e9ef52b 2021-09-02 18:36:58 - WEBUI Found Valid 'Graph Template' priming cache

can we have a call where you can look more closely and deeply into the issue and I invite you and Engineering from my side .

TheWitness commented 3 years ago

When re-importing the template, did you select the 'Remove Orphans' option? Don't do this unless you have a good backup of the graph_templates_item table.

gj00354347 commented 3 years ago

thanks Witness for this suggestion .

I chose that particular device and graph template which is showing this behaviour under management >graphs and then applying device and graph template filter and then I ticked ORPAN GRAPH but saw no orphan graph items for that particular device and particular graph template . Capture

and then applied graph template but it did not fix the issue .

did i do it right or you meant something else when you said ""remove orphan options

best regards, Gopal

gj00354347 commented 3 years ago

Hi Team ,

can we have a call together may be tomorrow at 11:00 AM CET or you may suggest a right time which suits you and I drop a mail to you with invitation and yes is there any probable date of release of 1.2.19 .

many thanks, Gopal

netniV commented 3 years ago

Keep your eye on this page: https://github.com/Cacti/cacti/milestones?direction=asc&sort=due_date

If anything changes, we update the milestones.

TheWitness commented 3 years ago

That's not the same orphan that I was speaking of. Look at this page:

image

gj00354347 commented 3 years ago

Thanks Witness for clarifying the things .

yes in snap I see you are importing only Preview , am I supposed to import only preview or the whole template ??

and yes This template "Interface Traffic is already in our production " so we do not have to import from any other environment .so below is my questions
Q1. do you want me to take an export of this template from prod env and then import by keeping "Remove Orphan Graph Items" OR shall I take an export of the same template from Test and then import in Prod cacti ENV by keeping "Remove Orphan Graph Items" .

And is it really required as I am already having the template in prod and whenever I create new graphs it renders perfect graph , it creates mess only when I do reindexing .

Please suggest further.

Best Regards, Gopal

TheWitness commented 3 years ago

Remove orphans is only when your Template is Virgin and the way you want all your graphs to look. It will remove any graph items that don't match your golden template. So, it's important to export templates when they get to the point you want them to be. Otherwise, you might bring in a stray from the field that breaks all your customization's.

gj00354347 commented 3 years ago

HI Larry ,

yes I exported graph template from prod and then imported into prod without using "Remove orphan graph item option " and yes it did not fix the issue .

Then I tried the import with "Remove orphan graph items " , alas It also did not fix the issue . I see same duplicate entries . PFB snap duplicate-entries-in-graphs

yes it did not break the good graphs , even the one which belongs to same device .

TheWitness commented 3 years ago

You need to first ensure that the template is "exactly" like you want it. Edit the Template and see if there are duplicate there. My guess is that there are. Also, can you check a few graphs and see if they are all like this or just a few of them. We might need a database dump to drill down on this more. If the Template itself is damaged, fix it, and then Export it and Re-import it using the Remove Orphans option.

gj00354347 commented 3 years ago

Thanks Larry ,

Actually I have observed the graph template many times and there are no duplicates, may be your feeling stems from the image I have provided in this ticket . But if you see the data sources do appear more than once but the CDEF is different like in traffic for Last or current value we have 2 appearances but one is only Last and other one is interface utilisation percentage and same goes for max . We have same representation for traffic out .

And yes if the template is broken then all The new graphs should also be broken but this doesn’t happen . New graphs that we create are absolutely fine . The problem arises when we do clean up , that is some graphs in cacti are missing |query_ifHighspeed| or |query_ifAlias| Or data is not being populated but the interface is all up . There we do the reindexing , and some internal code of cacti catches those issues of missing speed or interface alias or missing data and creates new graphs and there we observe these issues of duplicate items and yes it is worthy to mention not in all cases sone of the broken graphs for whom Cacti creates new graphs are also ok .

Regarding dump of database shall I dump the graph_template table of cacti database and will That do or you want a whole db dump . So please let me know the tables of your interest . And In case you need the whole db backup then can you please help me or guide Me like how can I mask the necessary info like Ip or snmp comm string or other visible info In graph title or name as providing them as it is will be kind of violation of privacy .

many thanks and yes have a nice weekend and yes please take some rest and spare some Time to enjoy weekend. (Working even on weekends 😀😀 )

Best Regards, Gopal

TheWitness commented 3 years ago

Upload the template image and xml file. I just need to see the first two sections of the template. Thanks!

TheWitness commented 3 years ago

Hmm, you already uploaded it. Okay. So, wanting that template XML.

TheWitness commented 3 years ago

Run this query and replace local_graph_id = ? with one of the impacted local_graph_id's

SELECT id, hash, local_graph_template_item_id LGTII, 
local_graph_id, task_item_id TII, color_id, graph_type_id type, 
cdef_id, text_format, sequence 
FROM graph_templates_item 
WHERE local_graph_id = ?
TheWitness commented 3 years ago

Then, run this one, replacing ? with the graph_template_id

SELECT id, hash, local_graph_template_item_id LGTII, 
local_graph_id, task_item_id TII, color_id, graph_type_id type, 
cdef_id, text_format, sequence 
FROM graph_templates_item 
WHERE local_graph_id = 0
AND graph_template_id = ?
gj00354347 commented 3 years ago

HI Larry ,

tried to upload xml file but couldn't due to some restriction in place on this portal , then changed the extension to .txt to avoid but couldn't succeed , so put the whole text here that also didn't work as i do not know why the tags gets auto removed . so i sent the file on your mail . if you could please put it here , that will be great for community .

for db query i update you soon.

TheWitness commented 3 years ago

You've got to zip it first.

gj00354347 commented 3 years ago

thanks here you go graph-template.zip

gj00354347 commented 3 years ago
SELECT id, hash, local_graph_template_item_id LGTII, 
local_graph_id, task_item_id TII, color_id, graph_type_id type, 
cdef_id, text_format, sequence 
FROM graph_templates_item 
WHERE local_graph_id = ?
ORDER BY sequence
+----------+------+--------+----------------+---------+----------+------+---------+--------------------------------------------------------------------------
| id       | hash | LGTII  | local_graph_id | TII     | color_id | type | cdef_id | text_format
+----------+------+--------+----------------+---------+----------+------+---------+--------------------------------------------------------------------------
| 10145530 |      |   2699 |         549991 |       0 |        0 |    1 |       0 |                       Current        /  Average   /         Maximum
| 10145533 |      |   2699 |         549991 |       0 |        0 |    1 |       0 |                       Current        /  Average   /         Maximum
| 10145535 |      |   2700 |         549991 |       0 |       22 |    7 |       2 | Inbound
| 10145538 |      |   2700 |         549991 | 1112663 |       22 |    7 |       2 | Inbound
| 10145542 |      |   2701 |         549991 |       0 |      230 |    4 |       2 |
| 10145546 |      |   2701 |         549991 | 1112663 |      230 |    4 |       2 |
| 10145550 |      |   2702 |         549991 |       0 |        0 |    9 |       2 |
| 10145554 |      |   2702 |         549991 | 1112663 |        0 |    9 |       2 |
| 10145558 |      |   2703 |         549991 |       0 |        0 |    9 |      19 |
| 10145562 |      |   2703 |         549991 | 1112663 |        0 |    9 |      19 |
| 10145566 |      |   2704 |         549991 |       0 |        0 |    9 |       2 | /
| 10145570 |      |   2704 |         549991 | 1112663 |        0 |    9 |       2 | /
| 10145574 |      |   2705 |         549991 |       0 |        0 |    9 |       2 | /
| 10145580 |      |   2705 |         549991 | 1112663 |        0 |    9 |       2 | /
| 10145588 |      |   2706 |         549991 |       0 |        0 |    9 |      19 |
| 10145593 |      |   2706 |         549991 | 1112663 |        0 |    9 |      19 |
| 10145597 |      |   2707 |         549991 |       0 |        0 |    1 |       0 | / |sum:auto:current:2:auto|
| 10145602 |      |   2707 |         549991 | 1112663 |        0 |    1 |       0 | / |sum:auto:current:2:auto|
| 10145607 |      |   2708 |         549991 |       0 |      294 |    2 |       0 | 95th:
| 10145612 |      |   2708 |         549991 | 1112663 |      294 |    2 |       0 | 95th:
| 10145617 |      |   2709 |         549991 |       0 |        0 |    1 |       0 | |95:bits:6:current:2|
| 10145622 |      |   2709 |         549991 | 1112663 |        0 |    1 |       0 | |95:bits:6:current:2|
| 10145629 |      |   2710 |         549991 |       0 |       20 |    7 |       2 | Outbound
| 10145632 |      |   2710 |         549991 | 1112666 |       20 |    7 |       2 | Outbound
| 10145638 |      |   2711 |         549991 |       0 |       20 |    4 |       2 |
| 10145644 |      |   2711 |         549991 | 1112666 |       20 |    4 |       2 |
| 10145648 |      |   2712 |         549991 |       0 |        0 |    9 |       2 |
| 10145652 |      |   2712 |         549991 | 1112666 |        0 |    9 |       2 |
| 10145656 |      |   2713 |         549991 |       0 |        0 |    9 |      19 |
| 10145660 |      |   2713 |         549991 | 1112666 |        0 |    9 |      19 |
| 10145664 |      |   2714 |         549991 |       0 |        0 |    9 |       2 | /
| 10145668 |      |   2714 |         549991 | 1112666 |        0 |    9 |       2 | /
| 10145673 |      |   2715 |         549991 |       0 |        0 |    9 |       2 | /
| 10145678 |      |   2715 |         549991 | 1112666 |        0 |    9 |       2 | /
| 10145683 |      |   2716 |         549991 |       0 |        0 |    9 |      19 |
| 10145688 |      |   2716 |         549991 | 1112666 |        0 |    9 |      19 |
| 10145694 |      |   2717 |         549991 |       0 |        0 |    1 |       0 | / |sum:auto:current:2:auto|
| 10145700 |      |   2717 |         549991 | 1112666 |        0 |    1 |       0 | / |sum:auto:current:2:auto|
| 10145707 |      |   2718 |         549991 |       0 |      348 |    2 |       0 | 95th:
| 10145714 |      |   2718 |         549991 | 1112666 |      348 |    2 |       0 | 95th:
| 10145721 |      |   2719 |         549991 |       0 |        0 |    1 |       0 | |95:bits:6:current:2|
| 10145729 |      |   2719 |         549991 | 1112666 |        0 |    1 |       0 | |95:bits:6:current:2|
| 10145766 |      | 480834 |         549991 | 1112663 |      442 |    7 |      30 | no data
| 10145772 |      | 480834 |         549991 |       0 |      442 |    7 |      30 | no data
| 10145737 |      |   2720 |         549991 |       0 |        0 |   40 |       0 |
| 10145745 |      |   2720 |         549991 | 1112666 |        0 |   40 |       0 |
| 10145752 |      |   2721 |         549991 |       0 |        0 |    1 |       0 | Speed (Mbit/s): |query_ifHighSpeed|
| 10145759 |      |   2721 |         549991 |       0 |        0 |    1 |       0 | Speed (Mbit/s): |query_ifHighSpeed|
| 10145778 |      | 588071 |         549991 | 1112663 |        0 |    1 |       0 | Interface Alias: |query_ifAlias|
| 10145783 |      | 588071 |         549991 |       0 |        0 |    1 |       0 | Interface Alias: |query_ifAlias|
+----------+------+--------+----------------+---------+----------+------+---------+--------------------------------------------------

now it's sorted , in fact cacti db gives the output in sorted format only , but there is repetition in seq number ,so first row is seq 1 and second row is also seq 1 likewise third row is seq no 2 and fourth row is also seq number 2 , and last two rows are seq number 25 .

gj00354347 commented 3 years ago
SELECT id, hash, local_graph_template_item_id LGTII, 
local_graph_id, task_item_id TII, color_id, graph_type_id type, 
cdef_id, text_format, sequence 
FROM graph_templates_item 
WHERE local_graph_id = 0
AND graph_template_id = ?
MariaDB [cacti]> SELECT id, hash, local_graph_template_item_id LGTII,  
local_graph_id, task_item_id TII, color_id, 
graph_type_id type,  cdef_id, text_format, sequence  
FROM graph_templates_item  
WHERE local_graph_id = 0 
AND graph_template_id = 38;
+--------+----------------------------------+-------+----------------+-----+----------+------+---------+-----------------------------------------------------
| id     | hash                             | LGTII | local_graph_id | TII | color_id | type | cdef_id | text_format
+--------+----------------------------------+-------+----------------+-----+----------+------+---------+-----------------------------------------------------
|   2699 | 21359acea6bf6642efc0ebdf502246a8 |     0 |              0 |   0 |        0 |    1 |       0 |                       Current        /  Average   /
|   2700 | 84b3889d647c576506b669ac3c43113e |     0 |              0 |   2 |       22 |    7 |       2 | Inbound
|   2701 | b983b9361009d24b7cc0852c576903c3 |     0 |              0 |   2 |      230 |    4 |       2 |
|   2702 | 6f289c570dbc2f520f9e14580193286d |     0 |              0 |   2 |        0 |    9 |       2 |
|   2703 | 0b3c25d11c2d3a064828cb009af9a52a |     0 |              0 |   2 |        0 |    9 |      19 |
|   2704 | b4805e1d91c17cebfced4a1238361565 |     0 |              0 |   2 |        0 |    9 |       2 | /
|   2705 | 513e762772545c90a963592befd3794e |     0 |              0 |   2 |        0 |    9 |       2 | /
|   2706 | bf7ba751e6d18545aa83f431aba1b954 |     0 |              0 |   2 |        0 |    9 |      19 |
|   2707 | c8581ace8e0baa050e9d2b08c78e5f9e |     0 |              0 |   2 |        0 |    1 |       0 | / |sum:auto:current:2:auto|
|   2708 | 21318eb12fc4cba40a45189ff780c20f |     0 |              0 |   2 |      294 |    2 |       0 | 95th:
|   2709 | 4d060081a94bcc255ad7ab6e39437b01 |     0 |              0 |   2 |        0 |    1 |       0 | |95:bits:6:current:2|
|   2710 | 085fa2494bdc026044cdb8e2b6028373 |     0 |              0 |   3 |       20 |    7 |       2 | Outbound
|   2711 | 82f66857c148cb7a402a767da5468e50 |     0 |              0 |   3 |       20 |    4 |       2 |
|   2712 | b74e2b209cdefa5d4c885562f3a83839 |     0 |              0 |   3 |        0 |    9 |       2 |
|   2713 | a3b6c385537f1c081dc2c18fbf17346f |     0 |              0 |   3 |        0 |    9 |      19 |
|   2714 | 63726654b5adaf6fc6ecbeb977360387 |     0 |              0 |   3 |        0 |    9 |       2 | /
|   2715 | c5bf123b7498ff5b5ec5c1e2ea2d5826 |     0 |              0 |   3 |        0 |    9 |       2 | /
|   2716 | c51c93e1576ea624f3b140aa7e48a108 |     0 |              0 |   3 |        0 |    9 |      19 |
|   2717 | 83f2b008f1f1e4e8b122ca97936f60d1 |     0 |              0 |   3 |        0 |    1 |       0 | / |sum:auto:current:2:auto|
|   2718 | ca5e015c301f59219f96fea77e7f9059 |     0 |              0 |   3 |      348 |    2 |       0 | 95th:
|   2719 | e77c3b2b70d02e1fb466985baf203abd |     0 |              0 |   3 |        0 |    1 |       0 | |95:bits:6:current:2|
|   2720 | 2fd9c66cb971b1f5531e88c1ae2ae1f6 |     0 |              0 |   3 |        0 |   40 |       0 |
|   2721 | a7c369c02c27084c237f7a83efbf2248 |     0 |              0 |   0 |        0 |    1 |       0 | Speed (Mbit/s): |query_ifHighSpeed|
| 480834 | fe578c5e6f3b1720f808eb865e84cdf5 |     0 |              0 |   2 |      442 |    7 |      30 | no data
| 588071 | fa4e0224302ca97022d4b1ca7dd627d4 |     0 |              0 |   2 |        0 |    1 |       0 | Interface Alias: |query_ifAlias|
+--------+----------------------------------+-------+----------------+-----+----------+------+---------+-----------------------------------------------------

and yes there is one more column seq id from 1 through 25 .

TheWitness commented 3 years ago

Can you edit that and upload similar to what you did before so I can read it?

gj00354347 commented 3 years ago

editing done let me know whether info is ok ?

TheWitness commented 3 years ago

Can you revise and add ORDER BY sequence for me. I edited your submission to add the formatting.

TheWitness commented 3 years ago

One more query to run:

SELECT id, local_graph_template_item_id LGTII, 
local_graph_id, task_item_id TII, sequence 
FROM graph_templates_item 
WHERE local_graph_id = ?
AND task_item_id = 0
AND CONCAT(id,',',local_graph_template_item_id) IN(
   SELECT id FROM (
      SELECT CONCAT(id, ',', local_graph_template_item_id) AS id, COUNT(*) AS totals 
      FROM graph_templates_item 
      WHERE local_graph_id = ?
      GROUP BY local_graph_template_item_id
      ORDER BY task_item_id
      HAVING totals = 2
   )
)
ORDER BY sequence
TheWitness commented 3 years ago

Those "should" be the ones to remove, but need you to verify. Still tweaking that query. Might change a few times.

TheWitness commented 3 years ago

Okay, made a minor but important change.

TheWitness commented 3 years ago

One more change 👍

gj00354347 commented 3 years ago

Can you revise and add ORDER BY sequence for me. I edited your submission to add the formatting.

now it's sorted , in fact cacti db gives the output in sorted format only , but there is repetition in seq number ,so first row is seq 1 and second row is also seq 1 likewise third row is seq no 2 and fourth row is also seq number 2 , and last two rows are seq number 25 .

gj00354347 commented 3 years ago
SELECT id, local_graph_template_item_id LGTII, 
local_graph_id, task_item_id TII, sequence 
FROM graph_templates_item 
WHERE local_graph_id = ?
AND task_item_id = 0
AND CONCAT(id,',',local_graph_template_item_id) IN(
   SELECT id FROM (
      SELECT CONCAT(id, ',', local_graph_template_item_id) AS id, COUNT(*) AS totals 
      FROM graph_templates_item 
      WHERE local_graph_id = ?
      GROUP BY local_graph_template_item_id
      ORDER BY task_item_id
      HAVING totals = 2
   )
)
ORDER BY sequence

this was giving some error so changed the place of ORDER BY after HAVING

MariaDB [cacti]> SELECT id, local_graph_template_item_id LGTII, local_graph_id, task_item_id TII, sequence FROM graph_templates_item WHERE local_graph_id = xxxxxx AND task_item_id = 0 AND CONCAT(id,',',local_graph_template_item_id) IN(SELECT id FROM (SELECT CONCAT(id, ',', local_graph_template_item_id) AS id, COUNT(*) AS totals FROM graph_templates_item WHERE local_graph_id = xxxxxx GROUP BY local_graph_template_item_id ORDER BY task_item_id HAVING totals = 2 )) ORDER BY sequence; ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'HAVING totals = 2 )) ORDER BY sequence' at line 1

then changed the query and put this but again it has some error

MariaDB [cacti]> SELECT id, local_graph_template_item_id LGTII, local_graph_id, task_item_id TII, sequence FROM graph_templates_item WHERE local_graph_id = xxxxxx AND task_item_id = 0 AND CONCAT(id,',',local_graph_template_item_id) IN(SELECT id FROM (SELECT CONCAT(id, ',', local_graph_template_item_id) AS id, COUNT(*) AS totals FROM graph_templates_item WHERE local_graph_id = xxxxxx GROUP BY local_graph_template_item_id HAVING totals = 2 ORDER BY task_item_id )) ORDER BY sequence; ERROR 1248 (42000): Every derived table must have its own alias

TheWitness commented 3 years ago
SELECT id, local_graph_template_item_id LGTII, 
local_graph_id, task_item_id TII, sequence 
FROM graph_templates_item 
WHERE local_graph_id = ?
AND task_item_id = 0
AND CONCAT(id,',',local_graph_template_item_id) IN(
   SELECT id FROM (
      SELECT CONCAT(id, ',', local_graph_template_item_id) AS id, COUNT(*) AS totals 
      FROM graph_templates_item 
      WHERE local_graph_id = ?
      GROUP BY local_graph_template_item_id
      HAVING totals = 2
      ORDER BY task_item_id
   ) AS rs
)
ORDER BY sequence
gj00354347 commented 3 years ago
SELECT id, local_graph_template_item_id LGTII, 
local_graph_id, task_item_id TII, sequence 
FROM graph_templates_item 
WHERE local_graph_id = ?
AND task_item_id = 0
AND CONCAT(id,',',local_graph_template_item_id) IN(
   SELECT id FROM (
      SELECT CONCAT(id, ',', local_graph_template_item_id) AS id, COUNT(*) AS totals 
      FROM graph_templates_item 
      WHERE local_graph_id = ?
      GROUP BY local_graph_template_item_id
      HAVING totals = 2
      ORDER BY task_item_id
   ) AS rs
)
ORDER BY sequence

here is the output , yes in place of ? i used the graph-id of affected graphs (the one which you see at end of url when you wre viewing a graph)

| id       | LGTII | local_graph_id | TII | sequence |
+----------+-------+----------------+-----+----------+
| 10145530 |  2699 |         549991 |   0 |        1 |
| 10145535 |  2700 |         549991 |   0 |        2 |
| 10145542 |  2701 |         549991 |   0 |        3 |
| 10145550 |  2702 |         549991 |   0 |        4 |
| 10145558 |  2703 |         549991 |   0 |        5 |
| 10145566 |  2704 |         549991 |   0 |        6 |
| 10145574 |  2705 |         549991 |   0 |        7 |
| 10145588 |  2706 |         549991 |   0 |        8 |
| 10145597 |  2707 |         549991 |   0 |        9 |
| 10145607 |  2708 |         549991 |   0 |       10 |
| 10145617 |  2709 |         549991 |   0 |       11 |
| 10145629 |  2710 |         549991 |   0 |       12 |
| 10145638 |  2711 |         549991 |   0 |       13 |
| 10145648 |  2712 |         549991 |   0 |       14 |
| 10145656 |  2713 |         549991 |   0 |       15 |
| 10145664 |  2714 |         549991 |   0 |       16 |
| 10145673 |  2715 |         549991 |   0 |       17 |
| 10145683 |  2716 |         549991 |   0 |       18 |
| 10145694 |  2717 |         549991 |   0 |       19 |
| 10145707 |  2718 |         549991 |   0 |       20 |
| 10145721 |  2719 |         549991 |   0 |       21 |
| 10145737 |  2720 |         549991 |   0 |       23 |
| 10145752 |  2721 |         549991 |   0 |       24 |
+----------+-------+----------------+-----+----------+
23 rows in set (0.00 sec)
gj00354347 commented 3 years ago

yes one more question , I Know it's not the right approach like can we delete those extra items from the graphs , i know they can come again later if the code is not fixed .

so just asking like can we do it in some safe way , like firing some sql query and deleting those extra items and ensuring that this does not affect the other parts of cacti .

TheWitness commented 3 years ago

Well, make sure all automation is disabled, then make a backup copy of your graph_templates_item table as follows:

mysqldump cacti graph_templates_item > emergency_backup.sql

Then, run this SQL:

DELETE 
FROM graph_templates_item 
WHERE local_graph_id = ?
AND task_item_id = 0
AND CONCAT(id,',',local_graph_template_item_id) IN(
   SELECT id FROM (
      SELECT CONCAT(id, ',', local_graph_template_item_id) AS id, COUNT(*) AS totals 
      FROM graph_templates_item 
      WHERE local_graph_id = ?
      GROUP BY local_graph_template_item_id
      HAVING totals = 2
      ORDER BY task_item_id
   ) AS rs
)

Then, let me know if the graph still works. If it's broken, just reverse the process:

mysql cacti < emergency_backup.sql
TheWitness commented 3 years ago

Reverse the process even if it's not broken as we want to continue to use this as our test case.

TheWitness commented 3 years ago

Please follow my instructions. You still have more to do, it would be nice to get it done this weekend.

gj00354347 commented 3 years ago

Thanks Larry for the delete statements and backup procedures.

but as you suggested that we would have to revert the changes as this particular case has to serve as a test case . So I would prefer not to do the backup and delete procedures as we have to revert again , no matter what is result. And yes it’s production box so we have to follow the rules of Change management as well like raising and CR and getting it approved by Change Approval Board . So if you do not mind I would like to skip this step .

Thank you Gopal

gj00354347 commented 3 years ago

Hi Larry,

can we schedule a meeting today or may be tomorrow and I can come in screen sharing mode with you and you can debug and reproduce the issue as it has become a blocker for our clean up task . Or May be I can give you a whole dump after cleaning all ups and snmp comm string and other sensitive information. Please let me know your thoughts.

TheWitness commented 3 years ago

Once you have a working test environment with the issue present, I'll spend some time. In the mean time, there is nothing for me to do here.

gj00354347 commented 3 years ago

Hi Larry ,

I tried to sync my prod with test . so I took the backup of whole db

step 1. mysqldump -u username dbname-prod > prod-dump.sql

and then imported the whole into test db .

step 2 . cat prod-dump.sql |mysql -u username -p -D dbname-Test

but the problem I am facing here is that when db from prod to test is ported , it ports even the information about pollers and devices and their association with pollers which are actually from prod .

and the poller info of test gets overwritten with that of prod . SO the first thing I did after porting the prod db to test is to disable all the devices in test to avoid being polled from other devices .

step 3 . update host set disabled='on';

now I wanted that all the devices which were reachable from Test , they should be enabled and polled from test env servers . so I inserted poller info in poller table .

step 4. insert into poller (id,name) values (random-number ,testserver);

then transfer all the devices to testerver1 poller .

step 5. update host set poller_id=random-number;

then enable those devices in test which were up before this migration .

step 6 .update host set disabled='' where id in (list if ids separated by comma);

But the challenge here is that , testserver1 is added as remote poller not as main poller .so how to make it main poller .

I tried some other methods like .

I did the installation on test from fresh assuming that db will remain same and testserver1 will be registered as main server (as it happens in case of upgrade)but that did not work and broke all connection in db and i had to repeat the db restoration to cacti Test db.

can you please help me here in bringing my prod in sync with test .

many thanks in advance gopal

TheWitness commented 3 years ago

On your test system, you don't even need to enable the poller, we just need to have a snapshot of the database and the RRDfiles copied over for the testing.

I'll open a block of time on Friday if you have things ready to go. Just don't enable the poller and this will be easy.