thegooglecodearchive / mycheckpoint

Automatically exported from code.google.com/p/mycheckpoint

Multiple issues deploying remote monitoring #16

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I'm having issues getting remote monitoring to work in either r174 or r190.

I'm trying a basic setup: two production databases and one monitoring database 
that would store the monitoring data for each of them.

Neither use case 1 nor use case 2 of the manual worked for me (from monitored to 
monitoring, or from monitoring to monitored).

Use case 1: from the production database (sql1) to the monitoring database 
(sql-monitor).

[root@sql1 mycheckpoint-190]# mycheckpoint --host=sql-monitor --user=dbmonitor 
--password=mypassword --port=3306 --monitored-host=localhost 
--monitored-socket=/var/lib/mysql/mysql.sock --database=mycheckpoint_sql1 
--skip-defaults -v --debug
-- mycheckpoint rev 190, build 201009020925. Copyright (c) 2009-2010 by Shlomi 
Noach
-- database is mycheckpoint_sql1
-- monitored host is: localhost
-- monitored host credentials undefined; using write host credentials
-- Global status & variables recorded
-- Master and slave status recorded
-- OS CPU info recorded
-- OS load average info recorded
-- OS mem info recorded
-- OS mountpoints info recorded
-- OS page io activity recorded
-- New entry added
-- Collecting custom data
(1146, "Table 'mycheckpoint_sql1.custom_query' doesn't exist")
Traceback (most recent call last):
  File "/usr/bin/mycheckpoint", line 4407, in ?
    collect_custom_data()
  File "/usr/bin/mycheckpoint", line 1291, in collect_custom_data
    for custom_query in get_rows(query):
  File "/usr/bin/mycheckpoint", line 306, in get_rows
    cursor.execute(query)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/cursors.py", line 163, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
ProgrammingError: (1146, "Table 'mycheckpoint_sql1.custom_query' doesn't exist")
--
-- Make sure you have executed mycheckpoint with 'deploy' after last 
install/update.upgrade
--  If not, run again with same configuration, and add 'deploy'. e.g.:
--  mycheckpoint --host=my_host deploy

If I check, the table does exist on sql-monitor, though it is empty:
mysql> select * from mycheckpoint_sql1.custom_query;
Empty set (0.00 sec)

With --skip-custom, I get the same error on a different table:
ProgrammingError: (1146, "Table 'mycheckpoint_sql1.alert_condition' doesn't 
exist")

From my point of view, it looks like mycheckpoint is running 
collect_custom_data() against localhost, where the table does not exist.

Eventually, it worked with both the --skip-custom and --skip-alerts parameters.

I also had some issues while the production host used r174 and the schema was 
r190. Maybe a warning would be in order, saying that the client version is 
older than the server's (I don't know if you store the version number somewhere 
in the database, but it would definitely be a plus!)

Thanks in advance, because I'd really like to have the alert feature back up!

G.

Original issue reported on code.google.com by dragonsc...@gmail.com on 8 Sep 2010 at 3:33

GoogleCodeExporter commented 9 years ago
Will look into this over the next few days.
Just as a quick note: the revision *is* stored in the database, in the 
"metadata" table, and is checked each time mycheckpoint is invoked.
In fact, mycheckpoint upgrades by comparing the revision of the script with 
the revision of the schema, then upgrading the schema.
Will look more into that as well.
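
As an illustration of the check described above, here is a minimal sketch of comparing a script revision with a revision stored in a "metadata" table. It is not mycheckpoint's code: the `revision` column name, the `check_revision` helper, and the warning for an older client are assumptions, and sqlite3 stands in for MySQL only so the example runs anywhere.

```python
import sqlite3


def check_revision(conn, script_revision):
    """Compare the script revision with the revision stored in the schema.

    Hypothetical helper: the 'revision' column name is an assumption.
    """
    row = conn.execute("SELECT revision FROM metadata").fetchone()
    schema_revision = row[0]
    if script_revision < schema_revision:
        # The case the reporter asked about: an older client against a newer schema.
        return "warn: script r%d is older than schema r%d" % (
            script_revision, schema_revision)
    if script_revision > schema_revision:
        # A newer script would upgrade the schema at this point.
        return "upgrade: schema r%d -> r%d" % (schema_revision, script_revision)
    return "ok"


# Stand-in for the monitoring database, with the schema at r190.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata (revision INTEGER)")
conn.execute("INSERT INTO metadata VALUES (190)")
```

With the schema at r190, calling `check_revision(conn, 174)` takes the warning branch, which is the situation described in the original report.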

Original comment by shlomi.n...@gmail.com on 9 Sep 2010 at 3:36

GoogleCodeExporter commented 9 years ago
Hi

First of all: good work! I just tried this application out for the first time 
and found it really useful.

I think the issue here is that the monitoring connection is used when the write 
connection should be used (since these lookups are made to tables in the 
mycheckpoint data repos and not on the monitored server).

Without having looked too closely at this, I suggest the following solution:
 - On line 1292, pass write_conn explicitly to get_rows
   1292     for custom_query in get_rows(query, write_conn):
 - On line 1788, same thing
   1788     row = get_row(query, write_conn)
Otherwise, get_rows falls back to monitored_conn:
    302 def get_rows(query, connection=None):
    303     if connection is None:
    304         connection = monitored_conn
...

After these changes it worked for me, but maybe this breaks something else...
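
To make the failure mode concrete, here is a small self-contained sketch of the default-connection behaviour described above. It is an illustration only, not mycheckpoint's code: sqlite3 in-memory databases stand in for the two MySQL servers, and the `id` and `query_text` columns are hypothetical.

```python
import sqlite3

# Stand-ins for the two MySQL connections (assumption: sqlite3 is used only
# so that the sketch runs without a server).
monitored_conn = sqlite3.connect(":memory:")  # the monitored production host
write_conn = sqlite3.connect(":memory:")      # the host holding the mycheckpoint schema

# Only the write host has the mycheckpoint tables (hypothetical columns).
write_conn.execute("CREATE TABLE custom_query (id INTEGER, query_text TEXT)")
write_conn.execute("INSERT INTO custom_query VALUES (1, 'SELECT 1')")


def get_rows(query, connection=None):
    # Same default as in the excerpt: no explicit connection means the
    # monitored host is queried.
    if connection is None:
        connection = monitored_conn
    return connection.execute(query).fetchall()


query = "SELECT id, query_text FROM custom_query"

try:
    get_rows(query)  # runs on the monitored host, where the table is missing
    default_error = None
except sqlite3.OperationalError as exc:
    default_error = str(exc)

# Passing the write connection explicitly finds the table.
rows = get_rows(query, write_conn)
```

The first call fails with a missing-table error, analogous to the 1146 error in the report, while the second call succeeds because the lookup goes to the connection that actually holds the schema.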

Kindest regards
Olle

Original comment by noj.nils...@gmail.com on 9 Sep 2010 at 8:15

GoogleCodeExporter commented 9 years ago
@Olle,

Your diagnosis was accurate, except that "1788" should be "1690" (but that's 
just a typo :) )

Original comment by shlomi.n...@gmail.com on 14 Sep 2010 at 8:54

GoogleCodeExporter commented 9 years ago
Fixed in revision 192 (not yet released)

Original comment by shlomi.n...@gmail.com on 14 Sep 2010 at 8:56

GoogleCodeExporter commented 9 years ago
Revision 192 released. I'd appreciate your feedback

Original comment by shlomi.n...@gmail.com on 14 Sep 2010 at 8:58