ikzelf / zbxdb

Zabbix database monitoring, the easy and extendable way
GNU General Public License v3.0

zbxdb - connect no data #50

Closed rmdolezal closed 4 years ago

rmdolezal commented 4 years ago

Describe the bug: From time to time, some zbxdb connections to the Oracle database freeze, and the Zabbix monitor shows the message "connect no data". From the Zabbix proxy everything looks fine: the processes are running, but no data is received. The only workaround is to kill the affected process and wait for it to be started again.

PROBLEM srvzrorgDB01-SKOLMMCP zbxdb Connect nodata srvzrorgDB01-SKOLMMCP since 30m

To Reproduce

Steps to reproduce the behavior:

  1. zbxdb.srvzrorgDB01-SKOLMMCP.odb.cfg

[zbxdb]
db_url = //x.x.x.x:1522/SKOLMMCP
username =
password =
db_type = oracle
db_driver = cx_Oracle
instance_type=rdbms
role = normal
out_dir = $HOME/zbxora_out
hostname = srvzrorgDB01-SKOLMMCP
checks_dir = $HOME/etc/zbxdb_checks
site_checks=primary.12
password_enc =

  1. python-3.6.5 zbxdb-2.07

  2. oracle 12c

  3. Linux proxy-test-final 3.10.0-957.21.3.el7.x86_64 #1 SMP Tue Jun 18 16:35:19 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

  4. n/a

The zbxdb log contains only the following for all frozen processes, at similar times:

SERVER1
2020-08-07 15:34:04,023main20_connect 2 times, 0 fail; started 778180 queries, 0 fail memrss:28260 user:325.820750 sys:145.189229
2020-08-07 16:34:03,615main20_connect 2 times, 0 fail; started 778990 queries, 0 fail memrss:28260 user:326.171351 sys:145.336411
2020-08-07 17:34:03,862main20_connect 2 times, 0 fail; started 779794 queries, 0 fail memrss:28260 user:326.514880 sys:145.490424
2020-08-07 18:02:19,626main30_zbxdb cancel_sql checks_01m scn
2020-08-07 18:02:19,627main30_zbxdb canceled checks_01m scn

SERVER2
2020-08-07 16:27:31,140main20_connect 3 times, 0 fail; started 1298857 queries, 1 fail memrss:28184 user:567.861537 sys:257.641800
2020-08-07 17:27:31,232main20_connect 3 times, 0 fail; started 1299661 queries, 1 fail memrss:28184 user:568.213330 sys:257.804576
2020-08-07 18:06:44,144main30_zbxdb cancel_sql checks_05m u_ts
2020-08-07 18:06:44,145main30_zbxdb canceled checks_05m u_ts

SERVER3
2020-08-07 16:27:29,233main20_connect 2 times, 0 fail; started 1298930 queries, 0 fail memrss:28216 user:565.671102 sys:260.828943
2020-08-07 17:27:29,597main20_connect 2 times, 0 fail; started 1299734 queries, 0 fail memrss:28216 user:566.019291 sys:260.980762
2020-08-07 18:00:06,591main30_zbxdb cancel_sql checks_01m sysstat
2020-08-07 18:00:06,592main30_zbxdb canceled checks_01m sysstat

ikzelf commented 4 years ago

Hi, this is a nasty problem and for some reason it looks hard to solve. I tend to forget to update the version string. Do you happen to have the commit # available from when you last pulled zbxdb.py? What seems to happen is that the queries reach a timeout. For some queries I can imagine that (like u_ts), but getting the scn and sysstat should normally never reach a timeout, and on your system this does happen.

Normally in Oracle, after a cancel, the connection is freed and processing can continue... weird.

It also seems to happen at around the same time. Could this be related to some type of backup activity?
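To compare the cancellations with a backup window, the cancel_sql entries can be pulled out of the zbxdb logs and counted per hour. The following is a minimal sketch in Python, assuming the log line layout shown in the excerpts above. The names find_cancels, cancels_per_hour and SAMPLE_LOG are hypothetical and not part of zbxdb.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout, taken from the excerpts in this thread:
#   2020-08-07 18:02:19,626main30_zbxdb cancel_sql checks_01m scn
CANCEL_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\S*\s+"
    r"cancel_sql\s+(?P<checks>\S+)\s+(?P<query>\S+)"
)

# Sample lines copied from the three servers in the report.
SAMPLE_LOG = """\
2020-08-07 17:34:03,862main20_connect 2 times, 0 fail; started 779794 queries, 0 fail
2020-08-07 18:02:19,626main30_zbxdb cancel_sql checks_01m scn
2020-08-07 18:02:19,627main30_zbxdb canceled checks_01m scn
2020-08-07 18:06:44,144main30_zbxdb cancel_sql checks_05m u_ts
2020-08-07 18:00:06,591main30_zbxdb cancel_sql checks_01m sysstat
"""


def find_cancels(log_text):
    """Return (timestamp, checks_file, query_name) for every cancel_sql entry."""
    events = []
    for match in CANCEL_RE.finditer(log_text):
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
        events.append((ts, match.group("checks"), match.group("query")))
    return events


def cancels_per_hour(events):
    """Count cancel_sql events per clock hour to make clustering visible."""
    return Counter(ts.strftime("%Y-%m-%d %H:00") for ts, _, _ in events)


if __name__ == "__main__":
    for hour, count in sorted(cancels_per_hour(find_cancels(SAMPLE_LOG)).items()):
        print(hour, count)
```

If the cancellations from several hosts fall into the same hour day after day, that hour is the one to compare against the backup schedule.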

rmdolezal commented 4 years ago

Hi, I am not sure I understand correctly, but the version of zbxdb.py is 2.07; the same version is also in the log file. A relation with backups is possible; I must ask for the backup times. I will let you know. Thank you. Regards, Marek

ikzelf commented 4 years ago

Hi Marek, when did you download zbxdb.py? (I forgot to update the version many times so it could help to know when you downloaded it)

Thanks, Ronald.

rmdolezal commented 4 years ago

Aha ☺ ok

It was probably on April 16, 2020.

Marek

ikzelf commented 4 years ago

Hi Marek,

although I am not sure if it would fix your problem, I made some changes around that time that do report data, so a git pull certainly brings some improvements. What I wonder is, when the problem occurs:

  1. are the *.zbx files generated in your out_dir?
  2. is the database session still connected and if so, what state is it in?

Thanks, Ronald - who hopes to be able to fix this.
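The first question can be answered with a quick look at the modification times in out_dir. The following is a minimal sketch in Python, assuming only that the output files match *.zbx in out_dir as described above. The function names and the 300 second threshold are hypothetical and should be adapted to the check interval in use.

```python
import glob
import os
import time


def newest_zbx_age(out_dir, now=None):
    """Return the age in seconds of the newest *.zbx file, or None if there is none."""
    pattern = os.path.join(os.path.expandvars(out_dir), "*.zbx")
    mtimes = [os.path.getmtime(path) for path in glob.glob(pattern)]
    if not mtimes:
        return None
    now = time.time() if now is None else now
    return now - max(mtimes)


def looks_stalled(out_dir, max_age_seconds=300, now=None):
    """Treat a missing or old newest file as a sign that output has stopped."""
    age = newest_zbx_age(out_dir, now)
    return age is None or age > max_age_seconds


if __name__ == "__main__":
    # out_dir value taken from the configuration in the report.
    print(looks_stalled("$HOME/zbxora_out"))
```

Running this while Zabbix reports "connect no data" shows whether zbxdb.py has stopped writing files or whether the files are written but not delivered.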

rmdolezal commented 4 years ago

Hi Ronald,

It looks like no data was generated between the freeze and the restart, judging by the content of the arch files from 23 August, for example. I will try downloading the latest version of zbxdb.py.

Regarding point 2, I do not know exactly what you mean. When Zabbix reports "connect no data", I kill the session immediately, and the session is then automatically restarted by the starter.

Best regards

Marek

ikzelf commented 4 years ago

Hi Marek,

does zbxdb.py still have the Oracle connection when the problem occurs? (You can find this out by checking gv$session, or have a DBA do this for you.)
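One way to run that check is a small query against gv$session from a separate connection. The following is a minimal sketch in Python, assuming a DB-API connection such as one opened with cx_Oracle, an account that is allowed to read gv$session, and a monitoring user name stored in upper case. SESSION_SQL and monitoring_sessions are hypothetical names and not part of zbxdb.

```python
# Columns chosen to show whether the session exists and what it is waiting on.
SESSION_SQL = """
select inst_id, sid, serial#, status, state, event,
       seconds_in_wait, last_call_et, program
  from gv$session
 where username = :username
 order by inst_id, sid
"""


def monitoring_sessions(connection, username):
    """Return one dict per session owned by the given database user."""
    cursor = connection.cursor()
    try:
        # Assumption: the user was created unquoted, so Oracle stores it in upper case.
        cursor.execute(SESSION_SQL, {"username": username.upper()})
        columns = [col[0].lower() for col in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
    finally:
        cursor.close()


# Usage sketch (not executed here; user, password and dsn are placeholders):
#   import cx_Oracle
#   conn = cx_Oracle.connect(user, password, dsn)
#   for session in monitoring_sessions(conn, user):
#       print(session)
```

No rows at all would mean the connection is gone. A row with a long last_call_et and an unchanged wait event would suggest the session is still connected but stuck.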

rmdolezal commented 4 years ago

Hi Ronald,

I will have to wait and see whether this problem happens again. I have now installed the latest zbxdb.py update, so I will see if the problem reappears.

Thank you for your outstanding support.

Best regards

Marek

rmdolezal commented 4 years ago

Hi Ronald, so far the problem has not appeared again, so I hope that it is gone. Thank you again for your support.

Best regards

Marek