Closed absz1980 closed 4 years ago
Yes, you need the zabbix agent to read the log.xml file. Python needed on dbhost for this.
It sends the discovery list of the found log.xml to the zabbix proxy or server of your choice.
Did you get this working?
Hello, Ronald. I am just installing zabbix agent on host.
What version python do I install on monitoring host?
Do I have to install python on behalf of zabbix user?
Can I use pyenv?
Hi, yes, you can use pyenv. The version of python is not very specific but I would just pick a recent v3. It might be smartest to make the script run by a user who has access to all instances and the v$diag_info view. OS authentication is what I would use for this.
Hello, Ronald!
zbx_alertlog.py[-h][-o ORATAB][-s SERVERNAME][-p PORT] - H HOSTNAME - k KEY
ORATAB - /etc/oratab SERVERNAME,PORT - Is it zabbix server host,port? HOSTNAME - It is monitoring hostname in zabbix? KEY - What is it? Do I need to install zabbix-sender? Do I need add zbx_alertlog.py to crontab?
Good morning,
ORATAB - /etc/oratab SERVERNAME, PORT - Zabbix server/proxy HOSTNAME - the hostname that is going to monitor the log.xml files (the database server (add template zbx_alertlog)) KEY - the key of the discovery rule. If you use the template zbx_alertlog then it is oracle.alertlog.lld
Yes, you need zabbix-sender Need to be in the crontab is not a rule but if you have some dynamics in your environment like adding/removing database instances it would be smart to put it in the crontab.
Hello, I run command and retrieve error
[oracle]$ /home/oracle/zbxdb/bin/zbx_alertlog.py -o /etc/oratab -s sysmon -p 10051 -H MSK01-AISDB02 -k oracle.alertlog.lld
Namespace(hostname='MSK01-AISDB02', key='oracle.alertlog.lld', oratab='/etc/oratab', port='10051', servername='sysmon.uralsibins.ru')
Linux
new sid AIS
Traceback (most recent call last):
File "/home/oracle/zbxdb/bin/zbx_alertlog.py", line 172, in
SQL> show parameter name
NAME TYPE VALUE
cell_offloadgroup_name string db_file_name_convert string db_name string AISREPRT db_unique_name string AISREPRT global_names boolean FALSE instance_name string AIS_REPORT lock_name_space string log_file_name_convert string /oracle/oradata/ais/redo, /ora cle/oradata/FRA/AIS_REPORT/onl inelog processor_group_name string service_names string ais_report SQL>
The problem was that you have a "_" in your ORACLE_SID. This is fixed now.
Hello, Ronald!
[oracle@ zbxdb]$ git pull remote: Enumerating objects: 3, done. remote: Counting objects: 100% (3/3), done. remote: Total 3 (delta 1), reused 3 (delta 1), pack-reused 0 Unpacking objects: 100% (3/3), done. From https://github.com/ikzelf/zbxdb a1256b5..37d5edb master -> origin/master Updating a1256b5..37d5edb Fast-forward zbx_alertlog.lld | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) create mode 100644 zbx_alertlog.lld
[oracle@ zbxdb]$ /home/oracle/zbxdb/bin/zbx_alertlog.py -o /etc/oratab -s sysmon -p 10051 -H MSK01-AISDB02 -k oracle.alertlog.lld
Namespace(hostname='MSK01-AISDB02', key='oracle.alertlog.lld', oratab='/etc/oratab', port='10051', servername='sysmon.uralsibins.ru')
Linux
new sid AIS
Traceback (most recent call last):
File "/home/oracle/zbxdb/bin/zbx_alertlog.py", line 172, in
Weird, the changes I made -and tested- are no longer here. Just recovered them. Sorry for that.
Hi, Ronald!
The error was not fixed.
[oracle zbxdb]$ git pull
remote: Enumerating objects: 4, done.
remote: Counting objects: 100% (4/4), done.
remote: Total 4 (delta 3), reused 4 (delta 3), pack-reused 0
Unpacking objects: 100% (4/4), done.
From https://github.com/ikzelf/zbxdb
37d5edb..3ae2b00 master -> origin/master
Updating 37d5edb..3ae2b00
Fast-forward
bin/zbx_alertlog.py | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
[oracle zbxdb]$ /home/oracle/zbxdb/bin/zbx_alertlog.py -o /etc/oratab -s sysmon -p 10051 -H MSK01-AISDB02 -k oracle.alertlog.lld
Namespace(hostname='MSK01-AISDB02', key='oracle.alertlog.lld', oratab='/etc/oratab', port='10051', servername='sysmon')
Linux
new sid AIS
Traceback (most recent call last):
File "/home/oracle/zbxdb/bin/zbx_alertlog.py", line 176, in
Something funny happens when I move my code to my git environment. I hope it is better now. What should happen is something like ps -ef|grep ora_ and pick the 3rd part of the name (the part after orasmon).... Match that name with the ORACLE_SID in de oratab and grab the ORACLE_HOME to use for the connection to the instance.
What's it next?
I see nothing in latest data in zabbix server?
[oracle zbxdb]$ /home/oracle/zbxdb/bin/zbx_alertlog.py -o /etc/oratab -s sysmon -p 10051 -H MSK01-AISDB02 -k oracle.alertlog.lld Namespace(hostname='MSK01-AISDB02', key='oracle.alertlog.lld', oratab='/etc/oratab', port='10051', servername='sysmon') Linux new sid AIS_REPORT sid:AIS_REPORT oh:/oracle/product/11.2.0.4 {"data":[{"{#INSTANCE_NAME}": "AIS_REPORT", "{#ALERTLOG}": "/oracle/diag/rdbms/aisreprt/AIS_REPORT/alert/log.xml"}]} zabbix_sender [5254]: DEBUG: answer [{"response":"success","info":"processed: 0; failed: 1; total: 1; seconds spent: 0.000129"}] Response from "sysmon": "processed: 0; failed: 1; total: 1; seconds spent: 0.000129" sent: 1; skipped: 0; total: 1
Does your zabbix host "MSK01-AISDB02" have the alertlog template attached to it? If so, is there an error in that discovery rule (oracle.alertlog.lld)? The discovery array looks good to me, if this machine indeed has only one running instance on it.
I deployed zbx_template_alertlog.xml and attached on MSK01-AISDB02. I run /home/oracle/zbxdb/bin/zbx_alertlog.py -o /etc/oratab -s sysmon -p 10051 -H MSK01-AISDB02 -k oracle.alertlog.lld many times. And I see in Latest Data nothing. I seem don`t understand something.
Ah, OK. Did you see this: zabbix_sender [5254]: DEBUG: answer [{"response":"success","info":"processed: 0; failed: 1; total: 1; seconds spent: 0.000129"}] This means 1 item is sent and that one item also failed. Did you have a run with processed: 1 ?
If you did get a discovery processed, it should have generated alertlog items on your host, or an error message visible in configuration/host/discovery rules.
I have
ah, I forgot about the type of the oracle.alertlog.lld key. Change that to trapper. Initially I had it setup as a user parameter but that did not work so nice with Oracle's ideas about security. (what do you get when you hover above the [I] ?)
I changed to Zabbix trapper and ran /home/oracle/zbxdb/bin/zbx_alertlog.py -o /etc/oratab -s sysmon -p 10051 -H MSK01-AISDB02 -k oracle.alertlog.lld many times. and
Initially I had it setup as a user parameter but that did not work so nice with Oracle's ideas about security. (what do you get when you hover above the [I] ?) Unsupported item key
Ok. Normally the sender should return without errors. This should be done once, or how often you want. This merely declares the alertlog items.
Those generated items will be of type zabbix active agent. The zabbix agent will begin reading the log.xml after that.
Do you have a zabbix proxy? If so, run a proxy configuration update before discovering the alertlogs again(or just wait an hour)
Do you have a zabbix proxy?
I don`t have zabbix proxy.
How do I check correct setting? Can I configure with zabbix agent or crontab only?
What bugs me is your last image, showing active agent for oracle.alertlog.lld. This should be changed to zabbix trapper in the template. Then unlink and clear the template and link it again. This should leave the discovery key in a valid state. Next run the discovery script. Once. Check the output (hoping for processed: 1) If failed: check the error in the discovery rule.
If no error, you should have your 5 items and 12 triggers.
Hi, Ronald! Thanks. It`s work!
Linux new sid AIS_REPORT sid:AIS_REPORT oh:/oracle/product/11.2.0.4 {"data":[{"{#INSTANCE_NAME}": "AIS_REPORT", "{#ALERTLOG}": "/oracle/diag/rdbms/aisreprt/AIS_REPORT/alert/log.xml"}]} zabbix_sender [5118]: DEBUG: answer [{"response":"success","info":"processed: 1; failed: 0; total: 1; seconds spent: 0.000180"}] Response from "sysmon.uralsibins.ru:10051": "processed: 1; failed: 0; total: 1; seconds spent: 0.000180" sent: 1; skipped: 0; total: 1
Should I add crontab or configure zabbix agent somehow?
Glad that it is working now. By now your log item[s] will be created and the agent started reading the log.xml file, if it can access it. You will need to fiddle with access rights to make the log.xml readable by the agent. So you might want to make the agent member of the group that owns the log.xml file and make sure that this file is readable by the group.
If your environment has some dynamics in it, it would be wise to put the zbx_alertlog.py in the oracle crontab to make sure that new -or removed- instances are recognised.
Check config/host/ oracle application in zabbix for problems with the log item. If no problems exist, check latest data.
I check error and alert in is not visible
[oracle~]$ cat test-alert.log msg time='2020-07-09T02:00:00.010+03:00' org_id='oracle' comp_id='rdbms' client_id='' type='UNKNOWN' level='16' host_id='MSK01-AISDB02' host_addr='10.175.33.122' module='' pid='61518'>
/msg [oracle@ ~]$ cat test-alert.log >> /oracle/diag/rdbms/aisreprt/AIS_REPORT/alert/log.xml [oracle@ ~]$ cat test-alert.log >> /oracle/diag/rdbms/aisreprt/AIS_REPORT/alert/log.xml [oracle@ ~]$ /home/oracle/zbxdb/bin/zbx_alertlog.py -o /etc/oratab -s sysmon -p 10051 -H MSK01-AISDB02 -k oracle.alertlog.lld Namespace(hostname='MSK01-AISDB02', key='oracle.alertlog.lld', oratab='/etc/oratab', port='10051', servername='sysmon') Linux new sid AIS_REPORT sid:AIS_REPORT oh:/oracle/product/11.2.0.4 {"data":[{"{#INSTANCE_NAME}": "AIS_REPORT", "{#ALERTLOG}": "/oracle/diag/rdbms/aisreprt/AIS_REPORT/alert/log.xml"}]} zabbix_sender [7970]: DEBUG: answer [{"response":"success","info":"processed: 1; failed: 0; total: 1; seconds spent: 0.000088"}] Response from "sysmon.uralsibins.ru:10051": "processed: 1; failed: 0; total: 1; seconds spent: 0.000088" sent: 1; skipped: 0; total: 1 [oracle@ ~]$
Check your alertlog item. I suspect that it is invalid because the agent can not read the log.xml. If it is valid: check latest data and see the lines popping up.
Should I add zbx_alertlog.py to crontab or configure zabbix agent somehow?
you can put zbx_alertlog.py in the oracle crontab, depending on how often the number of databases changes on your server[s] 1 or more times / day. The zbx_alertlog.py only plays a part in the low level discovery of the log.xml files on a host. It does not do anything with the contents of the log.xml.
For that, log items are generated in your host under de application "oracle". Look for unsupported items and check the reason for being unsupported.
You need to configure the log.xml and the path to it such that the agent (running as zabbix user) can read it.
How can I configure the agent? Is it an example?
If the agent is running, it is running. Nothing to do there. What is the status of the generated items on your host? Are the valid and enabled?
Let me see if I understand this. Agent is running and enabled. Nothing to do there.
The status of the generated items on your host
Should I add to crontab zbx_alertlog.py 1 time per 10 min only? How can I check(simulate) triggered problems?
What is in your log.xml? Your most important item is this regard is AIS_REPORT[alertlog] The lines in your /oracle/diag/rdbms/aisreprt/AIS_REPORT/alert/log.xml that match have matches with (ORA-|time=|ERROR|Dead) will be uploaded to zabbix in the AIS_REPORT[alertlog] item.
Those lines will be visible in latest data.
If you expect that new database instances are introduces multiple times per day, it would make sense to put zbx_alertlog.py in the crontab, for a few times per day. zbx_alertlog.py does not read the log.xml, it only is the tool to discover which log.xml's exist on your host. The active agent is reading the log.xml and is doing so every minute.
I got it.
I add to crontab
Cut test alert,
How can I create a problem if agent don't receive the latest data after a while? How can I close the problem? Is it manual close only?
I would change this "$HOME/log/zbxdb_starter.cron 2>&1" to "$HOME/log/zbx_alertlog.cron 2>&1"
The idea is that the events auto close, after a while. That is why the lines with "time=" are for. This makes sure that every now and then new lines arrive and zabbix can see/reevaluate the age of the event and close if it is older. As long as the error keeps repeating in de log.xml, the event will remain open.
The agent nodata event is a default in the active agent templates. There is no need to set this up for the alertlog monitoring.
I would change this "$HOME/log/zbxdb_starter.cron 2>&1" to "$HOME/log/zbx_alertlog.cron 2>&1" Thanks.
I wait more 3 hours. The event hasn`t been auto close yet.
Hello, Ronald After 2day I see on one host
I think the agent doesn`t permission on log.xml file but I check. The permissions are the same as other hosts.
If zbx_alertlog.py runs, it does set the access permissions to 744, world and group readable. Is your logitem is valid and enabled, or is it not supported? If it is not supported, hoovering above the status flag will show the reason.
An other option is that your database is very quiet. What are the last few lines in the log.xml? Normally, all my databases are running archivelog mode and I make sure they archive 4 times in an hour, at least. Doing so causes lox.xml entries that are read. Problem is that if a logitem does not receive data, triggers with a time in it can not always evaluate their status. If this is the case, a simple 'alter system switch logfile;' should solve this. If this helps, make sure this happens about every 15m.
Did you get this up and running now?
[root@MSK01-AISDB01 ~]# su - zabbix su: warning: cannot change directory to /var/lib/zabbix: No such file or directory -bash-4.1$ ls -l /oracle/diag/rdbms/ais/ais/alert/log log_1.xml log.xml -bash-4.1$ ls -l /oracle/diag/rdbms/ais/ais/alert/log.xml -rwxr--r-- 1 oracle oinstall 7608774 Jul 23 14:15 /oracle/diag/rdbms/ais/ais/alert/log.xml -bash-4.1$ tail -f /oracle/diag/rdbms/ais/ais/alert/log.xml
<msg time='2020-07-23T14:15:56.448+03:00' org_id='oracle' comp_id='rdbms' client_id='' type='UNKNOWN' level='16' host_id='MSK01-AISDB01' host_addr='10.175.32.235' module='' pid='2288'>
^C
[oracle@MSK01-AISDB01:ais ~]$ . $HOME/.bash_profile;/home/oracle/zbxdb/bin/zbx_alertlog.py -o /etc/oratab -s sysmon.uralsibins.ru -p 10051 -H MSK01-AISDB01 -k oracle.alertlog.lld > $HOME/log/zbxdb_alertlog.cron 2>&1 [oracle@MSK01-AISDB01:ais ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Thu Jul 23 14:35:55 2020
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> alter system switch logfile;
System altered.
SQL> alter system switch logfile;
And
The issue is resolved during a couple of minutes. Thank you.
Hello, Ronald. I want to configure zbx_alertlog.py. I don`t understand how to do it.
Do I need to install zabbix agent on DB host? What is it installed python version on DB host? Do I need to clone zbxdb projects on DB hosts?