ikzelf / zbxora

Zabbix Oracle monitoring plugin - replaced by zbxdb

Issues installing and getting zbxdb to start #23

Closed gianri closed 3 years ago

gianri commented 3 years ago

Hi @ikzelf, I'm not very experienced with Linux or Zabbix, and I'm having issues setting up zbxdb. I followed the "getting_started.md" document without any apparent problems, but zbxdb doesn't work. When I manually run the first line added to the crontab (. /home/zbxdb/.bash_profile;$HOME/zbxdb/bin/zbxdb_starter > /home/zbxdb/log/zbxdb_starter.cron 2>&1) I get this error in the log file "zbxdb_starter.log":

Traceback (most recent call last):
  File "/home/zbxdb/zbxdb/bin/zbxdb.py", line 22, in <module>
    from cryptography.fernet import Fernet, InvalidToken
ModuleNotFoundError: No module named 'cryptography'

I installed zbxdb on a dedicated server with this environment:
OS: Ubuntu 20.04
$HOME = /home/zbxdb
$ZBXDB_HOME = /home/zbxdb

My existing paths:
/home/zbxdb
/home/zbxdb/etc/
/home/zbxdb/log/
/home/zbxdb/zbxdb_out/ (it's empty)
/home/zbxdb/zbxdb_sender/ (it's empty)

/home/zbxdb/zbxdb/
/home/zbxdb/zbxdb/etc/
/home/zbxdb/zbxdb/doc/

Running "pip list" from /home/zbxdb I get this:

zbxdb@zbxdb:~$ pip list
Package    Version
---------- -------
pip        20.2.3
setuptools 49.2.1

Running "pip list" from /home/zbxdb/zbxdb I get this:

(zbxdb-3.9.2) zbxdb@zbxdb:~/zbxdb$ pip list
Package            Version
------------------ ---------
certifi            2021.5.30
cffi               1.14.6
charset-normalizer 2.0.5
cryptography       3.4.8
cx-Oracle          8.2.1
hdbcli             2.9.28
ibm-db             3.0.4
idna               3.2
pip                20.2.3
psycopg2-binary    2.9.1
pycparser          2.20
PyMySQL            1.0.2
pyOpenSSL          20.0.1
pypsrp             0.5.0
pyspnego           0.1.6
python-tds         1.11.0
requests           2.26.0
setuptools         49.2.1
six                1.16.0
sqlparse           0.4.2
urllib3            1.26.6

Is there something wrong in my paths or what else?

Thank you very much in advance for your help.

Kind Regards

ikzelf commented 3 years ago

Hi, can you show the contents of your

  1. ~/.bash_profile
  2. ~/zbxdb/.python-version

and the results of "python -V" ?

gianri commented 3 years ago

Sure, here you are:

zbxdb@zbxdb:~$vi ~/.bash_profile
source .profile

zbxdb@zbxdb:~$vi ~/zbxdb/python-version
zbxdb-3.9.2

zbxdb@zbxdb:~$ python -V
Python 3.9.2

Attached you can see the content of my ~/.profile

Thank you very much for your quick reply.

.profile.txt

ikzelf commented 3 years ago

Most of this looks as I would expect ..... What does "type zbxdb.py" show? Is there anything in /home/zbxdb/log/zbxdb_starter.cron?

gianri commented 3 years ago

zbxdb@zbxdb:/$ type zbxdb.py
zbxdb.py is /home/zbxdb/zbxdb/bin/zbxdb.py

The content of /home/zbxdb/log/zbxdb_starter.cron is only this line:
mon etc/zbxdb.DB_EUSISASST.cfg 1

ikzelf commented 3 years ago

And "type python" gives /home/zbxdb/.pyenv/shims/python? If so, I see no problems, apart from a message that is telling you otherwise. What happens if you run this from the command line:
zbxdb/bin/zbxdb.py -c etc/zbxdb.DB_EUSISASST.cfg
(you can use -vvv to increase verbosity in logging)

gianri commented 3 years ago

Yes:

zbxdb@zbxdb:~/log$ type python
python is hashed (/home/zbxdb/.pyenv/shims/python)

zbxdb@zbxdb:~$ zbxdb/bin/zbxdb.py -c etc/zbxdb.DB_EUSISASST.cfg -vvv
Traceback (most recent call last):
  File "/home/zbxdb/zbxdb/bin/zbxdb.py", line 22, in <module>
    from cryptography.fernet import Fernet, InvalidToken
ModuleNotFoundError: No module named 'cryptography'

It seems it doesn't know module 'cryptography', but with the command "pip list" I can see it (I see it only if I run "pip list" from ~/zbxdb/, but not if I run "pip list" from ~/). For this reason I thought there was a problem in some path.

I can't see any more information in "/home/zbxdb/log" even with -vvv option.

ikzelf commented 3 years ago

Following https://stackoverflow.com/questions/7332299/trace-python-imports, can you set PYTHONVERBOSE=1 (or higher) in your environment before manually starting zbxdb.py again? Since you have a virtual Python environment in ~/zbxdb/, you see the correct modules there. Normally this environment is activated when you run a script from within it. Maybe something has changed here: first cd to ~/zbxdb and then start bin/zbxdb.py -c ../etc/zbxdb.DB_EUSISASST.cfg
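
The difference between the two "pip list" outputs can also be checked from inside Python. Below is a minimal sketch using only the standard library; the helper name `describe_environment` is made up for illustration. It reports which interpreter is running and whether a top-level module such as `cryptography` can be found on its import path, so running it once from ~ and once from ~/zbxdb may show whether the pyenv virtual environment is active in each directory.

```python
import importlib.util
import sys


def describe_environment(module_name):
    """Report the running interpreter and whether module_name is importable.

    module_name is expected to be a top-level package name.
    """
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,   # interpreter actually in use
        "prefix": sys.prefix,           # changes when a virtualenv is active
        "module": module_name,
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    for key, value in describe_environment("cryptography").items():
        print(f"{key}: {value}")
```

If "found" is False in one directory and True in the other, the two shells are resolving different Python environments.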

gianri commented 3 years ago

Here is the output from my virtual env (sorry, but I can't figure out how to set PYTHONVERBOSE=1):

(zbxdb-3.9.2) zbxdb@zbxdb:~/zbxdb$ bin/zbxdb.py -vvv -c ../etc/zbxdb.DB_EUSISASST.cfg
Error during reading log configuration
Unable to configure handler 'file_handler'
{'version': 1, '_comment': ' copy this file as logging.json in etc/', 'disable_existingloggers': False, 'formatters': {'simple': {'format': '%(asctime)s%(name)s%(levelno)s%(message)s'}, 'onlyTime': {'format': '%(asctime)s_%(name)s %(message)s'}}, 'handlers': {'console': {'class': 'logging.StreamHandler', 'level': 'DEBUG', 'formatter': 'onlyTime', 'stream': 'ext://sys.stdout'}, 'file_handler': {'class': 'logging.handlers.RotatingFileHandler', 'level': 'INFO', 'formatter': 'simple', 'filename': 'log/zbxdb.log', 'maxBytes': 10485760, 'backupCount': 4, 'encoding': 'utf8'}}, 'root': {'level': 'WARNING', 'handlers': ['console', 'file_handler']}}
Does the path for filename exist?
Traceback (most recent call last):
  File "/home/zbxdb/.pyenv/versions/3.9.2/lib/python3.9/logging/config.py", line 564, in configure
    handler = self.configure_handler(handlers[name])
  File "/home/zbxdb/.pyenv/versions/3.9.2/lib/python3.9/logging/config.py", line 745, in configure_handler
    result = factory(**kwargs)
  File "/home/zbxdb/.pyenv/versions/3.9.2/lib/python3.9/logging/handlers.py", line 153, in __init__
    BaseRotatingHandler.__init__(self, filename, mode, encoding=encoding,
  File "/home/zbxdb/.pyenv/versions/3.9.2/lib/python3.9/logging/handlers.py", line 58, in __init__
    logging.FileHandler.__init__(self, filename, mode=mode,
  File "/home/zbxdb/.pyenv/versions/3.9.2/lib/python3.9/logging/__init__.py", line 1142, in __init__
    StreamHandler.__init__(self, self._open())
  File "/home/zbxdb/.pyenv/versions/3.9.2/lib/python3.9/logging/__init__.py", line 1171, in _open
    return open(self.baseFilename, self.mode, encoding=self.encoding,
FileNotFoundError: [Errno 2] No such file or directory: '/home/zbxdb/zbxdb/log/zbxdb.log'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/zbxdb/zbxdb/bin/zbxdb.py", line 1032, in <module>
    LOG_CONF = setup_logging()
  File "/home/zbxdb/zbxdb/bin/zbxdb.py", line 62, in setup_logging
    logging.config.dictConfig(config)
  File "/home/zbxdb/.pyenv/versions/3.9.2/lib/python3.9/logging/config.py", line 809, in dictConfig
    dictConfigClass(config).configure()
  File "/home/zbxdb/.pyenv/versions/3.9.2/lib/python3.9/logging/config.py", line 571, in configure
    raise ValueError('Unable to configure handler '
ValueError: Unable to configure handler 'file_handler'

ikzelf commented 3 years ago

Aha ..... now the import of the modules succeeded. So the problem is in the activation of the virtual environment. There must be something not quite right in your .profile and/or .bash_profile regarding this. I think that if you copy ~/zbxdb/.python-version to ~/, this will work: zbxdb/bin/zbxdb.py -c etc/zbxdb.DB_EUSISASST.cfg

With export PYTHONVERBOSE=1 you could have set the python verbose parameter to debug the imports.

The errors you just received are because the logging config is not found.
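
The traceback in the previous comment ends in a FileNotFoundError for /home/zbxdb/zbxdb/log/zbxdb.log, which suggests that the relative 'filename' in the logging configuration ('log/zbxdb.log') was resolved against the current directory, ~/zbxdb. The following is a minimal sketch of that behaviour with a cut-down config; it is only an illustration, not zbxdb's actual logging.json, and `ensure_log_dir` and `try_configure` are hypothetical helpers.

```python
import logging.config
import os
import tempfile

# Cut-down illustration of a file handler with a relative path.
CONFIG = {
    "version": 1,
    "handlers": {
        "file_handler": {
            "class": "logging.handlers.RotatingFileHandler",
            "filename": "log/zbxdb.log",  # relative: resolved against the cwd
        }
    },
    "root": {"level": "WARNING", "handlers": ["file_handler"]},
}


def ensure_log_dir(config):
    """Create the directory for every handler that has a 'filename'."""
    for handler in config["handlers"].values():
        filename = handler.get("filename")
        if filename:
            os.makedirs(os.path.dirname(os.path.abspath(filename)), exist_ok=True)


def try_configure(config):
    """Return True if dictConfig succeeds, False if it raises ValueError."""
    try:
        logging.config.dictConfig(config)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    os.chdir(tempfile.mkdtemp())      # a directory without a log/ subfolder
    print(try_configure(CONFIG))      # handler cannot open log/zbxdb.log
    ensure_log_dir(CONFIG)
    print(try_configure(CONFIG))      # works once log/ exists
```

Starting the script from the directory that already contains log/ avoids the problem without any extra code.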

gianri commented 3 years ago

Yes, now something is moving! Attached you can find the output of "zbxdb/bin/zbxdb.py -c etc/zbxdb.DB_EUSISASST.cfg".

To be complete, I am also attaching the logs from my folder /home/zbxdb/log.

Could you tell me what's missing now, if anything?

Sorry if I'm bothering you but I'm quite confused and I'd like my Oracle DB to be monitored by ZBXDB, because it appears to be a very complete solution. Now, under the path "/home/zbxdb/zbxdb_sender" I can see the subfolders "in" and "archive".

Thank you very much for your time and availability.

zbxdb-bin-zbxdb.py-ouput.txt zbxdb.DB_EUSISASST.cfg.log zbxdb.log zbxdb_sender.log zbxdb_starter.log

ikzelf commented 3 years ago

Now that the VENV is working for you, unset PYTHONVERBOSE since it gives too much info now. As far as I can see, etc/zbxdb.DB_EUSISASST.cfg is missing the required [zbxdb] entry (configparser.NoSectionError: No section: 'zbxdb'). Because of this, zbxdb.py stops and produces no output files. zbxdb_sender seems to work OK but has nothing to do, because zbxdb.py stops with the config parser error. Because zbxdb.py stops, zbxdb_starter keeps starting it over and over.

Fix the config file. If the database connection works, it will generate files in zbxdb_out/. zbxdb_sender will pick them up, transfer the processed files to the archive directory, and keep a few days' worth of files.
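
The configparser.NoSectionError mentioned above can be reproduced with the standard library alone. This is a minimal sketch; the config text and the option name in it are placeholders rather than zbxdb's real settings, and `check_section` is a hypothetical helper.

```python
import configparser

# Placeholder config texts: only the section name differs.
WRONG_SECTION = "[zbxora]\nsome_option = some_value\n"
RIGHT_SECTION = "[zbxdb]\nsome_option = some_value\n"


def check_section(text, section="zbxdb"):
    """Return (ok, message) depending on whether the section exists."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if parser.has_section(section):
        return True, f"section [{section}] found"
    return False, f"missing [{section}]; found {parser.sections()}"


if __name__ == "__main__":
    print(check_section(WRONG_SECTION))
    print(check_section(RIGHT_SECTION))

    # Reading an option from a missing section raises NoSectionError.
    parser = configparser.ConfigParser()
    parser.read_string(WRONG_SECTION)
    try:
        parser.get("zbxdb", "some_option")
    except configparser.NoSectionError as exc:
        print(type(exc).__name__, exc)
```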

ikzelf commented 3 years ago

Do you have it all up and running now?

gianri commented 3 years ago

Hi ikzelf, I'm working on it now... I'll update you shortly.

Thanks again

gianri commented 3 years ago

Since I need to monitor an Oracle RAC DB (I have many DBs like this), I created my "etc/zbxdb.DB_EUSISASST.cfg" by cloning the pre-installed example file "etc/zbxdb.oracle-cluster1-ASM.cfg", which contains only the section [zbxora] instead of [zbxdb]. Was it OK to do this? I have now replaced the entry [zbxora] with [zbxdb] and the error has changed. Here is the tail of my zbxdb.DB_EUSISASST.cfg.log:

2021-09-28 08:50:01,642main30_log level 20
2021-09-28 08:50:01,643main30_start python-3.9.2 zbxdb-3.00 pid=528404 Connecting ...
2021-09-28 08:50:01,643main50_logging: Fatal messages
2021-09-28 08:50:01,643main50_logging: Critical messages
2021-09-28 08:50:01,643main40_logging: Error messages
2021-09-28 08:50:01,643main30_logging: Warning message
2021-09-28 08:50:01,643main20_logging: Info messages
2021-09-28 08:50:01,643main30_first encrypted the plaintext password and removed from config
2021-09-28 08:50:01,643main30_zbxdb found db_type=, driver ; checking for driver
2021-09-28 08:50:01,643main50_problem
Traceback (most recent call last):
  File "/home/zbxdb/zbxdb/bin/zbxdb.py", line 1037, in <module>
    main()
  File "/home/zbxdb/zbxdb/bin/zbxdb.py", line 821, in main
    db_driver = load_driver(_config)
  File "/home/zbxdb/zbxdb/bin/zbxdb.py", line 307, in load_driver
    _db_driver = __import__(_c['db_driver'])
ValueError: Empty module name

I already installed Oracle Instant Client, unixODBC, SQLPlus, and the connection to DB using "$sqlplus64 username/mypwd@//MyUrlDB:1521/SID" is working.

ikzelf commented 3 years ago

Oops, thanks for the catch. I recently realised that people could benefit from an example that was in zbxora and that I forgot to include in zbxdb. I just pushed a new version of zbxdb.oracle-cluster1-ASM.cfg that is more complete. The problem in this case was the missing driver details (zbxora was Oracle-only, so the driver was hardcoded in there).
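
The "ValueError: Empty module name" in the traceback is what Python raises when asked to import a module whose name is an empty string, which matches the empty driver shown in the log. A minimal sketch of that behaviour follows; `load_driver_checked` is a hypothetical wrapper, not zbxdb's own load_driver, and `json` merely stands in for a real database driver module.

```python
import importlib


def load_driver_checked(config):
    """Import the module named by config['db_driver'], failing early if unset."""
    name = (config.get("db_driver") or "").strip()
    if not name:
        raise ValueError("db_driver is empty in the configuration")
    return importlib.import_module(name)


if __name__ == "__main__":
    try:
        __import__("")                  # what an empty setting leads to
    except ValueError as exc:
        print("plain import:", exc)

    try:
        load_driver_checked({"db_driver": ""})
    except ValueError as exc:
        print("checked import:", exc)

    # 'json' is only a stand-in for a real driver module.
    print(load_driver_checked({"db_driver": "json"}).__name__)
```

Checking the value before importing turns a confusing message into one that names the setting to fix.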

unixODBC is not needed, we use native interfaces.

If you want to monitor RAC instances, the smartest approach is to use TNS aliases that have the complete address list of the database (or ASM configuration), or to specify the address list as I did in the example.

This is because zbxdb will use an available instance to connect and if that goes down, it will try to reconnect, possibly ending up in a different available instance. zbxdb will use the global views for monitoring, where needed.

ikzelf commented 3 years ago

Since ASM is not working nicely with the scan listeners, you need to address them using the vips.

gianri commented 3 years ago

I'm glad to have been useful in some way :) Now it tries to connect but fails due to a user/pwd error. The connection with SQLPlus using the same user/pwd works: is it OK if I'm using a different user from "cistats"? In my .cfg file I wrote the pwd in clear text, and now I can see that in the same file it added the line "password_enc", leaving the line "password" without a value (I think that is correct...)

Here is the log:

2021-09-28 10:21:48,062main20_using sql_timeout : 60s
2021-09-28 10:21:48,063main20_keysdir : etc/keys
2021-09-28 10:21:48,063main20_out_file : /home/zbxdb/zbxdb_out/zbxdb.DB_EUSISASST.zbx
2021-09-28 10:21:48,063main20_connecting to **/**@(DESCRIPTION=(ADDRESS_LIST=(LOAD_BALANCE=OFF)(FAILOVER=ON)(ADDRESS=(PROTOCOL=TCP)(HOST=MyNodeA-vip.asst-fbf-sacco.it)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=MyNodeB-vip.asst-fbf-sacco.it)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=MySID)(SERVER=DEDICATED)))
2021-09-28 10:21:49,193_dbconnections.oracle_40_connect failed 1017 with ORA-01017: invalid username/password; logon denied
2021-09-28 10:21:49,194main40_(1.1)connection error: [1017] ORA-01017: invalid username/password; logon denied for zabbix@(DESCRIPTION=........

ikzelf commented 3 years ago

What are the contents of zbxdb.DB_EUSISASST.cfg now? More specifically, the role parameter. Since you are now not connecting to ASM, I expect a normal role. If you need to monitor a physical standby that is in recovery, or an ASM instance, you would need sysdba as the role (and to give the user the sysdba role). You should be able to use any username, as long as you take care of the correct privileges.

gianri commented 3 years ago

Sorry, you are absolutely right... the problem was the role! Now with the normal role it connects. I need to monitor two active instances of the same RAC DB (one for each node), so the normal role should be OK if I'm correct.

Now I can see several .zip files in my "/home/zbxdb/zbxdb_sender/archive" and nothing in "/home/zbxdb/zbxdb_sender/in". "/home/zbxdb/zbxdb_out" is empty too.

I'm probably missing something else now, because I don't see any data arriving in my Zabbix Server. (Is it because zbxdb_sender doesn't know where to send the data?)

To recap: My Zabbix Server is on Physical dedicated machine. On Zabbix Server I loaded the pre-installed/sample template "zbxdb_template_v3.xml" and created the host "DB_EUSISASST".

My ZBXDB Server is on a VM dedicated machine, where I already installed zabbix_agent and zabbix_sender (from zabbix package)

ikzelf commented 3 years ago

zbxdb.py writes files to zbxdb_out/. zbxdb_sender moves the files from zbxdb_out/ to zbxdb_sender/in/, processes them one by one, and adds them to an archive per run in zbxdb_sender/archive/. This allows you to study the contents of the files.

Make sure that the host_name that you defined in your cfg file is exactly as you entered it in the Zabbix GUI. This is also the host to which you attach the template. For a RAC database you only need to specify one host and one cfg file. You do need to make sure that zbxdb.py can connect to all your instances. For regular instances you can use the scan addresses.

Since I made zbxdb.py to connect as a regular client to the database cluster, it preferably is not on the database server but on any machine that is able to connect to all your nodes. Normally I pick a zabbix server or a zabbix proxy for that. The template you picked was for zabbix-3. If you still are running zabbix-3, you might want to upgrade it a bit.....

Check your zabbix_sender logfile. It should show that it did pick up the files and tried to send the data. It uses zabbix_sender to push the data to Zabbix, using the configuration file of the Zabbix agent. Lots of data is discovered (instances, tablespaces etc.) and it will take a while before it becomes visible, especially when a Zabbix proxy is involved. Some data is always sent to Zabbix and should be available immediately in latest data. One item that should always receive data is "zbxdb[uptime]". If it does not receive data, this will trigger an alert, because zbxdb might not be able to do everything needed (sample, send data).

gianri commented 3 years ago

I've just loaded zbxdb_template_v5.xml (my Zabbix version is in fact 5.0.13) and associated it with my host (in the GUI) named DB_EUSISASST. The cfg file name should be correct: zbxdb.DB_EUSISASST.cfg

It seems ZBXDB captures data, as the following logs show, but Zabbix Server doesn't receive it:

/home/zbxdb/log/zbxsender.log
2021-09-28 11:57:01,968main30_2021-09-28-1157 processing zbxdb.DB_EUSISASST.zbx
2021-09-28 11:57:01,972main40_zabbix_sender zbxdb.DB_EUSISASST.zbx error: 1
2021-09-28 11:57:01,973main30_removed lock /home/zbxdb/zbxdb_sender/zbxdb_sender.lock
2021-09-28 11:58:02,024main30_Logging in /home/zbxdb/log/zbxdb_sender.log
2021-09-28 11:58:02,025main30_Namespace(cfile='/etc/zabbix/zabbix_agentd.conf', verbosity=0, zbxdb_out='zbxdb_out')
2021-09-28 11:58:02,026main30_Using /etc/zabbix/zabbix_agentd.conf

/home/zbxdb/log/zbxdb.DB_EUSISASST.cfg.log
.....
main___40_connect 1 times, 0 fail; started 562 queries, 0 fail memrss:41164 user:3.182097 sys:0.813189
.....

Here is part of a zip file present in /home/zbxdb/zbxdb_sender/archive:
...
DB_EUSISASST "zbxdb[uptime]" 1632823918 5400
DB_EUSISASST "zbxdb[opentime]" 1632823918 5400
DB_EUSISASST "blocked[count]" 1632823918 0
DB_EUSISASST "zbxdb[query,checks_01m,blocked,status]" 1632823918 0
DB_EUSISASST "zbxdb[query,checks_01m,blocked,ela]" 1632823918 0.0035964329726994038
DB_EUSISASST "zbxdb[query,checks_01m,blocked,fetch]" 1632823918 0.0002573130186647177
DB_EUSISASST "db[EUSISASS,openstatus]" 1632823918 3
DB_EUSISASST "zbxdb[query,checks_01m,db.openmode,status]" 1632823918 0
DB_EUSISASST "zbxdb[query,checks_01m,db.openmode,ela]" 1632823918 0.014198767021298409
...

Is there perhaps a mistake in the zabbix_agent or zabbix_sender configuration?

Here are my "Server" and "Hostname" entries in /etc/zabbix/zabbix_agentd.conf:
Server=HostnameOfMyZABBIX_server
Hostname=ZBXDB

The hostname ZBXDB is the hostname of my ZBXDB machine, so it can be monitored by Zabbix.

My Zabbix Server correctly shows the ZBXDB host as enabled, but shows nothing for the DB_EUSISASST host.

Thanks again, really

ikzelf commented 3 years ago

Can you increase the verbosity of zbxdb_sender? It looks like it received an error:
2021-09-28 11:57:01,972main40_zabbix_sender zbxdb.DB_EUSISASST.zbx error: 1

The collected data looks OK.

I think the problem is your agent config: https://www.zabbix.com/documentation/current/manual/concepts/sender Can you add the ServerActive parameter to it? It should point to the same zabbix server as your Server parameter.
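
When zabbix_sender is given an agent configuration file, it takes the server address from ServerActive. Below is a minimal sketch for checking the text of such a file for that parameter; the parser is a simplification (plain Key=Value lines and '#' comments only), and `parse_agent_conf` and `missing_parameters` are hypothetical helpers.

```python
def parse_agent_conf(text):
    """Parse simple Key=Value lines, ignoring blanks and '#' comments."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key.strip()] = value.strip()
    return params


def missing_parameters(text, required=("ServerActive", "Hostname")):
    """Return the required parameters that are absent or empty."""
    params = parse_agent_conf(text)
    return [name for name in required if not params.get(name)]


if __name__ == "__main__":
    # Values copied from the entries quoted earlier in this thread.
    before = "Server=HostnameOfMyZABBIX_server\nHostname=ZBXDB\n"
    after = before + "ServerActive=HostnameOfMyZABBIX_server\n"
    print(missing_parameters(before))
    print(missing_parameters(after))
```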

gianri commented 3 years ago

Ok, great! The ServerActive parameter was missing: now it sends data and I am indeed starting to see something in my Zabbix Server.

Now the error code I see in zbxdb_sender.log is 2 (better than 1, I suppose):

2021-09-28 13:03:01,848main30_Using /etc/zabbix/zabbix_agentd.conf
2021-09-28 13:03:01,848main30_2021-09-28-1303 processing zbxdb.DB_EUSISASST.zbx
2021-09-28 13:03:01,860main40_zabbix_sender zbxdb.DB_EUSISASST.zbx error: 2
2021-09-28 13:03:01,862main30_removed lock /home/zbxdb/zbxdb_sender/zbxdb_sender.lock
2021-09-28 13:04:01,901main30_Logging in /home/zbxdb/log/zbxdb_sender.log
2021-09-28 13:04:01,902main30_Namespace(cfile='/etc/zabbix/zabbix_agentd.conf', verbosity=0, zbxdb_out='zbxdb_out')

I beg your pardon... how can I increase the zabbix sender verbosity?

ikzelf commented 3 years ago

To increase verbosity, just add -v to the command line in the crontab. But the current error 2 mostly means that not all data got through. This can be because not all auto-discovered items are known in Zabbix yet. The auto discovery runs once an hour. Also, there could be queries in the checks file that have no corresponding item[s] in the template.

I think you have it all up now. If you feel that you can improve some of the docs so others might get there more easily, feel free to add to them and issue a pull request. The same goes if you feel there are important things missing. Adding new items is quite easy.

You created this issue in the zbxora repository. No problem but it would be a bit easier next time to address the correct repository. That way things don't get mixed up.

gianri commented 3 years ago

You're right, sorry, I didn't pay attention to opening the issue in the right repository. Next time I'll be more careful. Thanks for the invitation to improve the docs: I'll be happy to do it... sharing information is the best way to make things easier.

Just one last request for help. After some hours my Zabbix is discovering more items and triggers (now 621 items and 537 triggers), but the information I get from /home/zbxdb/log/zbxdb_sender.log is not sufficient (for me) to understand why zabbix_sender is returning error code "2", even with verbosity=3:

2021-09-28 16:04:01,513main30_Using /etc/zabbix/zabbix_agentd.conf
2021-09-28 16:04:01,513main30_2021-09-28-1604 processing zbxdb.DB_EUSISASST.zbx
2021-09-28 16:04:01,524main40_zabbix_sender zbxdb.DB_EUSISASST.zbx error: 2
2021-09-28 16:04:01,530main30_removed lock /home/zbxdb/zbxdb_sender/zbxdb_sender.lock
2021-09-28 16:05:01,564main30_Logging in /home/zbxdb/log/zbxdb_sender.log
2021-09-28 16:05:01,565main30_Namespace(cfile='/etc/zabbix/zabbix_agentd.conf', verbosity=3, zbxdb_out='zbxdb_out')
2021-09-28 16:05:01,565main30_Using /etc/zabbix/zabbix_agentd.conf
2021-09-28 16:05:01,565main30_2021-09-28-1605 processing zbxdb.DB_EUSISASST.zbx
2021-09-28 16:05:01,578main40_zabbix_sender zbxdb.DB_EUSISASST.zbx error: 2
2021-09-28 16:05:01,584main30_removed lock /home/zbxdb/zbxdb_sender/zbxdb_sender.lock

The queries that ZBXDB runs (some of which may have no corresponding items in the template) are all stored under this path, right?: /home/zbxdb/etc/zbxdb_checks/oracle

Meanwhile I hope to become confident in getting from ZBXDB/Zabbix all the data, graphs and statistics (on free tablespace, sessions, number of processes...) that are useful for my monitoring objectives.

ikzelf commented 3 years ago

For me that error 2 is not so important, so I ignored it. zabbix_sender does not give a lot of logging on it, so I cannot pass it on. A way to debug this, if you really want to, is to unzip a file from archive/ and send it manually to Zabbix, using zabbix_sender. Perform a kind of binary search on it (first/second half of the file etc.) until you find the line[s] that cause the "problem[s]". I cannot make this easier, sorry for that.

Yes, the queries are stored in etc/zbxdb_checks/{db_type}/{dbtype}{version}.cfg. That directory also contains a few examples (ebs, sap, my_super_checks) of how you could integrate your own site-specific queries, which can be activated using the site_checks parameter.
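
The binary search described above can be sketched as follows. The sending step is left as a callable so the logic can be tried without a Zabbix server: `send_ok` is a hypothetical function that should return True when every line in the batch is accepted. This is only an illustration of the idea and not part of zbxdb.

```python
def find_rejected(lines, send_ok):
    """Return the lines that send_ok rejects, by recursively halving batches."""
    if not lines or send_ok(lines):
        return []                      # whole batch accepted
    if len(lines) == 1:
        return list(lines)             # a single rejected line: found one
    mid = len(lines) // 2
    return find_rejected(lines[:mid], send_ok) + find_rejected(lines[mid:], send_ok)


if __name__ == "__main__":
    sample = [
        'DB_EUSISASST "zbxdb[uptime]" 1632823918 5400',
        'DB_EUSISASST "unknown[item]" 1632823918 1',   # made-up key for the demo
        'DB_EUSISASST "blocked[count]" 1632823918 0',
    ]

    # Stand-in for a real send: pretend Zabbix only knows these two keys.
    known = ('"zbxdb[uptime]"', '"blocked[count]"')

    def fake_send_ok(batch):
        return all(line.split()[1] in known for line in batch)

    print(find_rejected(sample, fake_send_ok))
```

In practice `send_ok` could write the batch to a temporary file, run zabbix_sender on it, and treat a zero exit status as success; that wiring is left out here because it depends on the local setup.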

I just hope you like it and that it does what you need. I kind of like to think it is a pretty smart tool that can easily be adjusted to do exactly what anyone might want to do with little more than having to write a few SQL queries and create the corresponding items in the template.

gianri commented 3 years ago

Yes, you did a very great job for me creating this DB monitoring tool. We have many DBs, most Oracle RAC but also other types, and for this reason I think ZBXDB could be very useful for us.

I'll ignore error 2 from zabbix_sender and I'll try to focus my energy on getting interesting statistics/graphs.

For me you can close the issue and I hope I'll not bother you anymore, but if so I'll do it on the right repository ;) Many thanks again!

ikzelf commented 3 years ago

I was glad I could help! And if you run into problems or have good ideas for additions etc., let me know! And yes, preferably in the correct repo. :-D