Can you increase the verbosity (-vv)? The possibility exists that there was still a process running; the other option is that the process crashed. What the process should do is:

0) lock
1) move all files from zbxora_out/ to zbxdb_sender/in/
2) for every file in zbxdb_sender/in/, send it to zabbix and add it to an archive in zbxdb_sender/archive/
3) unlock
4) clean older archives
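For reference, a minimal sketch of that flow; the directory layout, helper names and error handling are illustrative assumptions based on the paths seen in this issue, not the actual zbxdb_sender.py code:

```python
# Illustrative sketch of the lock/move/send/archive/unlock/clean flow described
# above. NOT the real zbxdb_sender.py; paths and the zabbix_sender call are
# assumptions taken from the logs in this issue.
import shutil
import subprocess
import time
from pathlib import Path

BASE = Path.home() / "zbxdb_sender"     # assumed sender working directory
OUT = Path.home() / "zbxora_out"        # files written by zbxdb.py (assumption)
LOCK = BASE / "zbxdb_sender.lock"       # lock file named in the log below


def run_once(zabbix_cfg="/etc/zabbix/zabbix_agentd.conf"):
    # 0) lock: refuse to start while a previous run still holds the lock
    if LOCK.exists():
        print(f"previous run still running (or crashed), lock file: {LOCK}")
        return
    LOCK.touch()
    try:
        in_dir = BASE / "in"
        archive = BASE / "archive" / time.strftime("%Y-%m-%d-%H%M")
        in_dir.mkdir(parents=True, exist_ok=True)
        archive.mkdir(parents=True, exist_ok=True)
        # 1) move all files from zbxora_out/ to zbxdb_sender/in/
        for f in OUT.glob("*.zbx"):
            shutil.move(str(f), str(in_dir / f.name))
        # 2) for every file in zbxdb_sender/in/: send to zabbix, then archive
        for f in sorted(in_dir.iterdir()):
            rc = subprocess.call(
                ["zabbix_sender", "-c", zabbix_cfg, "-T", "-i", str(f)])
            if rc != 0:
                print(f"zabbix_sender {f.name} error: {rc}")
            shutil.move(str(f), str(archive / f.name))
    finally:
        # 3) unlock, also when sending raised an unexpected error
        if LOCK.exists():
            LOCK.unlink()
    # 4) clean older archives (omitted here; e.g. remove archive directories
    #    older than the retention period)
```

In this sketch a non-zero zabbix_sender exit code is only logged, so a failed send does not leave the lock file behind.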
Here is additional information with the -vv parameter (these logs are from after a restart and a full cleanup):
```
2020-10-19 10:03:01,995___main___30_Logging in /home/zbxdb/log/zbxdb_sender.log
2020-10-19 10:03:01,996___main___30_Namespace(cfile='/etc/zabbix/zabbix_agentd.conf', verbosity=2, zbxdb_out='zbxora_out')
2020-10-19 10:03:01,997___main___30_Using /etc/zabbix/zabbix_agentd.conf
2020-10-19 10:03:01,997___main___30_2020-10-19-1003 processing zbxdb.odb.zbx
2020-10-19 10:03:02,045___main___40_zabbix_sender zbxdb.odb.zbx error: 2
2020-10-19 10:04:02,192___main___30_Logging in /home/zbxdb/log/zbxdb_sender.log
2020-10-19 10:04:02,194___main___30_Namespace(cfile='/etc/zabbix/zabbix_agentd.conf', verbosity=2, zbxdb_out='zbxora_out')
2020-10-19 10:04:02,194___main___30_2020-10-19-1004 previous run still running(or crashed(lock file: /home/zbxdb/zbxdb_sender/zbxdb_sender.lock))
```
After the crash, the sender script fails to send any newly collected data. Maybe the process crashed during the unlock operation? How can I verify this? Thanks a lot
The run from 10:03 seems to stop for some reason. After zabbix_sender returned error code 2 (sending failed), it should continue with archiving. Did errors pop up in the other zbxdb_sender log files? Could it be that there is still a process running that started around 10:03? It would help to see that.
What is in zbxdb_sender/archive/? Any other messages?
The process is still running:
```
zbxdb 50902 1 0 09:53 ? 00:00:00 /home/zbxdb/.pyenv/versions/3.6.5/bin/python3 /home/zbxdb/zbxdb/bin/zbxdb.py -c etc/zbxdb.odb.cfg
```
but since the last error, no data has been collected in the archive directory.
This looks OK. The zbxdb.py process[es] should keep running; zbxdb_sender.py runs only once a minute to send the files generated by zbxdb.py.
mmm ok... Any additional checks? Thanks
At this moment I still have no idea what extra checks to do... but I am very open to suggestions that make this work better.
I've tried to send the data to the zabbix server manually. Here is the response:
```
zbxdb@cszabbix:~$ zabbix_sender -c /etc/zabbix/zabbix_agentd.conf -T -i /home/zbxdb/zbxdb_sender/archive/2020-10-19-1219/zbxdb.odb.zbx
Response from "192.168.97.9:10051": "processed: 31; failed: 219; total: 250; seconds spent: 0.002043"
Response from "192.168.97.9:10051": "processed: 136; failed: 112; total: 248; seconds spent: 0.002767"
sent: 498; skipped: 0; total: 498
zbxdb@cszabbix:~$ echo $?
2
```
I think this might be the problem...
That tells me there is a problem/mismatch between what is sent and what is known by the zabbix server. For example, once an hour a tablespace discovery is done, but the data is just blindly sent to zabbix. After a while the tablespaces will become known. There are some 28 discoveries... This should not be a reason to stop/crash without an orderly exit where the lock file is removed.
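For what it's worth, here is a hedged sketch of how a caller could treat that zabbix_sender behaviour, so that a partial failure (exit code 2 while part of the values were still processed, as in the output above) does not abort the archive/unlock steps. The function name and output parsing are illustrative, not the zbxdb code:

```python
# Sketch only: treat zabbix_sender exit code 2 (data was sent, but some values
# were rejected by the server, e.g. not-yet-discovered items) as non-fatal, so
# the file still gets archived and the lock file is still removed.
import re
import subprocess


def send_file(path, cfg="/etc/zabbix/zabbix_agentd.conf"):
    proc = subprocess.run(
        ["zabbix_sender", "-c", cfg, "-T", "-i", path],
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    # zabbix_sender prints lines like:
    #   processed: 31; failed: 219; total: 250; seconds spent: 0.002043
    processed = sum(int(n) for n in re.findall(r"processed: (\d+)", proc.stdout))
    failed = sum(int(n) for n in re.findall(r"failed: (\d+)", proc.stdout))
    if proc.returncode == 0:
        return True
    if proc.returncode == 2 and processed > 0:
        # Partial failure: log it, but keep going with archiving and unlocking.
        print(f"partial send for {path}: processed={processed} failed={failed}")
        return True
    print(f"zabbix_sender failed for {path}: rc={proc.returncode}")
    return False
```

Whether that is the right policy is of course up to zbxdb_sender itself; the point is only that a non-zero exit code from zabbix_sender does not have to leave the lock file behind.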
How is your status now? Is your data coming in like it should? And as always, if you have suggestions on how to improve things, I would be more than happy to make them better; I just need a bit of input.
Hi! Now the script seems to be OK, but I don't know the reason :) Thanks a lot
Hi, I got this error some minutes after starting the script:
I've tried to stop everything, kill the process, and remove the lock file, but after a restart I got the same results. Can you help me? Thanks