ikzelf / zbxdb

Zabbix database monitoring, the easy and extendable way
GNU General Public License v3.0
94 stars 45 forks source link

services query gives errors due to difference between lld query and "monitoring" query in CDB architecture #99

Open dhoogfr opened 2 years ago

dhoogfr commented 2 years ago

Describe the bug The services check reporting the number of sessions per service uses a lld query that uses the gv$services view. This view assigns the SYS$BACKGROUND sessions to the CDB$ROOT PDB.

select decode (s.con_id, 0, d.name, pdb.name) "{#PDB}", i.instance_name "{#INST_NAME}",
                    nvl(s.name,s.network_name) "{#SERVICE_NAME}"
from gv$services s
     join gv$instance i
       on ( s.inst_id = i.inst_id)
     cross join v$database d

The query used for the actual monitoring uses the gv$session view, which puts the SYS$BACKGROUND processes on container id 0.

select 'service['||decode (s.con_id, 0, d.name, pdb.name)||','||i.instance_name||','|| s.service_name||',sess]', count(*)
from gv$session s 
     join gv$instance i
       on ( s.inst_id = i.inst_id )
     left join v$containers pdb
       on ( s.con_id = pdb.con_id )

Bot the lld query and the monitoring query have a decode construct to map container id 0 to the CDB name, but because the gv$services view puts the background processes to the CDB$ROOT (at least in the tested 19c databases), the items do not match.

Because of this mismatch, the zabbix_sender gives a result code 2 and the background sessions item in Zabbix is never filled.

To Reproduce Steps to reproduce the behavior:

  1. deploy zbxdb on a 19c CDB (tested patch level 19.15)
  2. check discovered services items in Zabbix
  3. monitor zbxdb_sender logfile to detect error code 2
  4. check background services item in zabbix and see that it is never filled

Expected behavior

Both lld and monitoring query attributes the same sessions to the same "PDB".

A possible solution would be to change the decode in the monitoring query to use CDB$ROOT for container id 0 instead of the CDB name. The cross join with v$database can then also be removed

select 'service['||decode (s.con_id, 0, 'CDB$ROOT', pdb.name)||','||i.instance_name||','|| s.service_name||',sess]', count(*)
from gv$session s 
     join gv$instance i
       on ( s.inst_id = i.inst_id )
     left join v$containers pdb
       on ( s.con_id = pdb.con_id )
group by
  i.instance_name, s.service_name, decode (s.con_id,0, 'CDB$ROOT', pdb.name)
;

For the lld query, the decode, together with the cross join, can then be removed

Additional context

ikzelf commented 2 years ago

Thanks for the issue Freek, I will check and see what I can do. I prefer not to have the hard coded CDB$ROOT in the sql's since I would like to have it working in classic oracle too.

dhoogfr commented 2 years ago

Ah yes, good point. I did not took in account that the con_id 0 is also used when the database don't use the CDB architecture.

A nested decode construct or change to case statement is perhaps possible (with a count on v$containers to see if there is more than 1 record, to determine if db name or CDB$ROOT should be used).

ikzelf commented 3 months ago

I just found back this issue. Does this still live or did you already implement a solution for this? (if so, are you willing to share?) I completely forgot it (also did not have time/energy available to research it)