oracle / oracle-db-appdev-monitoring

Metrics exporter and samples for unified observability for data-centric app dev and microservices
http://developer.oracle.com/microservices
Other
78 stars 23 forks source link

Tablespace exposed metrics do not match the query executed on the database #151

Open aureliocirella opened 2 days ago

aureliocirella commented 2 days ago

Hi, we have noticed a mismatch between the metrics TEMP tablespace below: oracledb_tablespace_bytes oracledb_tablespace_free oracledb_tablespace_max_bytes oracledb_tablespace_used_percent

And the values retrieved performing the tablespace query directly on the DB.

This happens on OCI for Oracle Autonomous Databases with Standby role.

This is an example for oracledb_tablespace_free and oracledb_tablespace_used_percent: OeZHW2AYDA

The same happens for other tablespace metrics related to the TEMP tablespace.

Image used: container-registry.oracle.com/database/observability-exporter:1.5.1

Could you please support us?

Thanks. Regards, Aurelio

markxnelson commented 2 days ago

Thanks for reporting, will take a look and get back to you shortly

markxnelson commented 1 day ago

doing some research, in my adb i see the following

SQL> SELECT
  2      dt.tablespace_name as tablespace,
  3      dt.contents as type,
  4      dt.block_size * dtum.used_space as bytes,
  5      dt.block_size * dtum.tablespace_size as max_bytes,
  6      dt.block_size * (dtum.tablespace_size - dtum.used_space) as free,
  7      dtum.used_percent
  8  FROM  dba_tablespace_usage_metrics dtum, dba_tablespaces dt
  9  WHERE dtum.tablespace_name = dt.tablespace_name
 10* ORDER by tablespace;

TABLESPACE    TYPE                    BYTES        MAX_BYTES             FREE                                    USED_PERCENT
_____________ ____________ ________________ ________________ ________________ _______________________________________________
DATA          PERMANENT       1549948551168    5004329328640    3454380777472       30.97215329729750789289573209090501556652
DBFS_DATA     PERMANENT            76677120    3286342410240    3286265733120    0.002333205443263604017944294196566501234673
SYSAUX        PERMANENT         19705823232    3308575825920    3288870002688      0.5955983561755153404466787116142503958829
SYSTEM        PERMANENT          3655008256    3289950560256    3286295552000      0.1110961453389011981971064310648913683516
TEMP          TEMPORARY          6371147776    1979120934912    1972749787136      0.3219180628940842311562327101257955610939
UNDO_C7D0D    UNDO             236797820928     329853493248      93055672320       71.78878677206059139876676501681260739546

6 rows selected.

SQL>

# HELP oracledb_tablespace_bytes Generic counter metric of tablespaces bytes in Oracle.
# TYPE oracledb_tablespace_bytes gauge
oracledb_tablespace_bytes{tablespace="DATA",type="PERMANENT"} 1.550361690112e+12
oracledb_tablespace_bytes{tablespace="DBFS_DATA",type="PERMANENT"} 7.667712e+07
oracledb_tablespace_bytes{tablespace="SYSAUX",type="PERMANENT"} 1.9705823232e+10
oracledb_tablespace_bytes{tablespace="SYSTEM",type="PERMANENT"} 3.655008256e+09
oracledb_tablespace_bytes{tablespace="TEMP",type="TEMPORARY"} 2.770337792e+09
oracledb_tablespace_bytes{tablespace="UNDO_C7D0D",type="UNDO"} 2.36797820928e+11
# HELP oracledb_tablespace_free Generic counter metric of tablespaces free bytes in Oracle.
# TYPE oracledb_tablespace_free gauge
oracledb_tablespace_free{tablespace="DATA",type="PERMANENT"} 3.453967638528e+12
oracledb_tablespace_free{tablespace="DBFS_DATA",type="PERMANENT"} 3.28626573312e+12
oracledb_tablespace_free{tablespace="SYSAUX",type="PERMANENT"} 3.288870002688e+12
oracledb_tablespace_free{tablespace="SYSTEM",type="PERMANENT"} 3.286295552e+12
oracledb_tablespace_free{tablespace="TEMP",type="TEMPORARY"} 1.97635059712e+12
oracledb_tablespace_free{tablespace="UNDO_C7D0D",type="UNDO"} 9.305567232e+10
# HELP oracledb_tablespace_max_bytes Generic counter metric of tablespaces max bytes in Oracle.
# TYPE oracledb_tablespace_max_bytes gauge
oracledb_tablespace_max_bytes{tablespace="DATA",type="PERMANENT"} 5.00432932864e+12
oracledb_tablespace_max_bytes{tablespace="DBFS_DATA",type="PERMANENT"} 3.28634241024e+12
oracledb_tablespace_max_bytes{tablespace="SYSAUX",type="PERMANENT"} 3.30857582592e+12
oracledb_tablespace_max_bytes{tablespace="SYSTEM",type="PERMANENT"} 3.289950560256e+12
oracledb_tablespace_max_bytes{tablespace="TEMP",type="TEMPORARY"} 1.979120934912e+12
oracledb_tablespace_max_bytes{tablespace="UNDO_C7D0D",type="UNDO"} 3.29853493248e+11
# HELP oracledb_tablespace_used_percent Gauge metric showing as a percentage of how much of the tablespace has been used.
# TYPE oracledb_tablespace_used_percent gauge
oracledb_tablespace_used_percent{tablespace="DATA",type="PERMANENT"} 30.9804089279099
oracledb_tablespace_used_percent{tablespace="DBFS_DATA",type="PERMANENT"} 0.002333205443263604
oracledb_tablespace_used_percent{tablespace="SYSAUX",type="PERMANENT"} 0.5955983561755154
oracledb_tablespace_used_percent{tablespace="SYSTEM",type="PERMANENT"} 0.11109614533890119
oracledb_tablespace_used_percent{tablespace="TEMP",type="TEMPORARY"} 0.13997819653821109
oracledb_tablespace_used_percent{tablespace="UNDO_C7D0D",type="UNDO"} 71.78878677206059
markxnelson commented 1 day ago

The differences in DATA and UNDO are normal, as work is being done in the database, so those should change over time.

Investigating whey TEMP bytes is different.

aureliocirella commented 1 day ago

Hi @markxnelson, thanks for taking a look at it. I'll wait for an update about TEMP tablespace then.

Regards, Aurelio

markxnelson commented 1 day ago

Referring to the documentation for this view: https://docs.oracle.com/en/database/oracle/oracle-database/21/refrn/DBA_TABLESPACE_USAGE_METRICS.html

It says the following:

"If the tablespace contains any datafiles with autoextend enabled, then this column displays the maximum size to which the tablespace can grow. Underlying storage free space, such as Oracle ASM or file system storage, is also taken into account when computing this value."

Since ADB-S is a shared environment, other instances are likely to be on the same underlying storage too. I notice that I sometimes get the same result when I run the query and read the metric at the same time (within a few seconds), and sometimes they differ. My friendly DBA also told me "temp segment usage fluctuates rapidly, actual free space is kept in memory (until checkpointers come around and write down a point-in-time free)."

I think this explains what we are seeing here.