perforce / p4prometheus

[Community Supported] Perforce (Helix Core) interface for writing Prometheus metrics from real-time analysis of p4d log files.
MIT License

fix: collect lock data when lslocks has null path values #49

Open nagonzalez opened 10 months ago

nagonzalez commented 10 months ago

This PR supports collecting lslocks lock data even when path is null. These changes were validated on the following systems:

lslocks output from Rocky 8.x

lslocks -J -o +BLOCKER
{
   "locks": [
      {"command": "p4d_1_bin", "pid": "2013551", "type": "FLOCK", "size": null, "mode": "WRITE", "m": "0", "start": "0", "end": "0", "path": null, "blocker": "123"}
   ]
}
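To illustrate the null-path case, here is a minimal sketch of reading the JSON output above while tolerating `"path": null`. It is not the code in monitor_metrics.py; the function name `blocked_locks` and the choice to map a null path to an empty string are assumptions made for the example.

```python
import json

# Sample taken from the Rocky 8.x output above.
SAMPLE = '''{
   "locks": [
      {"command": "p4d_1_bin", "pid": "2013551", "type": "FLOCK", "size": null, "mode": "WRITE", "m": "0", "start": "0", "end": "0", "path": null, "blocker": "123"}
   ]
}'''


def blocked_locks(lslocks_json):
    """Return (pid, path, blocker) tuples for every lock that has a blocker.

    Hypothetical helper: a null path is kept as "" instead of the record
    being skipped, so the blocker information is still collected.
    """
    results = []
    for lock in json.loads(lslocks_json).get("locks", []):
        blocker = lock.get("blocker")
        if blocker is None:
            continue  # nothing is blocking this lock
        path = lock.get("path") or ""  # null path becomes an empty string
        # int() accepts both "123" and 123, in case pids arrive as numbers.
        results.append((int(lock["pid"]), path, int(blocker)))
    return results


print(blocked_locks(SAMPLE))  # [(2013551, '', 123)]
```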

lslocks output on CentOS 7.x

COMMAND           PID  TYPE SIZE MODE  M      START        END PATH BLOCKER
p4d_1_bin       48134 FLOCK   0B WRITE 0          0          0        123
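For the plain-text form, a row with an empty PATH column splits into nine whitespace-separated fields instead of ten. The sketch below shows one way to cope with that; it is an assumption-laden illustration, not the parser used in monitor_metrics.py. The helper name `parse_lslocks_row` is hypothetical, and it assumes that a purely numeric trailing field is the blocker pid and that paths contain no whitespace.

```python
# Column names in the order printed by `lslocks -o +BLOCKER` above.
COLUMNS = ["command", "pid", "type", "size", "mode", "m",
           "start", "end", "path", "blocker"]


def parse_lslocks_row(line):
    """Parse one text row, allowing PATH and/or BLOCKER to be empty.

    Returns a dict keyed by COLUMNS, or None if the row cannot be parsed
    under the assumptions stated above.
    """
    fields = line.split()
    fixed, rest = fields[:8], fields[8:]
    if len(fixed) < 8 or len(rest) > 2:
        return None  # too short, or a path containing whitespace
    path, blocker = "", ""
    if len(rest) == 2:
        path, blocker = rest
    elif len(rest) == 1:
        # Assumption: digits only means a blocker pid, anything else a path.
        if rest[0].isdigit():
            blocker = rest[0]
        else:
            path = rest[0]
    return dict(zip(COLUMNS, fixed + [path, blocker]))


row = "p4d_1_bin       48134 FLOCK   0B WRITE 0          0          0        123"
parsed = parse_lslocks_row(row)
print(parsed["pid"], repr(parsed["path"]), parsed["blocker"])  # 48134 '' 123
```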

sample output in /p4/1/logs/monitor_metrics.log

2024-01-22 04:27:01 pid 10458, user , cmd , table None, blocked by pid 689, user svc-account1, cmd sync, args -q --parallel threads=4,min=1,minsize=1 //...@5554459
DEBUG 2024-01-22 04:27:01,891 monitor_metrics.py 318: Writing to metrics file: /depotdata/metrics/locks.prom
DEBUG 2024-01-22 04:27:01,891 monitor_metrics.py 319: Metrics: # HELP p4_locks_db_read Database read locks
# TYPE p4_locks_db_read gauge
p4_locks_db_read{serverid="server_id_value",sdpinst="1"} 0
# HELP p4_locks_db_write Database write locks
# TYPE p4_locks_db_write gauge
p4_locks_db_write{serverid="server_id_value",sdpinst="1"} 0
# HELP p4_locks_cliententity_read clientEntity read locks
# TYPE p4_locks_cliententity_read gauge
p4_locks_cliententity_read{serverid="server_id_value",sdpinst="1"} 0
# HELP p4_locks_cliententity_write clientEntity write locks
# TYPE p4_locks_cliententity_write gauge
p4_locks_cliententity_write{serverid="server_id_value",sdpinst="1"} 0
# HELP p4_locks_meta_read meta db read locks
# TYPE p4_locks_meta_read gauge
p4_locks_meta_read{serverid="server_id_value",sdpinst="1"} 0
# HELP p4_locks_meta_write meta db write locks
# TYPE p4_locks_meta_write gauge
p4_locks_meta_write{serverid="server_id_value",sdpinst="1"} 0
# HELP p4_locks_cmds_blocked cmds blocked by locks
# TYPE p4_locks_cmds_blocked gauge
p4_locks_cmds_blocked{serverid="server_id_value",sdpinst="1"} 1
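The metrics above follow the Prometheus text exposition format: a `# HELP` line, a `# TYPE` line, then the sample with its labels. As a rough sketch of how one such gauge could be rendered (the function `format_gauge` is hypothetical and is not taken from monitor_metrics.py):

```python
def format_gauge(name, help_text, value, labels):
    """Render a single gauge in the Prometheus text format.

    Hypothetical helper for illustration; label values are not escaped,
    so it assumes they contain no quotes, backslashes, or newlines.
    """
    label_str = ",".join('%s="%s"' % (key, val) for key, val in labels.items())
    return "# HELP %s %s\n# TYPE %s gauge\n%s{%s} %s\n" % (
        name, help_text, name, name, label_str, value)


text = format_gauge(
    "p4_locks_cmds_blocked",
    "cmds blocked by locks",
    1,
    {"serverid": "server_id_value", "sdpinst": "1"},
)
print(text, end="")
```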

nagonzalez commented 10 months ago

I believe this may also mitigate the issue here: https://github.com/perforce/p4prometheus/issues/37

nagonzalez commented 10 months ago

Here's some sample telemetry captured:

grep 'blocked by pid' /p4/1/logs/monitor_metrics.log
2024-01-24 20:47:01 pid 134136, user service-account, cmd client, table None, blocked by pid 134022, user user1, cmd sync, args t:\depot\...#head
2024-01-24 20:47:01 pid 134135, user service-account, cmd client, table None, blocked by pid 134022, user user1, cmd sync, args t:\depot\...#head
2024-01-24 20:47:01 pid 134136, user service-account, cmd client, table None, blocked by pid 134022, user user1, cmd sync, args t:\depot\...#head
2024-01-24 20:47:01 pid 134135, user service-account, cmd client, table None, blocked by pid 134022, user user1, cmd sync, args t:\depot\...#head
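Since the same blocked pid can show up more than once in the log, it can help to group these lines by blocker. The following is a small sketch that assumes the line layout shown above; the regular expression and the name `blocked_by_blocker` are illustrative and not part of p4prometheus.

```python
import re
from collections import defaultdict

# Matches the "blocked by pid" lines as they appear in the samples above.
# The user and cmd fields may be empty, and args may itself contain commas.
BLOCKED_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"pid (?P<pid>\d+), user (?P<user>[^,]*), cmd (?P<cmd>[^,]*), "
    r"table (?P<table>[^,]*), blocked by pid (?P<blocker_pid>\d+), "
    r"user (?P<blocker_user>[^,]*), cmd (?P<blocker_cmd>[^,]*), "
    r"args (?P<blocker_args>.*)$"
)


def blocked_by_blocker(lines):
    """Map each blocker pid to the set of distinct pids it is blocking."""
    result = defaultdict(set)
    for line in lines:
        match = BLOCKED_RE.match(line.strip())
        if match:
            result[int(match.group("blocker_pid"))].add(int(match.group("pid")))
    return dict(result)


sample = [
    r"2024-01-24 20:47:01 pid 134136, user service-account, cmd client, table None, blocked by pid 134022, user user1, cmd sync, args t:\depot\...#head",
    r"2024-01-24 20:47:01 pid 134135, user service-account, cmd client, table None, blocked by pid 134022, user user1, cmd sync, args t:\depot\...#head",
    r"2024-01-24 20:47:01 pid 134136, user service-account, cmd client, table None, blocked by pid 134022, user user1, cmd sync, args t:\depot\...#head",
]
print(blocked_by_blocker(sample))
```

Using a set per blocker means repeated log lines for the same blocked pid are counted once.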

rcowham commented 10 months ago

Hi - can you also update the test harness please? test_monitor_metrics.py

nagonzalez commented 10 months ago

Yeah, definitely. Let me take a look. I'll reach out if I have any questions.

nagonzalez commented 10 months ago

@rcowham : test harness updated.

Better said: the harness already had good sample data, but the tests weren't passing. I updated monitor_metrics.py so that the tests pass.

rcowham commented 6 months ago

Can you check against the latest version? This should effectively have been merged in, including the use of "sudo" where possible to improve output.