perforce / p4prometheus

[Community Supported] Perforce (Helix Core) interface for writing Prometheus metrics from real-time analysis of p4d log files.
MIT License

Consider using 'p4 pull -ls' to get pull queue size in monitor_pull when gathering metrics. #55

Open sgcommons opened 7 months ago


https://github.com/perforce/p4prometheus/blob/7ea8505052759f09d44d0ea22be56d247294848b/scripts/monitor_metrics.sh#L581

On replica/edge servers with large rdb.lbr content, 'p4 pull -l' can hold a read lock on rdb.lbr for extended periods of time (199211ms, roughly 199 seconds, in this log entry):

2024/04/22 10:33:10 pid 2952365 bruno@bruno_edge1_ws 127.0.0.1 [p4/2023.2/LINUX26X86_64/2578891] 'user-pull -l' --- rdb.lbr --- pages in+out+cached 345601+0+96 --- locks read/write 1/0 rows get+pos+scan put+del 0+0+8775518 0+0 --- total lock wait+held read/write 0ms+199211ms/0ms+0ms

Running 'p4 pull -l' every 60 seconds from cron can therefore block metadata replication. I believe 'p4 pull -ls' could be used instead to obtain the pull queue size. It seems to have much less read-lock impact on rdb.lbr (5555ms held in this log entry, even though the same number of rows is scanned):

2024/04/22 10:37:47 pid 2953239 bruno@bruno_edge1_ws 127.0.0.1 [p4/2023.2/LINUX26X86_64/2578891] 'user-pull -ls' --- rdb.lbr --- pages in+out+cached 345601+0+96 --- locks read/write 1/0 rows get+pos+scan put+del 0+0+8775518 0+0 --- total lock wait+held read/write 0ms+5555ms/0ms+0ms
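
A minimal sketch of how the queue size might be read from the 'p4 pull -ls' summary rather than by counting the per-file lines from 'p4 pull -l'. The function name `pull_queue_size`, the `$p4` variable, and the metric name in the usage comment are hypothetical, and the summary line format is an assumption that should be checked against real server output before use:

```bash
#!/usr/bin/env bash
# Sketch only: derive the pull queue size from the 'p4 pull -ls' summary.

# ASSUMPTION: the first summary line looks like
#   File transfers: 1 active/63 total, bytes: 745 active/23684 total.
# Verify this against your own server's output.
pull_queue_size() {
    # Reads 'p4 pull -ls' output on stdin and prints the total number of
    # queued file transfers. Prints nothing if no matching line is found.
    sed -n 's|^File transfers: [0-9][0-9]* active/\([0-9][0-9]*\) total.*|\1|p' | head -n 1
}

# Hypothetical usage inside a metrics script, where $p4 is the configured
# p4 command and the metric name is illustrative only:
#   queue_size=$($p4 pull -ls 2>/dev/null | pull_queue_size)
#   echo "p4_pull_queue ${queue_size:-0}"
```

Falling back to 0 when nothing matches keeps the metric present, but it also hides parse failures, so a real change might prefer to skip the metric or log a warning in that case.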