Open chuck-alt-delete opened 1 year ago
Add to the list: if a query just won’t quit, kill it:
Many drivers implement cancel, but users can always run select pg_backend_pid() and then pass that pid to select pg_cancel_backend() from a second connection if they really need to.
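A rough sketch of that two-connection flow, written against the generic Python DB-API cursor interface rather than any particular driver. The helper names `backend_pid` and `cancel_backend` are hypothetical, the `%s` placeholder assumes a psycopg-style driver, and the boolean return value assumes `pg_cancel_backend` behaves as it does in PostgreSQL. Both connections are assumed to be opened by the caller.

```python
def backend_pid(conn):
    """Return the backend PID of the session behind `conn`.

    Call this on the working connection *before* starting the long query,
    since the session cannot answer once it is busy.
    """
    cur = conn.cursor()
    cur.execute("SELECT pg_backend_pid()")
    return cur.fetchone()[0]


def cancel_backend(admin_conn, pid):
    """From a second connection, ask the server to cancel what `pid` is running.

    Assumption: the function returns a single boolean column, as in PostgreSQL.
    """
    cur = admin_conn.cursor()
    cur.execute("SELECT pg_cancel_backend(%s)", (pid,))
    return bool(cur.fetchone()[0])
```

In practice you would record `pid = backend_pid(work_conn)` right after connecting, then call `cancel_backend(admin_conn, pid)` from another thread or terminal when the query on `work_conn` won't return.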
More to add to the list: “We should have some quickly accessible playbooks/checklists for people to run through, like ‘why is my query slow?’”
OOM monitoring
Adding to the list of "why is my query slow?":
serializable transaction isolation to trade off freshness for response latency?
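For a checklist entry, the isolation switch could look something like the sketch below. It assumes a DB-API style connection and that the session variable is named `transaction_isolation` with the values `serializable` and `strict serializable` (the latter being the default); the helper name `set_isolation` is hypothetical, and the exact spelling should be checked against the current docs before it goes on the page.

```python
# Assumed set of accepted values; checked up front so that only known
# strings are ever interpolated into the SET statement.
ISOLATION_LEVELS = ("serializable", "strict serializable")


def set_isolation(conn, level="serializable"):
    """Set the isolation level for the current session on `conn`."""
    if level not in ISOLATION_LEVELS:
        raise ValueError(f"unsupported isolation level: {level!r}")
    cur = conn.cursor()
    cur.execute(f"SET transaction_isolation = '{level}'")
```

The allow-list is there because SET statements generally cannot take bind parameters, so the value has to be formatted into the statement text.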
Documentation request
We have some great monitoring queries here in the demos repo for Datadog and Prometheus. Those introspection queries should be documented in a "Monitoring Materialize" section of the docs. It makes sense to me to combine monitoring and troubleshooting on a single page, where we first help users with basic monitoring, then move to practical troubleshooting (e.g. are you in the right cluster? is there an index?), and then to advanced troubleshooting (inspecting Timely Dataflow workers).
Here are a few other queries from my personal collection, in addition to the ones in the demos repo:
Computational progress. Approximate the lag of materialized views and indexes. You can use this to check, for example, whether a new replica has caught up to an older replica when doing a zero-downtime scale up/down.
A more human-friendly source status history:
Permissions queries from here for RBAC may also be relevant in this section.
Affected Pages
No response
Related work
No response