cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

Create method for visualizing events in logs / cockroach debug zips #29184

Open tim-o opened 6 years ago

tim-o commented 6 years ago

For a support team member or an operator of CRDB, CockroachDB's log files are difficult to use. Any time an issue is encountered, a significant amount of time is spent simply collecting and collating log files. This is time-consuming and error-prone. It'd be very helpful to create a script or a web application that could:

1) Take a cockroach debug zip as input.
2) Parse the logs and create a visualization of the frequency of errors and warnings over time, total and by node.
3) Show a list of errors by their frequency in the logs, total and by node, with the first and last occurrence.
4) Allow the user to drill down to see the distribution of a particular error over time.
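As a rough illustration of steps 2 and 3, here is a minimal Python sketch. It makes several assumptions that are not part of this issue: log lines are taken to start with a glog-style prefix (severity letter, `yymmdd` date, time, goroutine id, `file:line`), errors are keyed by their `file:line` source location as a stand-in for a structured error type, and the node labels and the `parse_line` and `summarize` names are hypothetical. Reading the files out of the debug zip is left out; the input is simply a mapping from node label to log lines.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# Assumed glog-style prefix: severity, yymmdd, hh:mm:ss.micros, goroutine id, file:line.
LINE_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{6}) (?P<time>\d{2}:\d{2}:\d{2})\.\d+ "
    r"\d+ (?P<src>\S+:\d+)\s+(?P<msg>.*)$"
)


def parse_line(line):
    """Return (severity, timestamp, source, message), or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None  # e.g. a continuation line of a multi-line message
    ts = datetime.strptime(m["date"] + m["time"], "%y%m%d%H:%M:%S")
    return m["sev"], ts, m["src"], m["msg"]


def summarize(logs_by_node):
    """logs_by_node maps a (hypothetical) node label to an iterable of log lines."""
    timeline = defaultdict(Counter)  # node -> minute -> count of warnings/errors
    errors = {}                      # file:line -> totals, per-node counts, first/last seen
    for node, lines in logs_by_node.items():
        for line in lines:
            parsed = parse_line(line)
            if parsed is None or parsed[0] not in "WEF":
                continue  # skip unparseable lines and INFO entries
            _, ts, src, _ = parsed
            timeline[node][ts.replace(second=0)] += 1
            entry = errors.setdefault(
                src, {"total": 0, "by_node": Counter(), "first": ts, "last": ts}
            )
            entry["total"] += 1
            entry["by_node"][node] += 1
            entry["first"] = min(entry["first"], ts)
            entry["last"] = max(entry["last"], ts)
    # Most frequent source locations first.
    ranked = sorted(errors.items(), key=lambda kv: kv[1]["total"], reverse=True)
    return timeline, ranked
```

The per-minute counters in `timeline` could feed a frequency-over-time chart, and `ranked` corresponds to the frequency list with first and last occurrence. Keying by source location sidesteps message text that varies from line to line, but it is only one possible choice.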

... there are probably other things that we'd want to visualize, but this would be a very helpful start and would avoid a lot of manual hunting and pecking, and false starts.

Jira issue: CRDB-4877

couchand commented 6 years ago

Potentially related: #18845

knz commented 6 years ago

@tschottdorf @petermattis Who would be the PM to look at this? Which project does this fall under?

tbg commented 6 years ago

Sounds like webui to me. cc @piyush-singh. I also don't think this is feasible because our errors aren't structured. Plus, our logs are subject to change as we clean them up (I think cleaning them up is more important than trying to build tools that deal with the fact that they're messy).

knz commented 6 years ago

I'm not entirely sure this should be displayed in the CockroachDB web UI though, because perhaps by the time these logs are collected the web UI is inoperative.

Having some kind of text processor on the log files that does the frequency analysis suggested by Tim, using fuzzy matching (text distance below some threshold), would probably work.
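To make the fuzzy-matching idea concrete, here is a small Python sketch. It is only an illustration under assumptions: it uses the standard library's `difflib.SequenceMatcher` similarity ratio (similarity above a threshold) in place of a true edit distance below a threshold, it clusters greedily against the first message seen in each group, and the `cluster_messages` name and the `0.8` default are hypothetical.

```python
from difflib import SequenceMatcher


def cluster_messages(messages, threshold=0.8):
    """Greedily group messages whose similarity to a cluster's first message
    is at least `threshold`; returns [representative, count] pairs, most
    frequent first."""
    clusters = []  # each entry is [representative message, count]
    for msg in messages:
        for cluster in clusters:
            if SequenceMatcher(None, cluster[0], msg).ratio() >= threshold:
                cluster[1] += 1
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters.append([msg, 1])
    return sorted(clusters, key=lambda c: c[1], reverse=True)
```

For example, messages that differ only in a duration, such as `slow heartbeat took 1.2s` and `slow heartbeat took 3.4s`, would land in the same cluster at this threshold. The comparison cost grows with the number of clusters, so a real tool would likely need something cheaper for large logs.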

This is really an issue requesting the creation of new tooling, either as a new cockroach sub-command or a separate tool. I don't think there is any extant component in CockroachDB that goes in this direction already.

knz commented 6 years ago

Agreed that @piyush-singh could prioritize this, although I suspect that @kannanlakshmi would like to prioritize this too as it will greatly aid the troubleshooting of managed clusters.

github-actions[bot] commented 3 years ago

We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 5 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!