grafana / incident-community

Public repository for Grafana Incident feedback, bug reports, discussions, and updates.
Welcome to the Grafana Incident community

Grafana Incident takes away the toil, letting your teams focus on what's important when things go wrong.

Feedback


Screenshot of Grafana Incident tool

Features

Declaring an incident is easy. You can do it in the web UI, or right from the chat.

Assigning roles helps everyone know who’s doing what. An Investigator is assigned first: the person responsible for figuring out what’s going on, or for finding someone who can. For meatier incidents, a Commander is also assigned. The Commander takes charge of the incident, keeps everyone up to date, and makes sure nothing gets forgotten.

A chatbot offers a command-line interface for managing incidents. The chatbot also looks out for interesting context shared in the chat.

Slackbot command help

For example, if you post a link to a GitHub issue, it is attached to the incident and shows up on the incident page. Grafana Incident synchronises the status of attached items, so you can easily see what’s done and what’s left to do. Whether you share GitHub issues and pull requests, Jira tickets, Grafana dashboards, or external links, you passively build up a picture of what’s going on.

AttachContext
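
To make the idea of passively collected context more concrete, here is a minimal sketch of how links in a chat message could be sorted into the kinds of context listed above. It is not Grafana Incident's implementation: the function names `classify_link` and `extract_context`, the category labels, and the URL patterns are all assumptions chosen for illustration.

```python
import re
from urllib.parse import urlparse

# Matches http(s) URLs and stops at whitespace or the angle brackets
# that some chat clients wrap around links.
URL_PATTERN = re.compile(r"https?://[^\s<>]+")


def classify_link(url: str) -> str:
    """Return a coarse context type for a URL (hypothetical categories)."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    path = parsed.path

    if host == "github.com":
        if re.fullmatch(r"/[^/]+/[^/]+/issues/\d+/?", path):
            return "github_issue"
        if re.fullmatch(r"/[^/]+/[^/]+/pull/\d+/?", path):
            return "github_pull_request"
    # Assumption: Jira tickets live under /browse/<PROJECT>-<number>.
    if re.fullmatch(r"/browse/[A-Z][A-Z0-9]*-\d+/?", path):
        return "jira_ticket"
    # Assumption: Grafana dashboards live under /d/<uid>/<slug>.
    if re.match(r"/d/[^/]+", path):
        return "grafana_dashboard"
    return "external_link"


def extract_context(message: str) -> list[tuple[str, str]]:
    """Find every URL in a chat message and pair it with its context type."""
    results = []
    for raw in URL_PATTERN.findall(message):
        url = raw.rstrip(".,;:!?)")  # drop trailing sentence punctuation
        results.append((url, classify_link(url)))
    return results


if __name__ == "__main__":
    msg = "Looks related to https://github.com/grafana/grafana/pull/345, see https://example.com/runbook."
    for url, kind in extract_context(msg):
        print(kind, url)
```

A real integration would also need to fetch and refresh the status of each item (open, merged, closed), which this sketch leaves out.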

Grafana Incident will even suggest related dashboards, which is perfect when it’s your first time on-call. Suggestbot uses machine learning to look for Grafana dashboards that may be related to what’s going on. Using the title of the incident, it searches your dashboards for those that seem related, based on an NLP (Natural Language Processing) understanding of their titles. This is the first step in an exciting direction for Grafana Labs, and we can't wait to expand the insights into your incidents in the future.

Suggestbot
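
The title-matching idea can be illustrated with a much simpler stand-in. The sketch below ranks dashboard titles by word overlap (Jaccard similarity) with the incident title. This is not how Suggestbot works internally, which the text above describes as machine learning and NLP; the function names, stopword list, and thresholds are assumptions made up for this example.

```python
import re

# Small, hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "on", "is", "to"}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its distinct non-stopword tokens."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def suggest_dashboards(incident_title, dashboard_titles, limit=3, min_score=0.1):
    """Return up to `limit` dashboard titles, best match first."""
    query = tokenize(incident_title)
    scored = [(jaccard(query, tokenize(title)), title) for title in dashboard_titles]
    relevant = [pair for pair in scored if pair[0] >= min_score]
    # Sort by descending score, then alphabetically for stable ties.
    relevant.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in relevant[:limit]]


if __name__ == "__main__":
    dashboards = ["Checkout API latency", "API overview", "Kubernetes node health"]
    print(suggest_dashboards("API latency is high in checkout", dashboards))
```

Word overlap misses synonyms and related terms (for example "slow" versus "latency"), which is the kind of gap a language-aware model is meant to close.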

Keep track of TODO items with the built-in task manager. Easily add tasks and assign work, so nothing falls through the cracks.

Task management is easy in Grafana Incident

The tool automatically builds a timeline of activity, giving you valuable insight into what happened and how well your response process is working.

Use the Present tool to discuss recent incidents. This improves transparency and gives you the opportunity to learn when things go wrong, and to prevent them from happening again.