grafana / grafana

The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.
https://grafana.com
GNU Affero General Public License v3.0

Dashboards as configuration #9957

Closed bergquist closed 6 years ago

bergquist commented 6 years ago

When introducing Dashboard folders in 5.0 we decided to rewrite the current support for [dashboard.json]. More details about why in this comment. While doing so, we would also like to start working on improving this flow.

The end goal would be to make it possible to manage dashboards in Grafana without using Grafana's UI.

The use cases we want to cover.

Timeline

Part1 5.0-alpha2 #9654

Part2

Part3

To be decided.

Requirements for the configuration

Suggestions for the configuration:

1) Include org_id, dashboard folder in the dashboard json at the root level. (nonbreaking since we have default values for org_id and dashboard folder)

{
  "title": "http requests",
  "org_id": 2,
  "rows": "..."
}

2) Store config files with meta data. That config file can include title, org_id, dashboard folder, the dashboard json or a link to the dashboard json. (breaking change)

title: 'http requests'
org_id: 2
dashboard_folder: 'ops'
content: 'file://http_requests.json'

3) Only configure sources to read from, e.g. a folder or a web server that allows list requests. It would be possible to allow org_id/dashboard folder overrides in the dashboard json. It should be possible to have multiple config files with sources.

sources: 'https://github.com/grafana/grafana/tree/master/public/dashboards'
org_id: 2

brancz commented 6 years ago

Store dashboards as files on disk or remote source (git/http)
Update the dashboards upon events (fsevent/webhook)
Disable users from editing dashboards from within Grafana (optional per dashboard)

This is spot on with how we would love to operate Grafana: effectively, fully declaratively.

Store dashboards as jsonnet

Not sure I understand this. Jsonnet could be used to generate the current json blobs. I think a versioned spec of the dashboard json would be the best thing, then we can build tooling (with jsonnet for example) to automate dashboard generation (we do this today with https://github.com/weaveworks/grafanalib, but I've been thinking of switching to jsonnet for this purpose as well, as we already use jsonnet elsewhere).

How do we handle HA setups where servers might not have the same dashboard. (what if ETCD crash etc)

My take on this is that if we're provisioning Grafana from files on disk, it's not the Grafana server's responsibility to keep things in sync; it's the responsibility of the environment (config management, Kubernetes, Mesos, etc.). Therefore, if the servers are configured in the same way, HA is just a matter of giving each peer the same configuration.

How to implement support importing dashboards from grafana.net from the backend.

I imagine this would work with the remote fetching feature. This would actually be slick, but the update story gets a bit trickier. How do you know whether you need to update a remote dashboard? (Probably by caching responses, a la "Unmodified".)
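One way to read the "Unmodified" idea is an HTTP conditional request: remember the ETag from the last fetch, send it back as `If-None-Match`, and treat a 304 response as "nothing to do". A minimal sketch in Python, assuming the remote server returns `ETag` headers; `conditional_headers` and `fetch_if_changed` are hypothetical names for illustration, not Grafana APIs.

```python
import urllib.error
import urllib.request


def conditional_headers(cached_etag):
    """Build request headers that ask the server to skip unchanged content."""
    return {"If-None-Match": cached_etag} if cached_etag else {}


def fetch_if_changed(url, cached_etag=None, opener=urllib.request.urlopen):
    """Return (body, etag); body is None when the server answers 304."""
    request = urllib.request.Request(url, headers=conditional_headers(cached_etag))
    try:
        with opener(request) as response:
            return response.read(), response.headers.get("ETag")
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: keep the cached dashboard
            return None, cached_etag
        raise
```

The `opener` parameter exists only so the function can be exercised without a network; a caller would normally leave it at its default and store the returned ETag next to the cached dashboard.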

Should it be possible to delete dashboards?

The way we have handled this with the grafana-watcher so far is that we completely truncate the database and re-provision from disk, so that the Grafana database is always in sync with the configuration on disk.

Suggestions for the configuration:

As with the datasource proposal, I would love to see globbing support :slightly_smiling_face: .
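To make the globbing request concrete, here is a small sketch of how a file source could expand a glob pattern into dashboard files. It uses Python's `pathlib`; the function name `find_dashboards` and the default pattern are assumptions for illustration, not part of the proposal.

```python
from pathlib import Path


def find_dashboards(root, pattern="**/*.json"):
    """Return dashboard files under root matching a glob pattern, sorted."""
    return sorted(p for p in Path(root).glob(pattern) if p.is_file())
```

With the default pattern this picks up JSON files at any depth under `root`; a narrower pattern such as `ops/*.json` would restrict a source to one subdirectory.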


Highly appreciate that you're tackling this! :tada:

bergquist commented 6 years ago

ping @tomwilkie

tomwilkie commented 6 years ago

+1 looks pretty close to what I was hoping for! Thanks for this.

Support storing dashboards as jsonnet instead of json

Agree with @brancz, this seems like a property of the config management system. As with Kubernetes, the data structure of the objects should be well documented and backwards compatible, with sensible defaults, and ideally we should have a bunch of different tooling for managing them (jsonnet, grafanalib, etc.).

I've been thinking of switching to jsonnet for this purpose as well, as we already use jsonnet elsewhere).

@brancz check out grafana/grafonnet-lib and https://github.com/tomwilkie/ksonnet-prometheus/blob/master/grafana.jsonnet. I'm going to see if I can merge some of my requirements into the grafana library.

How should this be configured?

The current method, of specifying a directory in the grafana.ini, worked for me. Having a separate "index.yaml" file listing all the dashboards with extra metadata would also work. Mixing yaml, ini's and json to configure Grafana might be a little... strange.

How do we handle HA setups where servers might not have the same dashboard. (what if ETCD crash etc)

My take on this is that if we're provisioning Grafana from files on disk, it's not the Grafana server's responsibility to keep things in sync

+1 agree; we run our Grafanas independently using local SQLite DBs. It's my responsibility (using Kubernetes and ConfigMaps) to keep them in sync.

Should it be possible to delete dashboards?

I would say not from the UI/API if they are read from disk or marked readonly; it would be good if they were 'deleted' when the corresponding file was removed from disk.
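The "deleted when the file is removed" behaviour amounts to reconciling two sets: what is on disk and what was previously provisioned into the database. A rough sketch, assuming each dashboard has a stable identifier and a content checksum (both are assumptions; the thread does not specify how dashboards are keyed):

```python
def plan_sync(on_disk, in_database):
    """Compare dashboards on disk with provisioned ones in the database.

    Both arguments map a dashboard identifier to a content checksum.
    """
    disk_ids, db_ids = set(on_disk), set(in_database)
    return {
        "create": sorted(disk_ids - db_ids),
        "update": sorted(i for i in disk_ids & db_ids if on_disk[i] != in_database[i]),
        "delete": sorted(db_ids - disk_ids),  # file was removed from disk
    }
```

Compared with truncating the whole database and re-provisioning, a plan like this only touches dashboards that actually changed.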

It should be possible to associate dashboards to orgs and dashboard folders.

Perhaps dashboards could mirror the on-disk directory hierarchy?
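As an illustration of mirroring the on-disk hierarchy, the dashboard folder could be derived from a file's directory relative to the configured root. This is only a sketch: `folder_for` is a hypothetical helper, and it assumes an empty string means "no dashboard folder". How nested directories would map onto folders is left open here.

```python
from pathlib import PurePosixPath


def folder_for(dashboard_path, root):
    """Derive a dashboard folder name from a file's directory under root."""
    relative = PurePosixPath(dashboard_path).relative_to(root)
    parent = relative.parent
    # Files directly under root get no folder.
    return "" if parent == PurePosixPath(".") else str(parent)
```

For example, a file at `/var/lib/grafana/dashboards/ops/http_requests.json` with root `/var/lib/grafana/dashboards` would land in the folder `ops`.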

bergquist commented 6 years ago

I agree that storing dashboards as jsonnet can be out of scope for Grafana. But I really want to support reading simple jsonnet files from disk/http and transforming them before saving them to the database. You are both very experienced users, so setting up grafanalib or similar is no problem. I want to enable anyone to start writing their dashboards as text using jsonnet, and asking them to use another tool might be too much. Anyway, I'm moving that to part 3 of this feature.

+1 agree; we run our Grafanas independently using local SQLite DBs. It's my responsibility (using Kubernetes and ConfigMaps) to keep them in sync.

So if you want to run multiple pods per customer, would each instance have its own SQLite database? We definitely want to support multiple pods sharing the same database.

Mixing yaml, ini's and json to configure Grafana might be a little... strange.

We also support toml 🙄 Both ini and toml were chosen before yaml won the config war.

Perhaps dashboards could mirror the on-disk directory hierarchy?

Yes! Something like this would be awesome!

Thank you for the feedback! :)

FYI I removed it from the 5.1 milestone since I'm not sure we will be able to focus on part 2 right after 5.0.

brancz commented 6 years ago

So if you want to run multiple pods per customer, would each instance have its own SQLite database? We definitely want to support multiple pods sharing the same database.

The idea of the grafana-watcher is that this is the case, effectively we want to operate grafana in a "stateless" manner, as in any new pod is able to re-produce the same database. This relieves us from the burden of having to maintain a mysql cluster for our dashboarding solution :slightly_smiling_face: .

torkelo commented 6 years ago

You will still need a shared mysql server for things like users, user preferences, starred dashboards, dashboard view counts, grafana annotations, etc

brancz commented 6 years ago

You will still need a shared mysql server for things like users, user preferences, starred dashboards, dashboard view counts, grafana annotations, etc

If those are the features you need, then yes, but we don't use any of those, and put the bitly/oauth2_proxy in front of it and turn off any authentication from Grafana side. As a matter of fact, we just want to use Grafana as a tool that is fully configured from files, and then is a "read only" interface.

torkelo commented 6 years ago

If those are the features you need

It does not matter if you use them or not you will still need a database :)

metalmatze commented 6 years ago

Instead of going for yet another file format, it would be nice to see YAML here too. Prometheus went with YAML in v2 for their alerts and it's liberating. Just my 2 cents on the topic. Otherwise this looks promising; looking forward to using something from Grafana itself and not a bunch of hacks!

brancz commented 6 years ago

Right, but if you don't use any of those features (or disable them), there is no problem in operating each Grafana server with its own "local" sqlite database, as they will just be provisioned in the same way.

The whole point of why we're doing this is that we are not happy with having to maintain a complex system like a mysql cluster, just for dashboarding which we want to operate in a read-only fashion anyways, as we check-in and review our dashboards in git and then deploy the declarative grafana servers off of those.

I understand that there are users, and use-cases where people may want the features that require the mysql database. I'm trying to express what we would like our ideal dashboarding solution to look like, and how that philosophy influences how we operate Grafana.

torkelo commented 6 years ago

Right, but if you don't use any of those features (or disable them), there is no problem in operating each Grafana server with its own "local" sqlite database, as they will just be provisioned in the same way.

Some of those features can't be disabled (like the ability to create annotations directly from a graph panel).

The whole point of why we're doing this is that we are not happy with having to maintain a complex system like a mysql cluster, just for dashboarding which we want to operate in a read-only fashion anyways, as we check-in and review our dashboards in git and then deploy the declarative grafana servers off of those.

That should be fine as long as you don't use alerting. If you have multiple Grafana servers with the same dashboards (that have alert rules), you need a single Grafana database if you don't want duplicate alert rule evaluations and notifications.

bergquist commented 6 years ago

We should support both scenarios. Supporting a shared database makes it harder but still possible.

Initial suggestion for configuration format:

  # the purpose of name is to make it easier to debug 
  # and find errors in case dashboard imports fail.
- name: 'Cobra Kais dashboards' 
  # org where dashboards should be inserted. defaults to 1
  org_id: 2
  # dashboard folder(released in grafana 5.0) where all dashboards should be inserted
  folder: Cobra Kai
  # how to read dashboards from remote source. http/file/grafana.com etc
  type: http
  # options for each remote source type. 
  options:
    url: https://github.com/grafana/grafana/tree/master/examples
    basicAuthUsername: Kai
    basicAuthPassword: IAMGOD
- name: general dashboards
  type: file
  options:
    folder: /var/lib/grafana/dashboards

- name: Cobra Kais dashboards
  org_id: 2
  folder: Cobra Kai
  type: http
  options:
    url: https://github.com/grafana/grafana/tree/master/examples
    basicAuthUsername: Kai
    basicAuthPassword: IAMGOD

- name: 'Dashboards from Grafana.com'
  type: grafana
  options:
    id: 1
    datasource: production
    prefix: collectd

We will scan /conf/dashboards for yaml|yml config files and use all of them. Each file contains a list of dashboard sources that we will load into Grafana. This should be flexible enough to support most use cases while still being very easy to reason about in a small setup.

default configuration file shipped with grafana (commented out)

- name: 'default'
  org_id: 1
  folder: ''
  type: file
  options:
    folder: /var/lib/grafana/dashboards
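
To illustrate how the defaults in the format above might be applied, here is a sketch that takes one already-parsed source entry (a plain dict, as a YAML parser would produce) and fills in the defaults. `normalize_source` and `KNOWN_TYPES` are hypothetical names; the default of 1 for `org_id` and the three source types come from the proposal, and the rest is an assumption.

```python
KNOWN_TYPES = {"file", "http", "grafana"}


def normalize_source(entry):
    """Fill in defaults for one dashboard source entry and validate its type."""
    if entry.get("type") not in KNOWN_TYPES:
        raise ValueError(
            f"{entry.get('name', '<unnamed>')}: unknown type {entry.get('type')!r}"
        )
    return {
        "name": entry.get("name", ""),      # only used to make errors easier to trace
        "org_id": entry.get("org_id", 1),   # proposal: defaults to 1
        "folder": entry.get("folder", ""),  # empty string: no dashboard folder
        "type": entry["type"],
        "options": dict(entry.get("options", {})),
    }
```

Including the source name in the error message follows the stated purpose of `name`: making failed dashboard imports easier to debug.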

brancz commented 6 years ago

That should be fine as long as you don't use alerting. If you have multiple grafana servers with the same dashboards (That have alert rules) you need a single grafana database if you don't want duplicate alert rule evaluations and notifications.

We use Grafana for dashboarding purposes only, and Prometheus for alerting. As I said, I'm not arguing Grafana shouldn't have the features it has; we're just trying to optimize for the "declarative dashboarding" use case. For us, Grafana would ideally have a fully read-only mode for this and be entirely configured from files.

@bergquist all of the above sounds good to me. Generally I would be cautious of adding too many possibilities, but if you limit it to those three, I think it should be appropriate. (I'm only mentioning it as "provider" maintenance can quickly get out of hand, from experience.)

metalmatze commented 6 years ago

After watching a video of a talk about ksonnet by @hausdorff and digging a bit more into ksonnet, I have now come to the conclusion that something similar for Grafana might be very cool.

With some very decent defaults I can imagine gsonnet (😜) being really helpful in solving the problems.

bergquist commented 6 years ago

@metalmatze something like that is currently happening at https://github.com/grafana/grafonnet-lib

wtrocki commented 6 years ago

@bergquist Is it possible for the community to contribute to any of the features from the list above? A prototype for importing dashboards from git/github would be nice to do.

leth commented 6 years ago

We have our own setup to manage dashboards as configuration, so I'm all for it, however, I just thought I'd add something to consider regarding editing pre-configured dashboards.

When I want to improve the dashboard, or dig deeper into what it's showing me, I edit the dashboard in the grafana UI. Of course, if I make a change I want to keep, I can't save it, so I have to mentally translate the change back to our grafanalib definition.

I think the immediacy of the UI-based editing feedback loop is really important -- something like a click-through warning about read-only dashboards might be better than fully disabling editing.

bergquist commented 6 years ago

@leth We agree. In the future, it won't be possible to save provisioned dashboards, but it will still be possible to edit them.

bergquist commented 6 years ago

We won't add support for fetching dashboards from http/github right now, so I'm considering this issue done. If you want to fetch the dashboards from HTTP, I suggest you write a script that pulls your dashboards to disk on the same server as Grafana.
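A script along those lines could look roughly like the sketch below. It is an illustration only: the function names are hypothetical, the file name is taken from the last URL path segment, and each file is written to a temporary name first and then renamed so that a process scanning the directory does not pick up a half-written dashboard.

```python
import os
import tempfile
import urllib.request
from pathlib import Path


def http_get(url):
    """Default fetcher: download a URL and return the response body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def pull_dashboards(urls, dest_dir, fetch=http_get):
    """Download each dashboard JSON into dest_dir and return the written paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    for url in urls:
        target = dest / url.rstrip("/").rsplit("/", 1)[-1]
        body = fetch(url)
        # Write to a temp file in the same directory, then rename into place.
        fd, tmp_name = tempfile.mkstemp(dir=str(dest), suffix=".tmp")
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(body)
        os.replace(tmp_name, target)
        written.append(target)
    return written
```

Run periodically (for example from cron) with `dest_dir` pointing at the directory a `file` source reads from, such as `/var/lib/grafana/dashboards`.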