openzipkin-attic / docker-zipkin

Docker images for OpenZipkin
Apache License 2.0

[questing] Empty grafana dashboard #219

Open Igor-lkm opened 5 years ago

Igor-lkm commented 5 years ago

Problem: Dashboard 1598 is empty, data from zipkin is in Prometheus

Setup: k8s cluster with zipkin, prometheus (helm) and grafana (helm)

I am using zipkin-transport-http to send data to zipkin

Dashboard: https://grafana.com/grafana/dashboards/1598

gnetId: 1598
revision: 15

Zipkin deployment config on k8s with Prometheus scraping configuration:

      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '9411'
        prometheus.io/path: '/actuator/prometheus'
    spec:
      containers:
        - name: zipkin
          image: 'openzipkin/zipkin:2.15.0'

Example response produced by zipkin:9411/actuator/prometheus

zipkin_collector_messages_total{transport="http",} 223.0

Dashboard is empty:

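[image: Screenshot 2019-08-05 at 11 20 42] https://user-images.githubusercontent.com/6287367/62453715-2fa56000-b773-11e9-9bbe-38a2584bf60a.png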

Data is in prometheus:

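[image: Screenshot 2019-08-05 at 11 20 58] https://user-images.githubusercontent.com/6287367/62453736-39c75e80-b773-11e9-8fe8-e579ea2548d7.png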

Example: Messages received per transport

Example query from dashboard:

sum by(transport)(rate(zipkin_collector_messages_total{job="zipkin",instance=~"$instances"}[$__interval]))

It seems my series do not carry the labels job="zipkin",instance=~"$instances"

So this would work ok:

sum by(transport)(rate(zipkin_collector_messages_total{}[5m]))

In table view looks like this:

Screenshot 2019-08-05 at 11 50 28

Job: kubernetes-pods

It seems the job and instance values the dashboard expects are somehow missing 🤔 Maybe I am missing something?
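One way to check which job and instance values Prometheus actually attached to the Zipkin series is to list the labels on the metric. The following is a minimal Python sketch, assuming the JSON shape returned by Prometheus's /api/v1/series endpoint (a "data" list of label maps). The helper name distinct_label_values, the sample payload, and the instance address are made up for illustration.

```python
import json


def distinct_label_values(series_response, label):
    """Collect the distinct values of one label across all returned series."""
    series = series_response.get("data", [])
    return sorted({labels[label] for labels in series if label in labels})


# Hypothetical payload, shaped like the response of
# GET /api/v1/series?match[]=zipkin_collector_messages_total
sample = json.loads("""
{
  "status": "success",
  "data": [
    {
      "__name__": "zipkin_collector_messages_total",
      "transport": "http",
      "job": "kubernetes-pods",
      "instance": "10.0.0.12:9411"
    }
  ]
}
""")

print("job:", distinct_label_values(sample, "job"))
print("instance:", distinct_label_values(sample, "instance"))
```

If the job value printed here is not "zipkin", the dashboard's job="zipkin" matcher selects nothing, which would explain the empty panels.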

Thanks for the help!

Added:

Works ok with different labels.

In the end, this looks like a broader question.

For example RabbitMQ has metrics like rabbitmq_node_mem_used, prefixed with rabbitmq (dashboard 4279)

codefromthecrypt commented 5 years ago

@mstaalesen this isn't exclusively something you can help with, but do you have some time to help with this question?


mstaalesen commented 5 years ago

Sorry for the late response (vacation).

Some of these parameters depend on which labels are set on the scrape targets. If I recall correctly, the dashboard already relied on instance and job when I modified it, so I kept it that way.

So to solve this in your case @Igor-lkm, you need to change the query from job="zipkin" to job="kubernetes-pods". The instance label is already present (shown in your last screenshot).
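For a downloaded copy of the dashboard, that change can be applied to every query in one pass instead of editing each panel by hand. A rough Python sketch, assuming the dashboard JSON has already been parsed into dicts and lists. The function name rewrite_job and the trimmed-down sample dashboard are hypothetical; the two query strings are the ones quoted in this thread.

```python
def rewrite_job(node, old='job="zipkin"', new='job="kubernetes-pods"'):
    """Recursively replace the job matcher in every string of a dashboard tree."""
    if isinstance(node, str):
        return node.replace(old, new)
    if isinstance(node, list):
        return [rewrite_job(item, old, new) for item in node]
    if isinstance(node, dict):
        return {key: rewrite_job(value, old, new) for key, value in node.items()}
    return node  # numbers, booleans and None are left untouched


# Hypothetical, heavily trimmed dashboard containing only the fields of interest.
dashboard = {
    "panels": [
        {
            "targets": [
                {
                    "expr": 'sum by(transport)(rate(zipkin_collector_messages_total'
                            '{job="zipkin",instance=~"$instances"}[$__interval]))'
                }
            ]
        }
    ],
    "templating": {
        "list": [
            {
                "label": "Zipkin instances",
                "multi": True,
                "query": 'label_values(http_server_requests_seconds_bucket'
                         '{job="zipkin"}, instance)',
            }
        ]
    },
}

patched = rewrite_job(dashboard)
print(patched["panels"][0]["targets"][0]["expr"])
print(patched["templating"]["list"][0]["query"])
```

Because the walk covers the whole tree, it also updates the "Zipkin instances" variable query, so the instance filter keeps working with the new job value.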

Igor-lkm commented 5 years ago

@mstaalesen yep, thanks :)

So I did it:

Works ok with different labels.

My point was that it doesn't work out of the box, and I was wondering whether there is a way to make it work out of the box 🤔

mstaalesen commented 5 years ago

Sorry, I missed that part!

One approach could be to just remove job="zipkin" from the queries. It should still work, as the Zipkin metrics have rather unique names.
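As an illustration of that approach, here is a small Python sketch that strips the job="zipkin" matcher from a PromQL selector while keeping any other matchers intact. The helper name drop_job_matcher and the regular expression are assumptions made for this example, not something taken from the dashboard itself.

```python
import re

# Remove job="zipkin" together with the comma that separates it from a
# neighbouring matcher, whichever side that comma is on.
_JOB_MATCHER = re.compile(
    r'\bjob="zipkin"\s*,\s*'    # first or middle position: drop the trailing comma
    r'|\s*,\s*\bjob="zipkin"'   # last position: drop the leading comma
    r'|\bjob="zipkin"'          # the only matcher inside the braces
)


def drop_job_matcher(expr):
    """Return the PromQL expression without the job="zipkin" matcher."""
    return _JOB_MATCHER.sub("", expr)


query = ('sum by(transport)(rate(zipkin_collector_messages_total'
         '{job="zipkin",instance=~"$instances"}[$__interval]))')
print(drop_job_matcher(query))
```

The same helper leaves an empty selector, such as metric{}, when job="zipkin" was the only matcher, and an empty selector matches every series of that metric name.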

Igor-lkm commented 5 years ago

@mstaalesen If we remove job="zipkin", we would lose the instances in the filter at the top, as that filter depends on the job label:

  "label": "Zipkin instances",
  "multi": true,
  ...
  "query": "label_values(http_server_requests_seconds_bucket{job=\"zipkin\"}, instance)",

So it seems that removing job="zipkin" is not an option 🤔

If we were able to remove job="zipkin" and still get a list of instances, that would solve the problem 🤔

@mstaalesen Is there a GitHub repo with this dashboard, so I can contribute if I have a good idea? :)

codefromthecrypt commented 5 years ago

there is no github repo. you can download and change it until we figure out what normal is wrt change management of grafana dashboards

https://grafana.com/dashboards/1598

cc @abesto


abesto commented 5 years ago

there is no github repo. you can download and change it until we figure out what normal is wrt change management of grafana dashboards

That's correct, I don't know of a standard way to properly version-control dashboards hosted on grafana.com. A quick search turns up https://git.abolivier.bzh/babolivier/grafana-dashboards-manager, but it's anybody's guess whether that actually adds value over "use a Git repo, and an admin sometimes syncs to grafana.com".

codefromthecrypt commented 5 years ago

/me chuckles

mstaalesen commented 5 years ago

  "query": "label_values(http_server_requests_seconds_bucket{job=\"zipkin\"}, instance)",

I think the "workaround" for that would be to just use a metric that we know is unique to Zipkin or, if that doesn't work, to add a regexp that keeps only Zipkin instances.
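A sketch of the first option in Python, for a downloaded copy of the dashboard: point the "Zipkin instances" variable at a Zipkin-specific metric so it no longer needs a job matcher. This assumes every Zipkin instance exposes zipkin_collector_messages_total (the metric shown earlier in this thread) and that the variable lives under templating.list in the dashboard JSON. The function name and the trimmed sample dashboard are hypothetical.

```python
def use_zipkin_metric_for_instances(dashboard,
                                    metric="zipkin_collector_messages_total"):
    """Rewrite the 'Zipkin instances' variable to query a Zipkin-only metric."""
    for variable in dashboard.get("templating", {}).get("list", []):
        if variable.get("label") == "Zipkin instances":
            # No job matcher needed: the metric name itself identifies Zipkin.
            variable["query"] = f"label_values({metric}, instance)"
    return dashboard


# Hypothetical, trimmed dashboard with only the variable of interest.
dashboard = {
    "templating": {
        "list": [
            {
                "label": "Zipkin instances",
                "multi": True,
                "query": 'label_values(http_server_requests_seconds_bucket'
                         '{job="zipkin"}, instance)',
            }
        ]
    }
}

use_zipkin_metric_for_instances(dashboard)
print(dashboard["templating"]["list"][0]["query"])
```

Whether this lists every instance depends on the chosen metric being present on all of them, so a different Zipkin-only metric may be a better fit in some setups.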

As for version control, there really is no good way of doing it right now. I think there is some work being done on saving your dashboard as code. I've seen some talks on the topic, but I don't think it's mature enough to be used by "the masses" yet.