go-graphite / carbonapi

Implementation of graphite API (graphite-web) in golang

[BUG] Fail to render metric in victoriametrics #514

Closed. colixxx closed this issue 4 years ago

colixxx commented 4 years ago

Describe the bug
Hi, I have carbonapi and VictoriaMetrics, and I am trying to get metrics into Grafana. I created a datasource pointing at carbonapi; the metrics themselves live in VictoriaMetrics. When I try to get metrics in Grafana Explore, the graph renders successfully only when the time range is "Last 5 minutes" (screenshot: Grafana 2020-08-07 15-21-07) or when the start of the time range has :00 seconds (screenshot: Grafana 2020-08-07 15-21-59).

But when the time range starts at :13 seconds, for example, I don't see the graph at all (screenshot: Grafana 2020-08-07 15-24-07).

I can't see anything wrong in the logs.
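A hypothetical way to reproduce this outside Grafana (hostname, port, target and timestamps are taken from the log and configuration below; based on the behaviour described above, the second, non-minute-aligned request would be the one that comes back without datapoints):

curl "http://carbonapi:8081/render?format=json&from=1596747600&until=1596834000&target=stats.timers.python-mcs-test.container.external.redis.get.success.count"
curl "http://carbonapi:8081/render?format=json&from=1596747613&until=1596834013&target=stats.timers.python-mcs-test.container.external.redis.get.success.count"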

Logs

2020-08-07T11:59:57.174Z        DEBUG   zipper  will try to fetch data  {"type": "broadcastGroup", "groupName": "root", "type": "fetch", "request": ["stats.timers.python-mcs-test.container.external.redis.get.success.count"]}
2020-08-07T11:59:57.174Z        DEBUG   zipper  single fetch    {"type": "broadcastGroup", "groupName": "root", "type": "fetch", "request": ["stats.timers.python-mcs-test.container.external.redis.get.success.count"], "client": {}}
2020-08-07T11:59:57.174Z        DEBUG   zipper  waiting for slot        {"type": "broadcastGroup", "groupName": "root", "type": "fetch", "request": ["stats.timers.python-mcs-test.container.external.redis.get.success.count"], "backend_name": "prometheus", "max_connections": 0}
2020-08-07T11:59:57.174Z        DEBUG   zipper  got slot        {"type": "broadcastGroup", "groupName": "root", "type": "fetch", "request": ["stats.timers.python-mcs-test.container.external.redis.get.success.count"], "backend_name": "prometheus"}
2020-08-07T11:59:57.174Z        DEBUG   zipper  sending request {"type": "broadcastGroup", "groupName": "root", "type": "fetch", "request": ["stats.timers.python-mcs-test.container.external.redis.get.success.count"], "backend_name": "prometheus"}
2020-08-07T11:59:57.174Z        DEBUG   zipper  will try to fetch data  {"type": "broadcastGroup", "groupName": "prometheus", "type": "fetch", "request": ["stats.timers.python-mcs-test.container.external.redis.get.success.count"]}
2020-08-07T11:59:57.174Z        DEBUG   zipper  single fetch    {"type": "broadcastGroup", "groupName": "prometheus", "type": "fetch", "request": ["stats.timers.python-mcs-test.container.external.redis.get.success.count"], "client": {}}
2020-08-07T11:59:57.174Z        DEBUG   zipper  waiting for slot        {"type": "broadcastGroup", "groupName": "prometheus", "type": "fetch", "request": ["stats.timers.python-mcs-test.container.external.redis.get.success.count"], "backend_name": "http://victoriametrics-victoria-metrics-single-server:8428", "max_connections": 0}
2020-08-07T11:59:57.174Z        DEBUG   zipper  got slot        {"type": "broadcastGroup", "groupName": "prometheus", "type": "fetch", "request": ["stats.timers.python-mcs-test.container.external.redis.get.success.count"], "backend_name": "http://victoriametrics-victoria-metrics-single-server:8428"}
2020-08-07T11:59:57.174Z        DEBUG   zipper  sending request {"type": "broadcastGroup", "groupName": "prometheus", "type": "fetch", "request": ["stats.timers.python-mcs-test.container.external.redis.get.success.count"], "backend_name": "http://victoriametrics-victoria-metrics-single-server:8428"}
2020-08-07T11:59:57.174Z        DEBUG   zipper  got some target to query        {"type": "prometheus", "protocol": "prometheus", "name": "http://victoriametrics-victoria-metrics-single-server:8428", "type": "fetch", "request": "&MultiFetchRequest{Metrics:[{stats.timers.python-mcs-test.container.external.redis.get.success.count 1596747600 1596834000 false stats.timers.python-mcs-test.container.external.redis.get.success.count [] 0}],}", "pathExpr": "stats.timers.python-mcs-test.container.external.redis.get.success.count", "target": "stats.timers.python-mcs-test.container.external.redis.get.success.count"}
2020-08-07T11:59:57.174Z        DEBUG   zipper  will do query   {"type": "prometheus", "protocol": "prometheus", "name": "http://victoriametrics-victoria-metrics-single-server:8428", "type": "fetch", "request": "&MultiFetchRequest{Metrics:[{stats.timers.python-mcs-test.container.external.redis.get.success.count 1596747600 1596834000 false stats.timers.python-mcs-test.container.external.redis.get.success.count [] 0}],}", "query": "{__name__=~\"stats\\\\.timers\\\\.python-mcs-test\\\\.container\\\\.external\\\\.redis\\\\.get\\\\.success\\\\.count\"}", "start": 1596747600, "stop": 1596834000, "step": "15s"}
2020-08-07T11:59:57.174Z        DEBUG   zipper  trying to get slot      {"type": "prometheus", "protocol": "prometheus", "name": "http://victoriametrics-victoria-metrics-single-server:8428", "type": "fetch", "request": "&MultiFetchRequest{Metrics:[{stats.timers.python-mcs-test.container.external.redis.get.success.count 1596747600 1596834000 false stats.timers.python-mcs-test.container.external.redis.get.success.count [] 0}],}", "function": "HttpQuery.doRequest", "server": "http://victoriametrics-victoria-metrics-single-server:8428", "name": "http://victoriametrics-victoria-metrics-single-server:8428", "uri": "http://victoriametrics-victoria-metrics-single-server:8428/api/v1/query_range?query=%7B__name__%3D~%22stats%5C%5C.timers%5C%5C.python-mcs-test%5C%5C.container%5C%5C.external%5C%5C.redis%5C%5C.get%5C%5C.success%5C%5C.count%22%7D&start=1596747600&step=15s&stop=1596834000", "name": "http://victoriametrics-victoria-metrics-single-server:8428"}
2020-08-07T11:59:57.174Z        DEBUG   zipper  got slot for server     {"type": "prometheus", "protocol": "prometheus", "name": "http://victoriametrics-victoria-metrics-single-server:8428", "type": "fetch", "request": "&MultiFetchRequest{Metrics:[{stats.timers.python-mcs-test.container.external.redis.get.success.count 1596747600 1596834000 false stats.timers.python-mcs-test.container.external.redis.get.success.count [] 0}],}", "function": "HttpQuery.doRequest", "server": "http://victoriametrics-victoria-metrics-single-server:8428", "name": "http://victoriametrics-victoria-metrics-single-server:8428", "uri": "http://victoriametrics-victoria-metrics-single-server:8428/api/v1/query_range?query=%7B__name__%3D~%22stats%5C%5C.timers%5C%5C.python-mcs-test%5C%5C.container%5C%5C.external%5C%5C.redis%5C%5C.get%5C%5C.success%5C%5C.count%22%7D&start=1596747600&step=15s&stop=1596834000", "name": "http://victoriametrics-victoria-metrics-single-server:8428"}
2020-08-07T11:59:57.194Z        DEBUG   zipper  got response    {"type": "broadcastGroup", "groupName": "prometheus", "type": "fetch", "request": ["stats.timers.python-mcs-test.container.external.redis.get.success.count"], "backend_name": "http://victoriametrics-victoria-metrics-single-server:8428"}
2020-08-07T11:59:57.194Z        DEBUG   zipper  got some fetch responses        {"type": "broadcastGroup", "groupName": "prometheus", "type": "fetch", "request": ["stats.timers.python-mcs-test.container.external.redis.get.success.count"], "backends_count": 1, "response_count": 1, "have_errors": false, "errors": null, "response_count": 1}
2020-08-07T11:59:57.194Z        DEBUG   zipper  got response    {"type": "broadcastGroup", "groupName": "root", "type": "fetch", "request": ["stats.timers.python-mcs-test.container.external.redis.get.success.count"], "backend_name": "prometheus"}
2020-08-07T11:59:57.194Z        DEBUG   zipper  got some fetch responses        {"type": "broadcastGroup", "groupName": "root", "type": "fetch", "request": ["stats.timers.python-mcs-test.container.external.redis.get.success.count"], "backends_count": 1, "response_count": 1, "have_errors": false, "errors": null, "response_count": 1}
2020-08-07T11:59:57.195Z        INFO    access  request served  {"data": {"handler":"render","carbonapi_uuid":"c5e6c50c-dceb-4a6b-8ef5-d6aaac6e105e","url":"/render","peer_ip":"10.60.14.52","host":"carbonapi:8081","format":"json","use_cache":true,"targets":["stats.timers.python-mcs-test.container.external.redis.get.success.count"],"cache_timeout":60,"metrics":["stats.timers.python-mcs-test.container.external.redis.get.success.count"],"runtime":0.021788468,"http_code":200,"carbonzipper_response_size_bytes":46253,"carbonapi_response_size_bytes":19027,"from":1596747600,"until":1596834000,"max_data_points":1380,"from_raw":"1596747600","until_raw":"1596834000","uri":"/render","from_cache":false,"used_backend_cache":false,"request_headers":{}}}

CarbonAPI Configuration: carbonapi.yaml

listen: "0.0.0.0:8081"
concurency: 1000
cache:
   type: "mem"
   size_mb: 0
   defaultTimeoutSec: 60
   memcachedServers:
       - "127.0.0.1:1234"
       - "127.0.0.2:1235"
cpus: 0
tz: ""
headersToPass:
  - "X-Dashboard-Id"
  - "X-Grafana-Org-Id"
  - "X-Panel-Id"
# functionsConfig:
#     graphiteWeb: ./graphiteWeb.example.yaml
maxBatchSize: 0
graphite:
    host: ""
    interval: "60s"
    prefix: "carbon.api"
    pattern: "{prefix}.{fqdn}"
idleConnections: 10
pidFile: ""
upstreams:
    buckets: 10

    timeouts:
        find: "2s"
        render: "10s"
        connect: "200ms"
    concurrencyLimitPerServer: 0

    keepAliveInterval: "30s"
    maxIdleConnsPerHost: 100
    backendsv2:
        backends:
          -
            groupName: "prometheus"
            protocol: "prometheus"
            lbMethod: "broadcast"
            maxTries: 3
            maxBatchSize: 0
            keepAliveInterval: "10s"
            concurrencyLimit: 0
            maxIdleConnsPerHost: 1000
            # backendOptions:
            #     step: "1"
            #     start: "-5m"
            #     #max_points_per_query: 5000
            timeouts:
                find: "2s"
                render: "50s"
                connect: "200ms"
            servers:
                - "http://victoriametrics-victoria-metrics-single-server:8428"
    graphite09compat: false
# graphTemplates: graphTemplates.example.yaml
expireDelaySec: 10
logger:
    - logger: ""
      file: "stderr"
      level: "debug"
      encoding: "console"
      encodingTime: "iso8601"
      encodingDuration: "seconds"
    - logger: ""
      file: "carbonapi.log"
      level: "info"
      encoding: "json"
Civil commented 4 years ago

This looks like a duplicate of #504.

It would be great if you could look through that issue and verify that you have the same problem as described there.

colixxx commented 4 years ago

@Civil this is a similar problem. But here is a comparison with different start values:

When I run
curl "http://victoriametrics-victoria-metrics-single-server:8428/api/v1/query_range?query=%7B__name__%3D~%22stats%5C%5C.timers%5C%5C.python-mcs-test%5C%5C.container%5C%5C.external%5C%5C.redis%5C%5C.get%5C%5C.success%5C%5C.count%22%7D&start=1596790865&step=60s&end=1596798120"

the first value has timestamp 1596790860.

But when I run
curl "http://victoriametrics-victoria-metrics-single-server:8428/api/v1/query_range?query=%7B__name__%3D~%22stats%5C%5C.timers%5C%5C.python-mcs-test%5C%5C.container%5C%5C.external%5C%5C.redis%5C%5C.get%5C%5C.success%5C%5C.count%22%7D&start=1596790860&step=60s&end=1596798120"

the first value also has timestamp 1596790860.

Maybe Grafana can't render such a response, when the time range in the response exceeds the time range in the request...

I also noticed that with the stop parameter all points up to the end are returned, while the end parameter takes the end of the requested time interval into account.

alexandv commented 4 years ago

@radinv The issue is that VictoriaMetrics aligns the start/stop timestamps with the step parameter when more than 50 values are requested. It does that to allow caching. This causes carbonapi to set all values to NaN. It could be fixed by aligning the timestamps in carbonapi the same way VictoriaMetrics does.
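As a rough illustration of the kind of alignment this would mean on the carbonapi side, here is a minimal Go sketch (alignToStep is a hypothetical helper, not existing carbonapi code, and the exact rounding rules VictoriaMetrics applies, e.g. for the end timestamp, may differ):

package main

import "fmt"

// alignToStep rounds a Unix timestamp (seconds) down to the nearest multiple
// of step (seconds). This only sketches the alignment described above; the
// real VictoriaMetrics rounding rules may differ in detail.
func alignToStep(ts, step int64) int64 {
	if step <= 0 {
		return ts
	}
	return ts - ts%step
}

func main() {
	// The two curl examples above: start=1596790865 and start=1596790860 with
	// step=60s both produce a first datapoint at 1596790860, which is exactly
	// what rounding the request start down to the step gives.
	fmt.Println(alignToStep(1596790865, 60)) // 1596790860
	fmt.Println(alignToStep(1596790860, 60)) // 1596790860
}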

As a workaround, you can disable the cache in VictoriaMetrics using this flag: --search.disableCache.
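For example, with the single-node Docker image (only a sketch; judging by the hostname, the deployment in this issue uses the victoria-metrics-single Helm chart, so the flag would have to be added to that chart's server arguments instead):

docker run -p 8428:8428 victoriametrics/victoria-metrics -search.disableCache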

CarbonAPI already does some caching I think.

colixxx commented 4 years ago

@alexandv I tried the --search.disableCache flag, but it doesn't work... And I don't understand: how can I align the timestamps in carbonapi the same way VictoriaMetrics does it?

alexandv commented 4 years ago

Try updating to the latest version of VictoriaMetrics. I tried it with 1.39.3 and it worked for me with the -search.disableCache flag.

Civil commented 4 years ago

Could you please verify if that's fixed in current master? (Fix is in the commit linked here)