jacksontj / promxy

An aggregating proxy to enable HA prometheus
MIT License

422 Unprocessable Entity on random queries #687

Open paulojmdias opened 4 days ago

paulojmdias commented 4 days ago

Hi,

I'm seeing intermittent HTTP 422 responses on queries that come from Grafana. When I run 20+ simultaneous curl requests, some of them randomly return HTTP 422. Even with debug logs enabled, I don't see anything wrong.
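To put numbers on the "20+ simultaneous requests" test, the status codes can be tallied per batch. This is a minimal sketch, not a promxy tool: `PROMXY_URL`, the `up` query, and the helper names `query_status` and `tally_statuses` are all assumptions made for illustration. It assumes promxy serves the Prometheus-compatible `/api/v1/query` endpoint at that address.

```python
import urllib.error
import urllib.parse
import urllib.request
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical promxy address; adjust to your deployment.
PROMXY_URL = "http://localhost:8082"


def query_status(base_url, promql, timeout=10.0):
    """Run one instant query and return the HTTP status code."""
    params = urllib.parse.urlencode({"query": promql})
    url = base_url + "/api/v1/query?" + params
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is still what we want.
        # Connection-level failures (URLError) are left to propagate.
        return err.code


def tally_statuses(fetch, count, workers=20):
    """Call fetch() `count` times concurrently and count each result."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(fetch) for _ in range(count)]
        return Counter(f.result() for f in futures)
```

Calling `tally_statuses(lambda: query_status(PROMXY_URL, "up"), count=50)` returns a `Counter` keyed by status code, so the share of 422 responses can be compared across runs or configuration changes.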

What could cause this? Is there anything I can improve in the configuration?

image
promxy:
  server_groups:
  - http_client:
      dial_timeout: 1s
      tls_config:
        insecure_skip_verify: true
    http_headers:
      X-Scope-OrgID: org1|org2
    path_prefix: /prometheus
    remote_read: false
    scheme: https
    static_configs:
    - targets:
      - dns.domain.com:8080

On the Grafana side I see HTTP 500, which is odd since I don't see it in the Promxy logs.

image

jacksontj commented 3 days ago

Hmm, that is curious -- haven't seen that error before. Would it be possible to get debug or trace logs? Alternatively a tcpdump would be great (but a bit hard to anonymize).

The Grafana-side error says "error in servergroup ord=0", which really sounds like promxy returned an error (422 or 500 -- unclear since they disagree). If we could get some logs or tcpdumps from promxy, we would have a better idea of what happened there.

A 422 code means the server understood the request but could not process it -- so maybe there is some encoding or escaping issue with the subqueries being sent to the downstreams (maybe promxy is getting 422s from a downstream and returning a 500)?
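One way to test the escaping hypothesis is to send the same query straight to the downstream and compare status codes. The sketch below only builds that request. The scheme, target, path prefix, and header are copied from the config in this issue. The `/api/v1/query` suffix after `path_prefix` and the helper name `build_downstream_request` are assumptions.

```python
import urllib.parse
import urllib.request

# Values copied from the server_groups config in this issue.
SCHEME = "https"
TARGET = "dns.domain.com:8080"
PATH_PREFIX = "/prometheus"
HEADERS = {"X-Scope-OrgID": "org1|org2"}


def build_downstream_request(promql):
    """Build a GET request for an instant query against the downstream."""
    # urlencode percent-escapes characters such as {, }, ", [ and ],
    # and turns spaces into '+', so the expression survives transport.
    params = urllib.parse.urlencode({"query": promql})
    # Assumption: the query API lives at <path_prefix>/api/v1/query.
    url = f"{SCHEME}://{TARGET}{PATH_PREFIX}/api/v1/query?{params}"
    return urllib.request.Request(url, headers=HEADERS)
```

Passing the result to `urllib.request.urlopen` and catching `urllib.error.HTTPError` exposes the downstream's own status code and error body. Because the config sets `insecure_skip_verify: true`, a real call would likely also need an `ssl` context with certificate verification disabled. If the downstream returns 422 for a query that promxy also fails on, the problem is probably not in promxy's encoding.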

paulojmdias commented 2 days ago

@jacksontj I will enable trace logs next week and I will share them here.