andrewl3wis opened this issue 3 years ago (status: Open)
Thanks for your report @andrewl3wis. I'm wondering if filtering out the HA replica label name from the matchers received in the remote read request would fix the issue (and whether it's safe to do).
@gouthamve @cstyan Thoughts about it?
This is indeed a good point, and I think stripping the replica label from the matchers might be safe when both the cluster and replica labels are there.
Having said that, I think this is better tackled in Prometheus itself, by being able to force some matchers to be sent. Similar to what we have now:
# An optional list of equality matchers which have to be
# present in a selector to query the remote read endpoint.
required_matchers:
[ <labelname>: <labelvalue> ... ]
We could add:
# An optional list of equality matchers which will be
# added when querying the remote read endpoint.
additional_matchers:
[ <labelname>: <labelvalue> ... ]
Then users can set additional_matchers: replica: '' to fix this. Thoughts @bwplotka?
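For illustration only, a minimal sketch of how such an option could look inside a remote_read block; additional_matchers is hypothetical (it does not exist in Prometheus today) and the URL is a placeholder:

remote_read:
  - url: http://cortex/api/v1/read
    # Hypothetical option: matchers that Prometheus would add to every
    # remote read request, regardless of the user's query.
    additional_matchers:
      replica: ""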
Well I believe this idea is literally: https://github.com/prometheus/prometheus/issues/7992
It was rejected, but we can rethink this; I would be happy to add it.
I don't get one thing: where do you use Remote Read in Cortex?
We don't use remote read directly in Cortex but we do support it, so you can configure Prometheus to read from Cortex.
Got it!
Yep, this is https://github.com/prometheus/prometheus/issues/7992, I would also consider rethinking the behavior in Prometheus itself.
I think you can work around this by adding {prometheus_replica=""} to your query, or maybe {prometheus_replica=~".*"} depending on your setup, replacing prometheus_replica with whatever you use as the deduplicating label.
So, as the end user, my perspective is that because Cortex is "eating" the tag due to HA deduplication, maybe the fix should live in the Cortex ecosystem, or at least there should be a heads up around the HA config setup.
It's kind of a non-obvious gotcha that I think has bitten a few other folks, based on messages in the Cortex Slack.
I am open to chasing this down in the prom ecosystem, but then I have to chase it down with the prom-operator to have reads strip tags off, as well as any other tools that speak to Prometheus natively.
I agree with you, Cortex should probably handle this since it is removing the HA label. I think I agree with the top comments that filtering out the HA label in remote read would be safe and the correct thing to do.
I have run into this same problem.
I have Cortex running with HA enabled. My Prometheus retention is set to 1h, with remote read/write pointing to Cortex. When I try to query Prometheus for data older than 1h, I get empty results. This was working before I enabled HA.
I managed to find and use the "filterExternalLabels" option under RemoteReadSpec [https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#remotereadspec]. However, the queries now return two results: one with the external labels and one without any external labels. Because of this, my Grafana charts do not display the right data.
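For reference, a minimal sketch of the kind of operator configuration being described here, assuming a Prometheus custom resource with remote read pointing at Cortex; the resource name and URL are placeholders, and filterExternalLabels is presumably set to false so external labels (including the HA replica label) are not sent as selectors:

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
spec:
  retention: 1h
  remoteRead:
    - url: http://cortex-query-frontend/prometheus/api/v1/read
      readRecent: true
      # Controls whether external labels are used as selectors for the
      # remote read endpoint; false stops the HA replica label from being
      # added to the request matchers.
      filterExternalLabels: false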
Is there an option similar to filterExternalLabels for the Prometheus TSDB, just like for remoteRead? Or is there an option in Prometheus to always query remote storage and not local storage (maybe set the TSDB retention to 0)?
Are we going to have a fix in Cortex anytime soon, or are there any workarounds? PS: I do not want to update hundreds of charts to exclude the HA label from every query.
IIUC there is no one working on this at the moment. Feel free to open a PR to fix the behavior. I think I am OK with making it a feature flag or making it the default behavior.
As a temporary workaround, maybe you can check https://github.com/prometheus-community/prom-label-proxy. I am unsure if it supports remote read; it seems it doesn't. If it did support the remote read API, it would be easy to inject the additional matchers for every query you have.
I have found a workaround: https://cortexmetrics.io/docs/guides/ha-pair-handling/#remote-read.
Describe the bug
If HA dedup is enabled, downstream Prometheus can't read because it sends the read request with the HA pair's label.
This shows up as empty data in the Prometheus that is attempting to read the data.
If you force the HA label off at the Prometheus endpoint that is trying to remote read, it works fine.
In the kube-prometheus helm chart this is controlled with:
replicaExternalLabelNameClear: true
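For context, assuming the kube-prometheus-stack style values layout (the nesting below is an assumption about the chart, not something stated in this issue), that flag would look roughly like:

prometheus:
  prometheusSpec:
    # Clears the replica external label (prometheus_replica by default) so it
    # is never added as an external label, and therefore never sent as a
    # remote read matcher.
    replicaExternalLabelNameClear: true

Note the caveat in the Additional Context section below: clearing the label had side effects on remote write in the reporter's setup.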
To Reproduce
Steps to reproduce the behavior:
Expected behavior
To be able to read remotely when HA dedup is set up.
Environment:
Storage Engine: it fails in both; we now run blocks.
Additional Context
We can work around this by stripping the replica label off (as above), but that then triggers a bunch of errors about Prometheus not sending successfully to the remote target, which in turn triggers our alerting.
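For context, a minimal sketch of the HA setup this issue revolves around, assuming the conventions from the Cortex HA pair handling guide linked above (label names and values are placeholders for whatever your deployment uses).

On the Prometheus side, each replica in the pair carries a cluster and a replica external label:

global:
  external_labels:
    cluster: prom-team1
    __replica__: replica-1

On the Cortex side, HA deduplication accepts samples from the pair and drops the replica label at ingest, which is why it is no longer there to match remote read requests against:

limits:
  accept_ha_samples: true
  ha_cluster_label: cluster
  ha_replica_label: __replica__
  # The HA tracker itself also has to be enabled under the distributor
  # configuration; omitted here for brevity.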