Open adleong opened 1 year ago
Thanks for raising this, @ngc4579. It's possible that additional AuthorizationPolicies are needed for Prometheus federation. This will require some investigation.
This policy was suggested by Michelle B on the Linkerd Slack (link will expire in 90 days):
apiVersion: policy.linkerd.io/v1alpha1
kind: AuthorizationPolicy
metadata:
  name: prometheus-admin-federate
  namespace: linkerd-viz
spec:
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: prometheus-admin
  requiredAuthenticationRefs:
    - group: policy.linkerd.io
      kind: NetworkAuthentication
      name: kubelet
Thanks so much @adleong @wmorgan for your answers. The mentioned AuthorizationPolicy actually did help; federation works as expected now. If this policy is intentionally required, I guess this should be reflected in the docs. (Or else, if it already is, it seems I wasn't able to find it. :) )
We have set up linkerd-viz with an external Prometheus, and after the upgrade we are getting the following errors:
time="2023-06-26T12:34:55Z" level=error msg="queryProm failed with: Query failed: \"sum(increase(response_total{deployment=\\\"app-prod-http\\\", direction=\\\"outbound\\\", namespace=\\\"web\\\"}[1m])) by (dst_namespace, dst_deployment, classification, tls)\": Post \"https://external-endpoint/api/v1/query\": context canceled"
Would anybody like to submit a PR with this policy included? It should be pretty straightforward.
@prajithp13 Did you apply the policy?
@alpeb I'd like to pick this up. I'm learning Linkerd and service meshes in general and would also like to contribute to the project; this seems like a good issue to start with.
@deepto98 sounds great, please proceed!
@deepto98 Are you working on this? If not, I would be willing to tackle this issue :)
I'll pick this up this week
Did a PR for this issue ever get created?
Hey is there any progress on this issue?
@ioannatheo there is a workaround: apply the policy YAML pasted above. A PR to add it by default would be welcome.
I am actively working on this. I think I have a pretty good understanding of what needs to be done. Track progress: https://github.com/francRang/linkerd2. Give me 1-2 days max and I should be able to get it ready for review.
I am assuming @adleong used the Helm chart. I used the Helm chart and am seeing the same issue.
The kubelet NetworkAuthentication is meant for probes from the kubelet. The default definition provided in the Helm chart is a catch-all (everything matches it), so we are letting everything in. It might work as a workaround only because it is a catch-all (not an effective identity).
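To illustrate the concern, a more narrowly scoped alternative might look roughly like the sketch below. It assumes the federating Prometheus reaches the linkerd-viz Prometheus from a known address range. The CIDR `10.0.0.0/24` and the names `prometheus-federate` and `prometheus-admin-federate-scoped` are hypothetical placeholders, not values taken from the chart, so adjust them to your cluster before trying this.

```yaml
# Hypothetical: a NetworkAuthentication limited to the federating Prometheus' network.
apiVersion: policy.linkerd.io/v1alpha1
kind: NetworkAuthentication
metadata:
  name: prometheus-federate        # hypothetical name
  namespace: linkerd-viz
spec:
  networks:
    - cidr: 10.0.0.0/24            # placeholder; replace with the real source range
---
# Same shape as the policy pasted earlier, but referencing the scoped
# NetworkAuthentication instead of the catch-all kubelet one.
apiVersion: policy.linkerd.io/v1alpha1
kind: AuthorizationPolicy
metadata:
  name: prometheus-admin-federate-scoped   # hypothetical name
  namespace: linkerd-viz
spec:
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: prometheus-admin
  requiredAuthenticationRefs:
    - group: policy.linkerd.io
      kind: NetworkAuthentication
      name: prometheus-federate
```

Whether a network-based rule like this is the right fix, rather than an identity-based one for a meshed client, still depends on how the external Prometheus is deployed.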
Discussed in https://github.com/linkerd/linkerd2/discussions/11044