linkerd / linkerd2

Ultralight, security-first service mesh for Kubernetes. Main repo for Linkerd 2.x.
https://linkerd.io
Apache License 2.0

Egress HTTPS Metrics #3190

Open grampelberg opened 5 years ago

grampelberg commented 5 years ago

What problem are you trying to solve?

The rich metrics that Linkerd provides are rarely available for third party services such as github.com because the communication is encrypted from the application all the way to the third party service. The proxy never sees the unencrypted bits.

There should be some solution that allows the proxy to inspect outbound traffic from an application to a third party (or anything outside the mesh), export metrics for that communication and apply policy via service profiles.

Requirements

Any alternatives you've considered?

grampelberg commented 5 years ago

Related #2192.

DavidZisky commented 4 years ago

+1 for this

championshuttler commented 4 years ago

Hey @grampelberg ! I am willing to get mentored for this issue for the Community Bridge program. Can you please help me in getting started with it?

Thanks

vaniisgh commented 4 years ago

Hey! I have a few questions about this issue :) I think this is really interesting and would love to work on it in any capacity. If we don't use a traditional TLS or MITM approach (i.e. we aren't decrypting the requests and forwarding them to the server), does our approach err on the side of guesswork? For example, would we use the IP address of the outbound traffic and DNS to guess where the request is going, and measure the size of the query and response along with other metrics that are available without decryption or knowledge of the exact contents of the request?

Also, does "Must fail encrypted when not in the mesh." mean that all the nodes that haven't been meshed must not have encrypted traffic?

At first glance, the alternatives look easier to implement, which is possibly why this issue looks so appealing haha :)

grampelberg commented 4 years ago

First step would be to jump into Slack. We've got a #contributors channel for all these questions =)

We'd like to get to know y'all a little first, so you'll want to make a couple of contributions. Check out the issues labeled "good first issue" for a list of those.

After that, we'll want to get an RFC together. I'm happy to help you pull all the pieces together on that in Slack =)

"I wanted to ask you that if we don't use say a traditional TLS or MITM approach (i.e we aren't decrypting the requests and forwarding them to the server)."

Applications will likely need to send us unencrypted traffic so that we can do analytics on it. The other option would be some of the eBPF functionality around optimistic TLS. I don't believe that's quite mature enough yet, and it wouldn't hit the proxy anyway. To get unencrypted traffic, we'll likely want to have special domain names and ask application owners to change their connection strings.
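To make the special-domain-name idea concrete, here is a minimal sketch of how a plaintext authority could be mapped back to the real TLS upstream. Nothing here is Linkerd's actual design: the `.egress.example` suffix, the `Upstream` type, and the `resolve_egress` helper are all hypothetical names chosen for illustration, and it assumes hostnames only (no IPv6 literals).

```python
from typing import NamedTuple, Optional

# Hypothetical suffix that an application would append to the real hostname
# in its connection string to opt in to plaintext-to-proxy egress.
EGRESS_SUFFIX = ".egress.example"


class Upstream(NamedTuple):
    host: str   # real external hostname
    port: int   # port the proxy would connect to
    tls: bool   # whether the proxy should originate TLS


def resolve_egress(authority: str) -> Optional[Upstream]:
    """Map a special plaintext authority to the real TLS upstream.

    Returns None when the authority does not carry the special suffix, so the
    caller can leave that traffic untouched.
    """
    # Drop an optional ":port", normalise case, and drop a trailing dot.
    host = authority.rsplit(":", 1)[0].lower().rstrip(".")
    if not host.endswith(EGRESS_SUFFIX):
        return None
    real_host = host[: -len(EGRESS_SUFFIX)]
    if not real_host:
        return None
    # Assumption: the proxy originates TLS to the real host on port 443.
    return Upstream(host=real_host, port=443, tls=True)
```

With this sketch, an application that changes its connection string to `http://github.com.egress.example` would have its requests seen in plaintext by the proxy, which could then connect to `github.com:443` over TLS on its behalf.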

wmorgan commented 4 years ago

There is a solution for this here: https://github.com/grampelberg/k8s-egress

m1o1 commented 2 years ago

Our security team wasn't really open to the idea of a single (egress) service receiving all the unencrypted egress traffic of the cluster. I think they would be more willing to consider it if the sidecar proxy did it and the client application sent the traffic to localhost (if the pod isn't meshed, the request would fail).

wmorgan commented 2 years ago

Egress control at the pod level is on the roadmap. Won't be in 2.12 itself but perhaps in 2.13? https://linkerd.io/2021/12/29/the-service-mesh-in-2022/

eloo-abi commented 1 year ago

Hi there, I was just searching for exactly this feature in Linkerd :P

But sadly it looks like this feature was not included in 2.13, which has already been released?

Is this feature still on the roadmap and can we estimate a milestone for this?

Thanks

channyein87 commented 1 year ago

Really need this feature to monitor the external traffic from the pod for observability.

xsoheilalizadeh commented 7 months ago

This is a very practical feature for us. It would be great to see it in upcoming versions.

kflynn commented 7 months ago

This is still on the roadmap, although we don't have a definitive milestone for it yet. 😐

If any of y'all have a solid concept of what you think this should look like, we'd love to hear it – e.g. are you thinking of command-line support in linkerd viz? simply having metrics posted to Prometheus? splitting out metrics by egress destination? ???

zip-chanko commented 5 months ago

My current challenge is analyzing outgoing traffic from the cluster: how the pods access external resources such as the internet, databases, etc. There also isn't any way to allow only whitelisted DNS domains, since the built-in network policy cannot do that. We currently use VPC flow logs, but they aren't really reliable when investigating connectivity issues. For example, one of our workloads saw a lot of timeouts calling an external resource behind an NLB, and we were unable to trace them with the available telemetry data. We were left with too many assumptions about whether the cause was the workload itself, cluster networking, or the NLB.
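As a rough illustration of what domain-based allowlisting could look like, here is a small sketch of the matching logic only. It is not an existing Linkerd or Kubernetes feature; `is_allowed` and the example domains are hypothetical, and it assumes an allowlisted domain also covers its subdomains.

```python
from typing import FrozenSet


def is_allowed(host: str, allowed_domains: FrozenSet[str]) -> bool:
    """Return True if host equals an allowlisted domain or is a subdomain of one."""
    labels = host.lower().rstrip(".").split(".")
    # Check the host itself and then each parent domain, matching on whole
    # labels so that "evilgithub.com" does not match "github.com".
    return any(".".join(labels[i:]) in allowed_domains for i in range(len(labels)))


# Hypothetical allowlist for illustration.
ALLOWED = frozenset({"github.com", "db.example.org"})
```

Here `is_allowed("api.github.com", ALLOWED)` passes because `github.com` is a parent domain, while `is_allowed("evilgithub.com", ALLOWED)` does not, since matching is done per DNS label and not by string suffix.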

adrian-gierakowski commented 4 months ago

Would be great to have visibility into the number of requests and success rates per external domain. Path-level stats would be a great bonus.
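A minimal sketch of the per-domain aggregation being asked for, assuming the proxy can see the destination host and HTTP status of each egress request. The `EgressStats` class is hypothetical and not part of Linkerd, and counting any status below 500 as a success is an assumption made for this example.

```python
from collections import defaultdict
from typing import Optional


class EgressStats:
    """Counts requests and successes per external domain (illustrative only)."""

    def __init__(self) -> None:
        self._total = defaultdict(int)
        self._success = defaultdict(int)

    def record(self, domain: str, status: int) -> None:
        key = domain.lower()
        self._total[key] += 1
        # Assumption: only 5xx responses count as failures.
        if status < 500:
            self._success[key] += 1

    def request_count(self, domain: str) -> int:
        return self._total.get(domain.lower(), 0)

    def success_rate(self, domain: str) -> Optional[float]:
        """Fraction of successful requests, or None if no requests were seen."""
        key = domain.lower()
        total = self._total.get(key, 0)
        if total == 0:
            return None
        return self._success.get(key, 0) / total
```

Path-level stats could follow the same shape by keying on a `(domain, path)` pair, although unbounded paths would need some grouping to keep the number of series manageable.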