cncf / demo

Demo of CNCF technologies
https://cncf.io
Apache License 2.0

Integrate LinkerD #192

Open ronaldpetty opened 7 years ago

ronaldpetty commented 7 years ago

Need to update demo app to include LinkerD usage.

namliz commented 7 years ago

I'd be very interested in a discussion of the demarcation lines between k8s and LinkerD features. There seems to be a bit of overlap, and I'm currently mulling this over.

leecalcote commented 7 years ago

I am as well.

dankohn commented 7 years ago

@wmorgan Could you please offer some suggestions for incorporating a linkerd demo into the code? We're particularly interested in use cases that work with our sample applications but are not possible with vanilla K8s.

wmorgan commented 7 years ago

Generally speaking, the demarcation is at request-level, as opposed to connection-level, features. E.g. K8s provides layer 3/4 load balancing, which allows you to balance connections across a service. Linkerd provides request-level load balancing, which allows you to balance requests across a service. Request-level balancing is specific to the protocol, but allows you to do fancy things like pick destinations based on latency or error rates, propagate context metadata, etc. (whereas connection-level balancing can only pick based on reachability).
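To make that distinction concrete, here is a minimal, hypothetical sketch of a request-level decision: every request consults per-endpoint latency estimates before picking a destination. This is nothing like Linkerd's actual implementation (its real balancers, e.g. the EWMA option, are far more sophisticated); it only illustrates the request-vs-connection contrast.

```python
import time

class LatencyAwareBalancer:
    """Toy request-level balancer: each request goes to the endpoint with
    the lowest exponentially weighted moving-average (EWMA) latency."""

    def __init__(self, endpoints, alpha=0.3):
        self.alpha = alpha                        # EWMA smoothing factor
        self.ewma = {ep: 0.0 for ep in endpoints}

    def pick(self):
        # A fresh decision per *request*; a connection-level (L3/L4)
        # balancer decides once per connection, on reachability alone.
        return min(self.ewma, key=self.ewma.get)

    def record(self, endpoint, latency_s):
        prev = self.ewma[endpoint]
        self.ewma[endpoint] = (1 - self.alpha) * prev + self.alpha * latency_s

def call(balancer, do_request):
    ep = balancer.pick()
    start = time.monotonic()
    try:
        return do_request(ep)        # do_request is supplied by the caller
    finally:
        balancer.record(ep, time.monotonic() - start)
```

The point is only that the picker sees per-request latency at all; a reachability-only balancer has no equivalent signal to act on.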

We have a couple of blog posts that may be helpful in understanding some of the use cases:

  1. Service metrics (success rates, latencies, etc): https://blog.buoyant.io/2016/10/04/a-service-mesh-for-kubernetes-part-i-top-line-service-metrics/
  2. TLS between nodes without the application being involved: https://blog.buoyant.io/2016/10/24/a-service-mesh-for-kubernetes-part-iii-encrypting-all-the-things/
  3. Percentage-based blue-green deploys: https://blog.buoyant.io/2016/11/04/a-service-mesh-for-kubernetes-part-iv-continuous-deployment-via-traffic-shifting/
  4. Using different services on a per-request basis (see the sketch after this list): https://blog.buoyant.io/2016/11/18/a-service-mesh-for-kubernetes-part-v-dogfood-environments-ingress-and-edge-routing/
  5. Distributed tracing: https://blog.buoyant.io/2017/03/14/a-service-mesh-for-kubernetes-part-vii-distributed-tracing-made-easy/
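
As a hedged illustration of item 4: linkerd (1.x) can honor per-request routing overrides supplied in an l5d-dtab header, so a single request can be steered at a different backend without redeploying anything. The host and dtab names below are made up, and this assumes the request transits a linkerd instance configured to accept dtab overrides.

```python
import requests

# Hypothetical host and dtab entries; assumes linkerd is in the request
# path and accepts per-request dtab overrides.
resp = requests.get(
    "http://myservice.example.com/api/items",
    headers={
        # Steer just this request's calls to the staging build of "foo".
        "l5d-dtab": "/svc/foo => /svc/foo-staging",
    },
)
print(resp.status_code)
```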

Those blog posts all have working examples of layering Linkerd + K8s. The first one also uses Prometheus. We have an upcoming blog post that will show off gRPC.

Does that help?

namliz commented 7 years ago

These are great; the first and last posts in that list crystallized some things for me. I'm going to try some ideas out and update this issue.

wmorgan commented 7 years ago

Putting a couple of notes here from an email exchange w/@zilman:

HTTP load generation from one set of pods against a horizontally scalable service endpoint, plus some CPU-intensive background jobs. I've actually explored autoscaling the endpoint (which is essentially DDoS'ed) and it is no trivial matter to get right with the built-in autoscaler plus custom metrics. Lots of intuition-based fiddling.

Yes, this is the problem with autoscaling by CPU! It's a bad metric to use because CPU utilization is very indirectly correlated with "we need more bandwidth" for I/O-bound services. One thing that's been on our blog post to-do list for a looong time is autoscaling based on latency: Linkerd service latency => Prometheus aggregation => k8s autoscaling. That would be a cool demo and would probably require way less tuning.
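
A rough sketch of that latency => Prometheus => autoscaler loop, under loudly stated assumptions: the Prometheus address, the Linkerd-derived metric name, and the deployment name below are all placeholders, and a production setup would use the Horizontal Pod Autoscaler with custom metrics rather than a hand-rolled control loop.

```python
import subprocess
import requests

PROM_URL = "http://prometheus.example.com:9090"  # assumed Prometheus address
# Hypothetical metric; the actual series depends on how Linkerd's stats
# are scraped into Prometheus.
QUERY = ('histogram_quantile(0.99, sum(rate('
         'request_latency_ms_bucket{service="myapp"}[1m])) by (le))')
TARGET_P99_MS = 200.0
MIN_REPLICAS, MAX_REPLICAS = 2, 10

def current_p99_ms():
    resp = requests.get(f"{PROM_URL}/api/v1/query", params={"query": QUERY})
    resp.raise_for_status()
    result = resp.json()["data"]["result"]
    return float(result[0]["value"][1]) if result else 0.0

def scale_to(replicas):
    subprocess.run(
        ["kubectl", "scale", "deployment/myapp", f"--replicas={replicas}"],
        check=True,
    )

def reconcile(current_replicas):
    p99 = current_p99_ms()
    if p99 > TARGET_P99_MS and current_replicas < MAX_REPLICAS:
        scale_to(current_replicas + 1)   # too slow: add a replica
    elif p99 < TARGET_P99_MS / 2 and current_replicas > MIN_REPLICAS:
        scale_to(current_replicas - 1)   # comfortably fast: shed one
```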

I think with LinkerD I can modify the demo as follows: induce degraded performance for some of these pods (by allocating CPU-intensive background work unequally across nodes and causing resource contention on purpose; hopefully that induces laggy responses) and demonstrate how the load is spread better, and thus how the autoscaling becomes more efficient.

Even without autoscaling, you should be able to demonstrate that using Linkerd can decrease end-to-end latency under conditions of unequal endpoint latency. See e.g. https://blog.buoyant.io/2017/01/31/making-things-faster-by-adding-more-steps/
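
For inducing the unequal endpoint latency that experiment needs, a deliberately crude sketch like the one below, run on a subset of nodes, should create the resource contention described above. The duration and busywork are arbitrary choices, not anything the demo prescribes.

```python
import multiprocessing
import time

def burn_cpu(duration_s=300):
    """Spin on cheap arithmetic to create CPU contention on this node,
    so co-scheduled service pods respond more slowly."""
    deadline = time.monotonic() + duration_s
    x = 0
    while time.monotonic() < deadline:
        x = (x * 1103515245 + 12345) % (2 ** 31)  # throwaway busywork

if __name__ == "__main__":
    # Occupy every core for five minutes; pods on this node should now
    # show noticeably higher latency than pods on unloaded nodes.
    workers = [multiprocessing.Process(target=burn_cpu)
               for _ in range(multiprocessing.cpu_count())]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```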