wish / katalog-sync

A reliable node-local mechanism for syncing k8s pods to consul services
MIT License

NodePort and node ip #29

Open danielmotaleite opened 5 years ago

danielmotaleite commented 5 years ago

I'm trying to use this and I'm having several problems, so this is a combined question/bug/feature-request ticket. I can open separate issues for each bug or feature request, but let's first work out what is a bug, what is a missing feature, and what is user error.

The first one is that I can only register deployments/pods in Consul; when I try to register a service, nothing shows up in Consul. So does this support service registration? Can it be added?

This is what I'm trying to do:

apiVersion: v1
kind: Service
metadata:
  name: thumbor
  namespace: thumbor-dev
  labels:
    run: thumbor
  annotations:
    katalog-sync.wish.com/service-names: thumbor-nodeport
spec:
  selector:
    app: thumbor
  type: NodePort
  ports:
    - port: 9900
      nodePort: 31111

The second problem is that, with pod registration, the pods show up in Consul DNS with their internal Kubernetes IP. This is fine for access from inside Kubernetes, but I need to use this from outside Kubernetes, so I need the NodeIP and NodePort. As the NodePort is managed by the Service, this is probably the same as the request above... or am I missing something?

With a daemonset using hostPort, I get the correct IP and ports, so this part works fine.

Thanks for the help.

jacksontj commented 5 years ago

First off, thanks for reaching out! Always happy to see people using our software :)

> The first one is that I can only register deployments/pods in Consul; when I try to register a service, nothing shows up in Consul. So does this support service registration?

katalog-sync is a node-local sync service; as such it only uses the kubelet API, which has no information about which service a pod belongs to. That being said, we could theoretically add it by having each node watch the services and then use the matchers to determine if any local pods match the service. This would require some work and refactoring but is definitely within the realm of possibility.
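For contrast, pod-level registration (the case reported as working above) is driven by annotations on the pod itself, which is what the kubelet API exposes. A minimal sketch, assuming the katalog-sync.wish.com/service-names annotation from the Service manifest above is moved to a Deployment's pod template; the image name and replica count are placeholders, not taken from the thread:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: thumbor
  namespace: thumbor-dev
spec:
  replicas: 2                      # placeholder value
  selector:
    matchLabels:
      app: thumbor
  template:
    metadata:
      labels:
        app: thumbor
      annotations:
        # Annotation sits on the pod, not on the Service, so the
        # node-local sync can see it through the kubelet API.
        katalog-sync.wish.com/service-names: thumbor
    spec:
      containers:
        - name: thumbor
          image: example/thumbor:latest   # hypothetical image name
          ports:
            - containerPort: 9900
```

With this layout each pod would be registered with its pod IP, which is exactly the behavior the second problem in this thread is about.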

> The second problem is that, with pod registration, the pods show up in Consul DNS with their internal Kubernetes IP.

This is another interesting one: Kubernetes networking is completely pluggable, so some networks are directly routable and others aren't. In this case it sounds like you have an overlay network and are trying to sync the NodePort services into Consul (not the pods themselves)? Meaning you want all N nodes in the cluster registered with the service's NodePort, right? If so, that's also something that could be added, but it hasn't been a priority (since we run a non-overlay k8s cluster).

danielmotaleite commented 5 years ago

Thanks for the reply and the app. I think it has a much better design than consul-sync, more in line with the HashiCorp tools and Unix design, with one service per node so that it is completely redundant.

> That being said, we could theoretically add it by having each node watch the services and then use the matchers to determine if any local pods match the service.

So for the first one, yes, I would love to be able to map a service only to the nodes that actually have that app running. I dislike the Kubernetes design where traffic can reach the service on any node and is then redirected to the node where the app is running. Google designed it like that because they can afford to waste hardware.

If I already know where the app is running, I could use that node directly, with no need to burden other machines' network/system + iptables + SNAT with unneeded traffic and higher latency. Some services are light, others are heavy, and being able to choose how traffic reaches an app is a good thing. katalog-sync + Consul looks like a perfect way to work around the Google design and could be a killer feature for bare-metal Kubernetes setups.

> In this case it sounds like you have an overlay network

For the second one, not exactly. Maybe I was trying to work around a bug or a missing feature and went about it the wrong way. Let me go back to what I want to deploy so you can better understand what I mean:

I have a thumbor service to resize images, and I need a service outside Kubernetes (a Varnish) to connect to it. While I could use nginx-ingress, I'm trying to bypass it; I really don't need another layer yet, and Varnish and thumbor work fine together. I actually have another service that can't even work with nginx-ingress and also needs this. But anyway, let's keep this to just thumbor, to be simpler!

As thumbor is single-threaded, I need several of them running on the same node, so I can't really use a daemonset. But when I use a deployment with hostPort or a service with nodePort, katalog-sync fails to register the node IP; it only registers the pod IP (the entry is associated with the correct Consul node, but I can't fetch that node's IP). So Varnish can't use that Consul DNS name, as the pod IP is in the Kubernetes network. If Consul had the node IP, I could connect to the nodes + hostPorts and talk directly to thumbor (or go via a service with nodePort, if the first feature exists).

I would like to be able to either flag katalog-sync to register the node IP instead of the pod IP, or have it automatically detect a hostPort and register the node IP for that app. For apps with both Kubernetes IPs and hostPorts, or a sidecar with any port combination, it would be tricky to know which IP to use with which port, so the last option seems harder.

Probably the best solution is a new option, say katalog-sync.wish.com/service-nodeip-SERVICE-NAME, a boolean (1/0 or true/false, or both), which would tell katalog-sync whether to register the node IP or the pod IP for that service name.
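To make the proposal concrete, here is a sketch of how the suggested option might look on a pod template for a service named thumbor. The service-nodeip-thumbor annotation is only the naming proposed in this comment; it is not an existing katalog-sync option, and its value format is an assumption:

```yaml
# Fragment of a pod template (e.g. under spec.template in a Deployment).
metadata:
  labels:
    app: thumbor
  annotations:
    # Service name taken from the manifest earlier in this issue.
    katalog-sync.wish.com/service-names: thumbor
    # PROPOSED ONLY: hypothetical per-service flag asking for the node IP
    # to be registered instead of the pod IP. Annotation values must be
    # strings in Kubernetes, hence the quotes.
    katalog-sync.wish.com/service-nodeip-thumbor: "true"
```

Keeping the flag per service name would let a pod that exposes several services choose node IP or pod IP for each one separately.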

I hope the 2 issues I have hit are clear now; they look like they are really 2 new feature requests! :) Do you want me to move the second one to a new ticket to keep one feature per issue?

Again, thanks for katalog-sync!

jacksontj commented 5 years ago

For #1 I've created https://github.com/wish/katalog-sync/issues/32 -- we'll need to work through some specifics around failure domain, but that's definitely a possibility (although likely not a priority for a while; PRs are always welcome).

As for the second issue, I think I understand the problem you have. It sounds like your pod IPs aren't routable externally (likely because you are using some overlay network). By contrast, if you use the aws-cni plugin (EKS clusters use it), then the pod IPs are just IPs in the VPC, so they are all directly routable. In this direct-routing case we never want the NodeIP, since the PodIP is directly routable. In your situation you want the NodeIP for any node that is running a pod in a service. This could be doable (probably some additional flag on #32), although it'll likely be a bit odd as it's not a 1-1 mapping (you might have N pods on the same node, but we'd have to add the node only once). Effectively this would have the same effect/issue as setting externalTrafficPolicy = Local (which might be something for you to look into).
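For reference, a minimal sketch of what setting externalTrafficPolicy to Local could look like on the thumbor Service from the first comment. The manifest reuses the names from that comment, with the katalog-sync annotation left out since Service registration is not supported; only the externalTrafficPolicy line is new:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: thumbor
  namespace: thumbor-dev
  labels:
    run: thumbor
spec:
  selector:
    app: thumbor
  type: NodePort
  # Only route NodePort traffic to pods on the node that received it;
  # traffic is not forwarded to pods on other nodes.
  externalTrafficPolicy: Local
  ports:
    - port: 9900
      nodePort: 31111
```

With this policy, traffic arriving at a node that has no local thumbor pod is dropped rather than forwarded to another node, so an external client such as Varnish would still need to know which nodes currently run the pods.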