kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0

Iptables forwarding routes for ClusterIP services #27161

Closed paralin closed 7 years ago

paralin commented 8 years ago

I have the following use case:

I need to route from non-kubernetes servers to Kubernetes ClusterIPs. I already have the routes set up properly to FORWARD to pod IP addresses, and this works perfectly (I can access pod IP addresses from outside the cluster). But nothing I've tried seems to get the service routing to work properly.

How can I set this up properly? Is this something that potentially should be included in Kubernetes?

adohe-zz commented 8 years ago

Typically, services and pods have IPs that are only routable within the cluster network. IIUC, you want to access services from outside the cluster; this is what Ingress does. PTAL.

paralin commented 8 years ago

@adohe I'm well familiar with ingress but it should be possible for other servers on the same private subnet as the cluster to route traffic to services.

As I mentioned before I already have pod routing working properly.

adohe-zz commented 8 years ago

/cc @thockin

chrislovecnm commented 8 years ago

I forgot that @paralin opened this.

Requirements:

Software applications Impacted:

Cassandra Use Case:

Specifically, Cassandra instances CANNOT be load balanced. A client external to the cluster must see every Cassandra IP address, and when running multiple datacenters across K8s deployments, all nodes must be able to talk to each other. Federation will assist, but since Cassandra provides amazing HA, I would almost recommend letting Cassandra handle this itself. A SeedProvider and SnitchProvider for Cassandra could utilize multiple K8s endpoints for discovery.

Cassandra routinely uses IP addresses instead of DNS, and with Petsets ... voila!

Ubernetes Multiple Cassandra Datacenters - every node must be able to talk to every other node. DC1 eight nodes, DC2 eight nodes. All 16 nodes must be able to look up and talk to each other.

External Clients - For instance, the Cassandra Java driver needs to be able to communicate with every server in a C* ring. Going through a load balancer just does not fit multiple features that C* clients rely on.

Items Impacted:

@bgrant0607 who is the networking super guru that can assist? @mward29 pulling you in because of your AWS / Cassandra awesomeness. @bprashanth / @smarterclayton I think PetSet is fully ready for this 😄

cc @kubernetes/examples - do we have other use cases in apps?

Once this is vetted, I can put in a formal proposal after 1.3 launch ... till then I am SUPER busy

paralin commented 8 years ago

And if anyone here who is an amazing network guru wants to help me with my trivial little routing problem (unrelated to the Services routing)...

http://superuser.com/questions/1087888/ip-forwarding-from-one-server-through-default-route

chrislovecnm commented 8 years ago

Some notes from now closed: https://github.com/kubernetes/kubernetes/issues/27239

Per @paralin

OpenVPN. I use Babel for route redistribution, and since Babel can't talk over the GCE networks, I use Google Cloud Networking routes to define the IP CIDRs I have behind the VPN. However, in my "remote datacenter" babel gossip works so I have it running on all the nodes. It results in a very robust setup.

Babel:

https://www.irif.univ-paris-diderot.fr/~jch/software/babel/

It's a really well known routing algorithm.

chrislovecnm commented 8 years ago

@thockin I have been meaning to pester you about this 😀 Might you give us your two cents. I also know you are SUPER busy because of the 1.3 release...

thockin commented 8 years ago

As of a few weeks ago (I forget the release, but it was a 1.2.x where x != 0; see https://github.com/kubernetes/kubernetes/pull/24429) we fixed the routing such that any traffic that arrives at a node destined for a service IP will be handled as if it came to a node port. This means you should be able to set up static routes for your service cluster IP range to one or more nodes, and the nodes will act as bridges. This is the same trick most people do with flannel to bridge the overlay.

It's imperfect but it works. In the future we will need to get more precise with the routing if you want optimal behavior (i.e. not losing the client IP), or we will see more non-kube-proxy implementations of services.

paralin commented 8 years ago

@thockin What if the source is from a node itself? Is it possible to create a static route to localhost?

thockin commented 8 years ago

When a node in the cluster accesses a cluster IP it is already captured.

paralin commented 8 years ago

@thockin The case I'm running into is, that I'm tunneling some remote traffic through a VPN, and I want cluster IPs to route through there properly. It seems that pod IPs route fine, but cluster IPs do not.

I'll do some more experimenting and get back to you if I can't figure it out still. Have a good vacation!

thockin commented 8 years ago

What we set up for GCE customers is a static route for the whole service IP range to each of the nodes (ECMP). Traffic through the VPN is subject to those routing rules, arrives at a random node, is recognized as coming from "outside this cluster", and is proxied to the Service IP.
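
As a rough sketch of the setup described above, the following prints one `gcloud compute routes create` command per node for the service IP range. The network name, zone, node names and service CIDR are placeholders, not values from this thread, and GCE is assumed to spread traffic across routes that share the same destination and priority.

```shell
#!/bin/sh
# Sketch only: NETWORK, ZONE, the node names and the CIDR are placeholders.
NETWORK="${NETWORK:-my-network}"
ZONE="${ZONE:-us-central1-a}"

# Print (do not run) one route per node. All routes share the same
# destination range and priority so that traffic can be spread across nodes.
service_route_cmds() {
  cidr="$1"; shift
  for node in "$@"; do
    printf 'gcloud compute routes create svc-range-via-%s' "$node"
    printf ' --network=%s --destination-range=%s' "$NETWORK" "$cidr"
    printf ' --next-hop-instance=%s --next-hop-instance-zone=%s' "$node" "$ZONE"
    printf ' --priority=1000\n'
  done
}

service_route_cmds 10.0.0.0/16 node-1 node-2 node-3
```

Piping the output to `sh` would create the routes; reviewing the printed commands first seems prudent.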

chrislovecnm commented 8 years ago

@thockin lol - so what cloud independent solution would you recommend? Soup to nuts ;P

paralin commented 8 years ago

@chrislovecnm Routing rules are not specific to GCE, he's just saying, route the traffic to the service IPs to any of the kubernetes nodes, and they will forward them properly. The problem I'm running into is that babel doesn't want to redistribute the route properly it seems.

chrislovecnm commented 8 years ago

all I know is that I am staying outta this and letting you network gurus work it out :P

chrislovecnm commented 8 years ago

Just ELI5 when you all work it out ;)

thockin commented 8 years ago

The problem is that networking is VERY customized to every site, so it's hard to have a soup-to-nuts anything. Whatever your routing infrastructure is, you can set up static routes that way. It might be a conf file that configures ip route in your cluster nodes, or it might be a BGP config, or it might be some other vendor-centric control plane.
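
For the plain `ip route` case, a minimal sketch might look like the following. It builds a single multipath route that spreads the service range across several nodes. The CIDR and node addresses are made-up examples, and the command is printed rather than executed, since applying it needs root on the router or remote host.

```shell
#!/bin/sh
# Build (but do not run) a multipath static route for the service range.
# Arguments: service CIDR, then one or more node IPs to use as next hops.
multipath_route_cmd() {
  cidr="$1"; shift
  cmd="ip route replace $cidr proto static"
  for gw in "$@"; do
    # Equal weights give each node an equal share of the flows.
    cmd="$cmd nexthop via $gw weight 1"
  done
  printf '%s\n' "$cmd"
}

# Hypothetical values: service range 10.0.0.0/16, two node IPs.
multipath_route_cmd 10.0.0.0/16 10.128.0.3 10.128.0.4
```

With a single node IP this collapses to an ordinary one-hop static route, at the cost of that node becoming a single point of failure.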

thockin commented 8 years ago

For GCE specifically:

chrislovecnm commented 8 years ago

Encrypted?

chrislovecnm commented 8 years ago

I am guessing traffic will go Internet?

paralin commented 8 years ago

@chrislovecnm This is simple IP routing, and on GCE, you can route between network locations seamlessly, it's just about telling the nodes in one region how to get to the pods in the other. I can walk you through it.

thockin commented 8 years ago

Not encrypted, no internet. It only affects traffic already inside the perimeter.

paralin commented 8 years ago

The solution (thanks @thockin !!)

Check the route on one of your nodes:

$ ip route get 10.0.0.1
10.0.0.1 via 10.128.0.1 dev eth0  src 10.128.0.3

Add this route on your remote device, where tap0 is your device that can "see" the node, and the IP is the ip address of the node:

ip route add 10.0.0.0/16 dev tap0 via 10.5.0.1 proto static

Works great.

@chrislovecnm Sorry to do this to you but can you open your old issue again?

thockin commented 8 years ago

This was not exactly what I meant. I meant to use GCE's static routing API (when in GCE, obv).

chrislovecnm commented 8 years ago

@paralin let's reopen this, amigo 😄 LOL ... I close one, you ask to reopen ... You close this ... I ask to reopen.

chrislovecnm commented 8 years ago

@thockin does this get us backend traffic over Google's wire rather than the internet? Or should we use the GCE VPN?

ghost commented 8 years ago

Yes, it stays within GCE as long as it is the same virtual network. No public anything, unless I deeply misunderstand.

paralin commented 8 years ago

@chrislovecnm why reopen? @thockin-cc I understand exactly what you're saying and will walk @chrislovecnm through it / document it down the line.

chrislovecnm commented 8 years ago

No worries I guess yah liked the other issue better

ghost commented 8 years ago

Roger. Thanks!

ashmere commented 8 years ago

@thockin are there any public docs on your GCE solution? This is what I'm struggling with: making the cluster service range available over a GCE VPN.

chrislovecnm commented 8 years ago

@thockin should we reopen this?

thockin commented 8 years ago

@roberthbailey @cjcullen Do we have published docs on the services routing config "trick" ?

@ashmere It was broken for a couple of 1.2.x releases and was fixed in https://github.com/kubernetes/kubernetes/pull/24429

ashmere commented 8 years ago

@thockin I'm using v1.3.0. Are we sure it is fixed? I get odd results: some connections work, others don't.

A simple example of connecting to the kubernetes service IP from another GCE project connected via a GCP VPN:

while true; do curl -k --max-time 5 https://<service range>.1/; sleep 2; done;
Unauthorized
Unauthorized
curl: (28) Connection timed out after 5000 milliseconds
Unauthorized
curl: (28) Connection timed out after 5000 milliseconds
Unauthorized
Unauthorized

chrislovecnm commented 8 years ago

Can someone reopen this please :)

thockin commented 8 years ago

@ashmere Can you share the static routes you set up?

ashmere commented 8 years ago

We are using CoreOS (stable) and v1.3.0 on GCE with a custom cluster setup (i.e. not saltstack) with flannel networking (using the GCE backend). I've configured a static route using the GCE API for the service IP range, with a next hop of one of the kube-masters (which is running kube-proxy) and priority 200.

Update: just updated to v1.3.2 same issue

thockin commented 8 years ago

@ashmere that's a complex setup. why are you using flannel on GCE? The node controller already knows how to handle GCE routing. It smells like duplicate routes, to me.

ashmere commented 8 years ago

I don't appear to have duplicate GCE routes, but I'm going to try running without flannel.

I think I must be doing something wrong when I try to strip flannel out:

kubelet.go:2479] skipping pod synchronization - [ConfigureCBR0 requested, but PodCIDR not set. Will not configure CBR0 right now]

All the containers appear to be running using the default Docker networking, 172.17.x.x.

I'm passing the options --reconcile-cidr=true and --configure-cbr0=true to the kubelet, and --allocate-node-cidrs=true and --configure-cloud-routes=true to the controller-manager.

Update: fixed the CBR0 issue by switching to kubenet with the CNI plugin.

This hasn't changed the issue with connecting to the service IPs, though.

thockin commented 8 years ago

So for status: you can now get proper pod IPs and they can all reach each other? And service IPs work from within nodes? And you set up static routes for your service IP range to one or more nodes?

Can you share your gcloud compute routes list ?

ashmere commented 8 years ago

Below is the routing info

NAME                                       NETWORK    DEST_RANGE       NEXT_HOP                                        PRIORITY
default-route-8cb0d446ca80bae8             kube-lans  10.20.0.0/16                                                     1000
default-route-9bd7ab4d411d8997             kube-lans  0.0.0.0/0        default-internet-gateway                        1000
kube-11e54845-5d7f-11e6-891a-42010a140003  kube-lans  10.120.1.0/24    europe-west1-d/instances/kube-core-0            1000
kube-1250f76c-5d7f-11e6-9540-42010a140004  kube-lans  10.120.2.0/24    europe-west1-d/instances/kube-core-2            1000
kube-13809ede-5d7f-11e6-b99a-42010a140002  kube-lans  10.120.3.0/24    europe-west1-d/instances/kube-core-1            1000
kube-16d7e6f7-5d7f-11e6-9540-42010a140004  kube-lans  10.120.4.0/24    europe-west1-d/instances/kube-wk2-1             1000
kube-1722998d-5d7f-11e6-9540-42010a140004  kube-lans  10.120.5.0/24    europe-west1-d/instances/kube-wk2-0             1000
kube-56681c90-5d7f-11e6-9540-42010a140004  kube-lans  10.120.6.0/24    europe-west1-d/instances/kube-wk1-mb7r          1000
kube-lans-service-ip-range-route1          kube-lans  10.220.254.0/24  europe-west1-d/instances/kube-core-0            200
kube-lans-to-nonkube-1-route1              kube-lans  10.10.8.0/22     europe-west1/vpnTunnels/kube-to-nonkube-tunnel  100

thockin commented 8 years ago

I assume kube-lans-service-ip-range-route1 is the one you set up manually? 10.220.254.0/24 is your service range? did you create the VMs with "can-ip-forward" ? Did you check firewalls for the Service range?
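
One way to check the IP-forwarding question might be the sketch below, which asks GCE whether each instance has forwarding enabled. The zone and instance names in the usage comment are taken from the route listing above; `canIpForward` is the field name in the GCE instance resource, and the helper function itself is hypothetical.

```shell
#!/bin/sh
# Report the canIpForward setting for each named instance in a zone.
# Requires an authenticated gcloud; nothing here modifies the project.
check_ip_forward() {
  zone="$1"; shift
  for inst in "$@"; do
    val="$(gcloud compute instances describe "$inst" --zone "$zone" \
      --format='value(canIpForward)')"
    printf '%s canIpForward=%s\n' "$inst" "${val:-unknown}"
  done
}

# Example:
#   check_ip_forward europe-west1-d kube-core-0 kube-core-1 kube-core-2
# The firewall rules can then be reviewed against the Service range with:
#   gcloud compute firewall-rules list
```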

ashmere commented 8 years ago

Yes, all VMs were created with can-ip-forward, and yes, that is the manual service range route.

Update: The firewalling is also configured to allow this traffic, and I'm pretty sure there is no problem with that, as I do get some connections, just not all, as shown in the while loop above.

thockin commented 8 years ago

firewalls?

ashmere commented 8 years ago

The only firewalls are GCE's, which are configured to allow the traffic, and as you can see in the example, some of the requests work. (If there were a firewall issue, none of them would work.)

kdima commented 8 years ago

Quick update: we still get a lot of packet loss when accessing the service IP. We are only using GCE networking. This issue still persists after we moved to 1.4.0-alpha.3. Here is the output of a curl loop when running 1.4.0-alpha.3:

while true; do curl -k --max-time 5 https://<service_ip>/; sleep 2; done;                                                    
Unauthorized
curl: (28) Connection timed out after 5001 milliseconds
Unauthorized
Unauthorized
curl: (28) Connection timed out after 5001 milliseconds
Unauthorized

As you can see, there is no issue with firewalling, as some requests succeed.

We ran tcpdump on the box that is the gce route endpoint which is one of kubernetes api-servers. We can see all the traffic coming in:

source_ip.38926 > 10.220.254.1.https: Flags [S], seq 1700406661, win 28400, options [mss 1420,sackOK,TS val 2083749020 ecr 0,nop,wscale 7], length 0
source_ip.38926 > 10.20.0.4.sun-sr-https: Flags [S], seq 1700406661, win 28400, options [mss 1420,sackOK,TS val 2083749020 ecr 0,nop,wscale 7], length 0
source_ip.38926 > 10.220.254.1.https: Flags [S], seq 1700406661, win 28400, options [mss 1420,sackOK,TS val 2083750022 ecr 0,nop,wscale 7], length 0
source_ip.38926 > 10.20.0.4.sun-sr-https: Flags [S], seq 1700406661, win 28400, options [mss 1420,sackOK,TS val 2083750022 ecr 0,nop,wscale 7], length 0
source_ip.38926 > 10.220.254.1.https: Flags [S], seq 1700406661, win 28400, options [mss 1420,sackOK,TS val 2083752024 ecr 0,nop,wscale 7], length 0
source_ip.38926 > 10.20.0.4.sun-sr-https: Flags [S], seq 1700406661, win 28400, options [mss 1420,sackOK,TS val 2083752024 ecr 0,nop,wscale 7], length 0

10.20.0.4 is the address of another box that is hosting the kubernetes-API service.

We assume that the traffic gets correctly routed to the "active" kubernetes api-server via kube-proxy but the reply is getting lost.

thockin commented 8 years ago

What filter is being applied on that tcpdump? There's obviously no replies in there, which would help a lot. Also, please run with -i any -n to get most useful output.

Even better, run tcpdump at both the originating node and the destination node.

kdima commented 8 years ago

Sorry about the delay. Here is the tcpdump of the timeout:

I am running the following command on my local box:

$ curl -k --max-time 5 https://10.220.254.1/           
curl: (28) Connection timed out after 5000 milliseconds

tcpdump on my local box:

sudo tcpdump -i any -n host 10.220.254.1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
IP 10.11.0.5.37273 > 10.220.254.1.443: Flags [S], seq 4026671374, win 23200, options [mss 1160,sackOK,TS val 174251689 ecr 0,nop,wscale 7], length 0
IP 10.11.0.5.37273 > 10.220.254.1.443: Flags [S], seq 4026671374, win 23200, options [mss 1160,sackOK,TS val 174251939 ecr 0,nop,wscale 7], length 0
IP 10.11.0.5.37273 > 10.220.254.1.443: Flags [S], seq 4026671374, win 23200, options [mss 1160,sackOK,TS val 174252440 ecr 0,nop,wscale 7], length 0

tcpdump on the remote box inside the k8s cluster that acts as the gateway:

tcpdump -i any -n port 443
IP 172.29.12.5.37273 > 10.220.254.1.https: Flags [S], seq 4026671374, win 23200, options [mss 1160,sackOK,TS val 174251689 ecr 0,nop,wscale 7], length 0
IP 172.29.12.5.37273 > 10.220.254.1.https: Flags [S], seq 4026671374, win 23200, options [mss 1160,sackOK,TS val 174251939 ecr 0,nop,wscale 7], length 0
IP 172.29.12.5.37273 > 10.220.254.1.https: Flags [S], seq 4026671374, win 23200, options [mss 1160,sackOK,TS val 174252440 ecr 0,nop,wscale 7], length 0

As you can see, the sequence numbers captured on my local box match those captured on the remote box. So the traffic is getting to the gateway box.
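
Matching sequence numbers by eye gets tedious on longer captures. Below is a small sketch that extracts the sequence numbers of initial SYN packets from saved tcpdump text so the two ends can be compared. The file names in the usage comment are hypothetical, and the function assumes the `tcpdump -n` line format shown above.

```shell
#!/bin/sh
# Read tcpdump text on stdin and print the distinct sequence numbers of
# pure SYN packets (Flags [S]); SYN-ACKs (Flags [S.]) are ignored.
syn_seqs() {
  awk '/Flags \[S\],/ {
         for (i = 1; i < NF; i++)
           if ($i == "seq") { s = $(i + 1); sub(",", "", s); print s }
       }' | sort -u
}

# Example, with each capture saved to a text file:
#   syn_seqs < local.txt  > local.seqs
#   syn_seqs < remote.txt > remote.seqs
#   comm -12 local.seqs remote.seqs   # SYNs seen at both ends
```

A SYN that appears in both files but is never followed by a `[S.]` reply in either capture points at the return path rather than the forward path.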

Here is a tcpdump of the case where traffic actually goes through.

Command I ran on my local box:

curl -k --max-time 5 https://10.220.254.1/ 
Unauthorized

Local box tcpdump

sudo tcpdump -i any -n host 10.220.254.1
IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [S], seq 3864019960, win 23200, options [mss 1160,sackOK,TS val 174372844 ecr 0,nop,wscale 7], length 0
IP 10.220.254.1.443 > 10.11.0.5.37567: Flags [S.], seq 3620472236, ack 3864019961, win 28160, options [mss 1160,sackOK,TS val 89257878 ecr 174372844,nop,wscale 7], length 0
IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [.], ack 1, win 182, options [nop,nop,TS val 174372847 ecr 89257878], length 0
IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [P.], seq 1:274, ack 1, win 182, options [nop,nop,TS val 174372847 ecr 89257878], length 273
IP 10.220.254.1.443 > 10.11.0.5.37567: Flags [.], ack 274, win 229, options [nop,nop,TS val 89257888 ecr 174372847], length 0
IP 10.220.254.1.443 > 10.11.0.5.37567: Flags [P.], seq 1:55, ack 274, win 229, options [nop,nop,TS val 89257889 ecr 174372847], length 54
IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [.], ack 55, win 182, options [nop,nop,TS val 174372849 ecr 89257889], length 0
IP 10.220.254.1.443 > 10.11.0.5.37567: Flags [.], seq 55:1203, ack 274, win 229, options [nop,nop,TS val 89257889 ecr 174372847], length 1148
IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [.], ack 1203, win 200, options [nop,nop,TS val 174372849 ecr 89257889], length 0
IP 10.220.254.1.443 > 10.11.0.5.37567: Flags [.], seq 1203:2351, ack 274, win 229, options [nop,nop,TS val 89257889 ecr 174372847], length 1148
IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [.], ack 2351, win 218, options [nop,nop,TS val 174372849 ecr 89257889], length 0
IP 10.220.254.1.443 > 10.11.0.5.37567: Flags [.], seq 2351:3499, ack 274, win 229, options [nop,nop,TS val 89257889 ecr 174372847], length 1148
IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [.], ack 3499, win 236, options [nop,nop,TS val 174372849 ecr 89257889], length 0
IP 10.220.254.1.443 > 10.11.0.5.37567: Flags [P.], seq 3499:4312, ack 274, win 229, options [nop,nop,TS val 89257889 ecr 174372847], length 813
IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [.], ack 4312, win 253, options [nop,nop,TS val 174372849 ecr 89257889], length 0
IP 10.220.254.1.443 > 10.11.0.5.37567: Flags [P.], seq 4312:4650, ack 274, win 229, options [nop,nop,TS val 89257895 ecr 174372847], length 338
IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [.], ack 4650, win 271, options [nop,nop,TS val 174372851 ecr 89257895], length 0
IP 10.220.254.1.443 > 10.11.0.5.37567: Flags [P.], seq 4650:5008, ack 274, win 229, options [nop,nop,TS val 89257896 ecr 174372847], length 358
IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [.], ack 5008, win 289, options [nop,nop,TS val 174372851 ecr 89257896], length 0
IP 10.220.254.1.443 > 10.11.0.5.37567: Flags [P.], seq 5008:5017, ack 274, win 229, options [nop,nop,TS val 89257896 ecr 174372847], length 9
IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [.], ack 5017, win 289, options [nop,nop,TS val 174372851 ecr 89257896], length 0
IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [P.], seq 274:412, ack 5017, win 289, options [nop,nop,TS val 174372851 ecr 89257896], length 138
IP 10.220.254.1.443 > 10.11.0.5.37567: Flags [P.], seq 5017:5023, ack 412, win 237, options [nop,nop,TS val 89257906 ecr 174372851], length 6
IP 10.220.254.1.443 > 10.11.0.5.37567: Flags [P.], seq 5023:5068, ack 412, win 237, options [nop,nop,TS val 89257906 ecr 174372851], length 45
IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [.], ack 5068, win 289, options [nop,nop,TS val 174372854 ecr 89257906], length 0
IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [P.], seq 412:517, ack 5068, win 289, options [nop,nop,TS val 174372854 ecr 89257906], length 105
IP 10.220.254.1.443 > 10.11.0.5.37567: Flags [P.], seq 5068:5270, ack 517, win 237, options [nop,nop,TS val 89257916 ecr 174372854], length 202
IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [P.], seq 517:548, ack 5270, win 307, options [nop,nop,TS val 174372856 ecr 89257916], length 31
IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [F.], seq 548, ack 5270, win 307, options [nop,nop,TS val 174372856 ecr 89257916], length 0
IP 10.220.254.1.443 > 10.11.0.5.37567: Flags [P.], seq 5270:5301, ack 549, win 237, options [nop,nop,TS val 89257926 ecr 174372856], length 31
IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [R], seq 3864020509, win 0, length 0
IP 10.220.254.1.443 > 10.11.0.5.37567: Flags [F.], seq 5301, ack 549, win 237, options [nop,nop,TS val 89257926 ecr 174372856], length 0
IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [R], seq 3864020509, win 0, length 0

Remote tcp dump

tcpdump -i any -n port 443
IP 172.29.12.5.37567 > 10.220.254.1.https: Flags [S], seq 3864019960, win 23200, options [mss 1160,sackOK,TS val 174372844 ecr 0,nop,wscale 7], length 0
IP 10.220.254.1.https > 172.29.12.5.37567: Flags [S.], seq 3620472236, ack 3864019961, win 28160, options [mss 1420,sackOK,TS val 89257878 ecr 174372844,nop,wscale 7], length 0
IP 172.29.12.5.37567 > 10.220.254.1.https: Flags [.], ack 1, win 182, options [nop,nop,TS val 174372847 ecr 89257878], length 0
IP 172.29.12.5.37567 > 10.220.254.1.https: Flags [P.], seq 1:274, ack 1, win 182, options [nop,nop,TS val 174372847 ecr 89257878], length 273
IP 10.220.254.1.https > 172.29.12.5.37567: Flags [.], ack 274, win 229, options [nop,nop,TS val 89257888 ecr 174372847], length 0
IP 10.220.254.1.https > 172.29.12.5.37567: Flags [P.], seq 1:55, ack 274, win 229, options [nop,nop,TS val 89257889 ecr 174372847], length 54
IP 10.220.254.1.https > 172.29.12.5.37567: Flags [.], seq 55:2351, ack 274, win 229, options [nop,nop,TS val 89257889 ecr 174372847], length 2296
IP 10.220.254.1.https > 172.29.12.5.37567: Flags [P.], seq 2351:4312, ack 274, win 229, options [nop,nop,TS val 89257889 ecr 174372847], length 1961
IP 10.220.254.1.https > 172.29.12.5.37567: Flags [P.], seq 4312:4650, ack 274, win 229, options [nop,nop,TS val 89257895 ecr 174372847], length 338
IP 10.220.254.1.https > 172.29.12.5.37567: Flags [P.], seq 4650:5008, ack 274, win 229, options [nop,nop,TS val 89257896 ecr 174372847], length 358
IP 10.220.254.1.https > 172.29.12.5.37567: Flags [P.], seq 5008:5017, ack 274, win 229, options [nop,nop,TS val 89257896 ecr 174372847], length 9
IP 172.29.12.5.37567 > 10.220.254.1.https: Flags [.], ack 55, win 182, options [nop,nop,TS val 174372849 ecr 89257889], length 0
IP 172.29.12.5.37567 > 10.220.254.1.https: Flags [.], ack 1203, win 200, options [nop,nop,TS val 174372849 ecr 89257889], length 0
IP 172.29.12.5.37567 > 10.220.254.1.https: Flags [.], ack 2351, win 218, options [nop,nop,TS val 174372849 ecr 89257889], length 0
IP 172.29.12.5.37567 > 10.220.254.1.https: Flags [.], ack 3499, win 236, options [nop,nop,TS val 174372849 ecr 89257889], length 0
IP 172.29.12.5.37567 > 10.220.254.1.https: Flags [.], ack 4312, win 253, options [nop,nop,TS val 174372849 ecr 89257889], length 0
IP 172.29.12.5.37567 > 10.220.254.1.https: Flags [.], ack 4650, win 271, options [nop,nop,TS val 174372851 ecr 89257895], length 0
IP 172.29.12.5.37567 > 10.220.254.1.https: Flags [.], ack 5008, win 289, options [nop,nop,TS val 174372851 ecr 89257896], length 0
IP 172.29.12.5.37567 > 10.220.254.1.https: Flags [.], ack 5017, win 289, options [nop,nop,TS val 174372851 ecr 89257896], length 0
IP 172.29.12.5.37567 > 10.220.254.1.https: Flags [P.], seq 274:412, ack 5017, win 289, options [nop,nop,TS val 174372851 ecr 89257896], length 138
IP 10.220.254.1.https > 172.29.12.5.37567: Flags [P.], seq 5017:5023, ack 412, win 237, options [nop,nop,TS val 89257906 ecr 174372851], length 6
IP 10.220.254.1.https > 172.29.12.5.37567: Flags [P.], seq 5023:5068, ack 412, win 237, options [nop,nop,TS val 89257906 ecr 174372851], length 45
IP 172.29.12.5.37567 > 10.220.254.1.https: Flags [P.], seq 412:517, ack 5068, win 289, options [nop,nop,TS val 174372854 ecr 89257906], length 105
IP 10.220.254.1.https > 172.29.12.5.37567: Flags [P.], seq 5068:5270, ack 517, win 237, options [nop,nop,TS val 89257916 ecr 174372854], length 202
IP 172.29.12.5.37567 > 10.220.254.1.https: Flags [P.], seq 517:548, ack 5270, win 307, options [nop,nop,TS val 174372856 ecr 89257916], length 31
IP 172.29.12.5.37567 > 10.220.254.1.https: Flags [F.], seq 548, ack 5270, win 307, options [nop,nop,TS val 174372856 ecr 89257916], length 0
IP 10.220.254.1.https > 172.29.12.5.37567: Flags [P.], seq 5270:5301, ack 549, win 237, options [nop,nop,TS val 89257926 ecr 174372856], length 31
IP 10.220.254.1.https > 172.29.12.5.37567: Flags [F.], seq 5301, ack 549, win 237, options [nop,nop,TS val 89257926 ecr 174372856], length 0
IP 172.29.12.5.37567 > 10.220.254.1.https: Flags [R], seq 3864020509, win 0, length 0
IP 172.29.12.5.37567 > 10.220.254.1.https: Flags [R], seq 3864020509, win 0, length 0
IP 1.2.3.4.https > 10.20.0.3.34318: Flags [P.], seq 479531611:479531674, ack 276920723, win 237, options [nop,nop,TS val 1085792045 ecr 89019131], length 63
IP 1.2.3.4.https > 10.20.0.3.34318: Flags [F.], seq 63, ack 1, win 237, options [nop,nop,TS val 1085792045 ecr 89019131], length 0
IP 10.20.0.3.34318 > 1.2.3.4.https: Flags [P.], seq 1:32, ack 64, win 1228, options [nop,nop,TS val 89259139 ecr 1085792045], length 31
IP 10.20.0.3.34318 > 1.2.3.4.https: Flags [F.], seq 32, ack 64, win 1228, options [nop,nop,TS val 89259139 ecr 1085792045], length 0
IP 1.2.3.4.https > 10.20.0.3.34318: Flags [R], seq 479531675, win 0, length 0
IP 1.2.3.4.https > 10.20.0.3.34318: Flags [R], seq 479531675, win 0, length 0

thockin commented 8 years ago

Can you clarify who the actors are? I see

IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [S], seq 3864019960, win 23200, options [mss 1160,sackOK,TS val 174372844 ecr 0,nop,wscale 7], length 0

I assume 10.11.0.5 is the pod, and 10.220.254.1 is the Service? If so, why does the receiving end see:

IP 172.29.12.5.37567 > 10.220.254.1.https: Flags [S], seq 3864019960, win 23200, options [mss 1160,sackOK,TS val 174372844 ecr 0,nop,wscale 7], length 0

First, who is 172.29.12.5 and why does he appear to be a NAT of 10.11.0.5 (same port)?

Second, why is 10.220.254.1 seen on the wire - if it is a Service IP, it should be NAT'ed client side (unless you are doing something very different than default).

Third, who is 1.2.3.4 in this dump?
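
The "same port" observation above can be checked mechanically. What follows is only a rough sketch, not anything used in this thread: it parses tcpdump lines and reports SYNs that appear in two captures with the same source port and sequence number but a different source IP, which is what a source NAT between the capture points would look like. The names `parse` and `find_source_rewrites` are made up for illustration.

```python
import re

# Matches the address, flags and seq fields of a tcpdump line,
# with or without a leading timestamp. Ports may be numeric or named.
LINE_RE = re.compile(
    r"IP (?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>[\w-]+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>[\w-]+): "
    r"Flags \[(?P<flags>[^\]]+)\], seq (?P<seq>\d+)"
)


def parse(line):
    """Return a dict of fields for one tcpdump line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["seq"] = int(fields["seq"])
    return fields


def find_source_rewrites(capture_a, capture_b):
    """List (src_a, src_b, sport, seq) for SYNs present in both captures
    whose source IP differs between them."""
    syns_b = {}
    for line in capture_b:
        p = parse(line)
        if p and p["flags"] == "S":
            syns_b[(p["sport"], p["seq"])] = p["src"]

    rewrites = set()
    for line in capture_a:
        p = parse(line)
        if p and p["flags"] == "S":
            key = (p["sport"], p["seq"])
            if key in syns_b and syns_b[key] != p["src"]:
                rewrites.add((p["src"], syns_b[key], p["sport"], p["seq"]))
    return sorted(rewrites)


# The two SYN lines quoted above.
local = [
    "IP 10.11.0.5.37567 > 10.220.254.1.443: Flags [S], seq 3864019960, win 23200, "
    "options [mss 1160,sackOK,TS val 174372844 ecr 0,nop,wscale 7], length 0",
]
remote = [
    "IP 172.29.12.5.37567 > 10.220.254.1.https: Flags [S], seq 3864019960, win 23200, "
    "options [mss 1160,sackOK,TS val 174372844 ecr 0,nop,wscale 7], length 0",
]

print(find_source_rewrites(local, remote))
```

Retransmitted SYNs share port and sequence number, so the result is deduplicated with a set before being returned.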

kdima commented 8 years ago

I have realised that this tcpdump is a bit overcomplicated, as I was running it from my local machine and thus going through multiple tunnels. Here is a different version where only one tunnel is involved.

K8S API server service IP: 10.220.254.1
VPN gateway: 10.20.0.3 (this is where I run the server-side tcpdump)
K8S API server endpoint: 10.20.0.2:6443

K8S api server:

$ kubectl describe svc kubernetes

Name:                   kubernetes
Namespace:              default
Labels:                 component=apiserver
                        provider=kubernetes
Selector:               <none>
Type:                   ClusterIP
IP:                     10.220.254.1
Port:                   https   443/TCP
Endpoints:              10.20.0.2:6443
Session Affinity:       ClientIP
No events.

I run curl from 172.29.12.6, which has a VPN tunnel to the k8s cluster using 10.20.0.3 as the gateway.

curl -k --max-time 5 https://10.220.254.1
curl: (28) Connection timed out after 5000 milliseconds

I run tcpdump on 172.29.12.6:

[root@172.29.12.6]# tcpdump -i any -n port 443
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
15:01:58.543754 IP 172.29.12.6.46476 > 10.220.254.1.https: Flags [S], seq 2230218670, win 28400, options [mss 1420,sackOK,TS val 1226976533 ecr 0,nop,wscale 7], length 0
15:01:59.546643 IP 172.29.12.6.46476 > 10.220.254.1.https: Flags [S], seq 2230218670, win 28400, options [mss 1420,sackOK,TS val 1226977536 ecr 0,nop,wscale 7], length 0
15:02:01.550667 IP 172.29.12.6.46476 > 10.220.254.1.https: Flags [S], seq 2230218670, win 28400, options [mss 1420,sackOK,TS val 1226979540 ecr 0,nop,wscale 7], length 0

Tcpdump on the gateway box 10.20.0.3

[root@10.20.0.3]#  tcpdump -i any -n host 172.29.12.6
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
15:01:58.545530 IP 172.29.12.6.46476 > 10.220.254.1.https: Flags [S], seq 2230218670, win 28400, options [mss 1420,sackOK,TS val 1226976533 ecr 0,nop,wscale 7], length 0
15:01:58.545583 IP 172.29.12.6.46476 > 10.20.0.2.sun-sr-https: Flags [S], seq 2230218670, win 28400, options [mss 1420,sackOK,TS val 1226976533 ecr 0,nop,wscale 7], length 0
15:01:59.547851 IP 172.29.12.6.46476 > 10.220.254.1.https: Flags [S], seq 2230218670, win 28400, options [mss 1420,sackOK,TS val 1226977536 ecr 0,nop,wscale 7], length 0
15:01:59.547881 IP 172.29.12.6.46476 > 10.20.0.2.sun-sr-https: Flags [S], seq 2230218670, win 28400, options [mss 1420,sackOK,TS val 1226977536 ecr 0,nop,wscale 7], length 0
15:02:01.551497 IP 172.29.12.6.46476 > 10.220.254.1.https: Flags [S], seq 2230218670, win 28400, options [mss 1420,sackOK,TS val 1226979540 ecr 0,nop,wscale 7], length 0
15:02:01.551528 IP 172.29.12.6.46476 > 10.20.0.2.sun-sr-https: Flags [S], seq 2230218670, win 28400, options [mss 1420,sackOK,TS val 1226979540 ecr 0,nop,wscale 7], length 0

Here a new IP appears: 10.20.0.2. This is the IP address of the pod where the Kubernetes API server currently runs.

[root@10.20.0.2~]# tcpdump -i any -n host 172.29.12.6
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
15:01:58.545963 IP 172.29.12.6.46476 > 10.20.0.2.sun-sr-https: Flags [S], seq 2230218670, win 28400, options [mss 1420,sackOK,TS val 1226976533 ecr 0,nop,wscale 7], length 0
15:01:58.546007 IP 10.20.0.2.sun-sr-https > 172.29.12.6.46476: Flags [S.], seq 2058743368, ack 2230218671, win 28160, options [mss 1420,sackOK,TS val 190681009 ecr 1226976533,nop,wscale 7], length 0
15:01:58.548666 IP 172.29.12.6.46476 > 10.20.0.2.sun-sr-https: Flags [R], seq 2230218671, win 0, length 0
15:01:59.547921 IP 172.29.12.6.46476 > 10.20.0.2.sun-sr-https: Flags [S], seq 2230218670, win 28400, options [mss 1420,sackOK,TS val 1226977536 ecr 0,nop,wscale 7], length 0
15:01:59.547984 IP 10.20.0.2.sun-sr-https > 172.29.12.6.46476: Flags [S.], seq 2074399197, ack 2230218671, win 28160, options [mss 1420,sackOK,TS val 190682011 ecr 1226977536,nop,wscale 7], length 0
15:01:59.550169 IP 172.29.12.6.46476 > 10.20.0.2.sun-sr-https: Flags [R], seq 2230218671, win 0, length 0
15:02:01.551531 IP 172.29.12.6.46476 > 10.20.0.2.sun-sr-https: Flags [S], seq 2230218670, win 28400, options [mss 1420,sackOK,TS val 1226979540 ecr 0,nop,wscale 7], length 0
15:02:01.551617 IP 10.20.0.2.sun-sr-https > 172.29.12.6.46476: Flags [S.], seq 2105705832, ack 2230218671, win 28160, options [mss 1420,sackOK,TS val 190684015 ecr 1226979540,nop,wscale 7], length 0
15:02:01.553719 IP 172.29.12.6.46476 > 10.20.0.2.sun-sr-https: Flags [R], seq 2230218671, win 0, length 0

It looks like traffic arrives from my IP, 172.29.12.6, at the gateway on 10.20.0.3. From there it is redirected to the actual k8s API server endpoint on 10.20.0.2:6443. When the SYN reaches 10.20.0.2, it replies with a SYN-ACK whose source IP is 10.20.0.2 rather than the service IP 10.220.254.1 that the client connected to, at which point the client resets the connection.
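
The sequence described above can be modelled in a few lines. This is only a toy sketch of the hypothesis, not a statement about what the kernel or kube-proxy does: it assumes the SYN-ACK travels back to the client without the gateway restoring the service address. The `Packet` type and the helper functions are hypothetical names; the addresses and ports are taken from the captures.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    src: tuple   # (ip, port)
    dst: tuple   # (ip, port)
    flags: str   # tcpdump-style flags, e.g. "S", "S.", "R"


SERVICE = ("10.220.254.1", 443)    # ClusterIP of the kubernetes service
ENDPOINT = ("10.20.0.2", 6443)     # API server endpoint behind the service
CLIENT = ("172.29.12.6", 46476)    # curl on the far side of the VPN


def gateway_dnat(pkt):
    """Rewrite the service destination to the endpoint, as seen on 10.20.0.3."""
    if pkt.dst == SERVICE:
        return pkt._replace(dst=ENDPOINT)
    return pkt


def server_reply(pkt):
    """The endpoint answers a SYN with a SYN-ACK from its own address."""
    return Packet(src=pkt.dst, dst=pkt.src, flags="S.")


def client_receive(pkt, expected_peer):
    """Accept a reply only if it comes from the peer the client connected to.

    Anything else belongs to no known connection, so the client answers RST.
    """
    if pkt.src == expected_peer:
        return None
    return Packet(src=pkt.dst, dst=pkt.src, flags="R")


def simulate(reply_via_gateway):
    syn = Packet(CLIENT, SERVICE, "S")
    reply = server_reply(gateway_dnat(syn))
    if reply_via_gateway:
        # Assumption: the gateway sees the reply and reverses its own DNAT.
        reply = reply._replace(src=SERVICE)
    return client_receive(reply, expected_peer=SERVICE)


print(simulate(reply_via_gateway=False))  # RST towards 10.20.0.2:6443
print(simulate(reply_via_gateway=True))   # None: the handshake can continue
```

In the first case the model produces an RST from 172.29.12.6:46476 to 10.20.0.2:6443, the same packet that shows up in the capture on 10.20.0.2. Whether the reply really bypasses the gateway's connection tracking is not established by the captures alone.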