weaveworks / weave

Simple, resilient multi-host containers networking and more.
https://www.weave.works
Apache License 2.0

weave-npc blocking connections in Kubernetes with no NetworkPolicies (2.6.0) #3761

Open chrisghill opened 4 years ago

chrisghill commented 4 years ago

What you expected to happen?

weave-npc to allow all network connections

What happened?

We upgraded weave in our Kubernetes cluster from version 2.4.1 to 2.6.0 ~2 weeks ago. We've been experiencing intermittent network issues since then. In the pod logs I've noticed multiple applications unable to communicate with other pods in the cluster. When I checked weave-npc logs, I noticed a ton of blocked connections:

WARN: 2020/01/27 18:45:42.924357 TCP connection from 172.20.34.171:38490 to 100.104.0.5:18000 blocked by Weave NPC.
WARN: 2020/01/27 18:45:42.924378 TCP connection from 172.20.47.38:47620 to 100.104.0.15:80 blocked by Weave NPC.
WARN: 2020/01/27 18:45:45.100358 TCP connection from 172.20.0.14:33848 to 100.117.0.5:8080 blocked by Weave NPC.
WARN: 2020/01/27 18:45:45.100381 TCP connection from 172.20.34.171:38510 to 100.104.0.5:18000 blocked by Weave NPC.
WARN: 2020/01/27 18:45:45.100390 TCP connection from 172.20.0.14:33848 to 100.117.0.5:8080 blocked by Weave NPC.
WARN: 2020/01/27 18:45:46.796350 TCP connection from 172.20.34.171:38510 to 100.104.0.5:18000 blocked by Weave NPC.
WARN: 2020/01/27 18:45:47.884357 TCP connection from 172.20.47.38:47620 to 100.104.0.15:80 blocked by Weave NPC.
WARN: 2020/01/27 18:45:47.884380 TCP connection from 172.20.0.14:33848 to 100.117.0.5:8080 blocked by Weave NPC.
WARN: 2020/01/27 18:45:47.884387 TCP connection from 172.20.34.171:38510 to 100.104.0.5:18000 blocked by Weave NPC.
WARN: 2020/01/27 18:45:49.260360 UDP connection from 100.97.0.8:38592 to 100.118.0.12:53 blocked by Weave NPC.
WARN: 2020/01/27 18:45:49.260381 UDP connection from 100.97.0.8:38592 to 100.118.0.12:53 blocked by Weave NPC.
WARN: 2020/01/27 18:45:49.260389 UDP connection from 100.97.0.7:42516 to 100.118.0.12:53 blocked by Weave NPC.
WARN: 2020/01/27 18:45:49.260395 UDP connection from 100.97.0.7:42516 to 100.118.0.12:53 blocked by Weave NPC.
WARN: 2020/01/27 18:45:49.260401 UDP connection from 100.97.0.4:52150 to 100.118.0.12:53 blocked by Weave NPC.

Note that we don't have any network policies in our cluster:

$ kubectl get networkpolicies --all-namespaces
No resources found.

How to reproduce it?

I'm unsure how to actually "trigger" the blocked connections. As soon as I saw the timeouts I downgraded weave to 2.4.1 in our cluster to get it back up and running. We don't observe this behavior in 2.4.1.

It should be noted that we went to 2.4.1 instead of 2.5.2 since we suffered from the default-deny of traffic until network policies are validated which was introduced in 2.5 (and discussed in this issue: https://github.com/weaveworks/weave/issues/3464).

Anything else we need to know?

Kubernetes v1.15.6 managed w/ Kops running in AWS

Versions:

$ weave version: 2.6.0
$ docker version
Client:
 Version:           18.06.3-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        d7080c1
 Built:             Wed Feb 20 02:28:26 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.3-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       d7080c1
  Built:            Wed Feb 20 02:26:51 2019
  OS/Arch:          linux/amd64
  Experimental:     false

$ uname -a
Linux ip-172-20-118-147 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u1 (2019-09-20) x86_64 GNU/Linux

$ kubectl version:
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.6", GitCommit:"7015f71e75f670eb9e7ebd4b5749639d42e20079", GitTreeState:"clean", BuildDate:"2019-11-13T11:20:18Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.6", GitCommit:"7015f71e75f670eb9e7ebd4b5749639d42e20079", GitTreeState:"clean", BuildDate:"2019-11-13T11:11:50Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}

Note that I don't have logs or iptables output, because I've downgraded everything and no nodes are currently running v2.6.0. If needed, I can look into updating back to 2.6.0.

It should be noted that other than the numerous blocked connections I didn't see anything remarkable in the weave or weave-npc container logs. I did not inspect the IP tables.

murali-reddy commented 4 years ago

@chrisghill thanks for reporting the issue.

We've been experiencing intermittent network issues since then.

Could you please elaborate on the symptoms you have noticed? Are you using the default CIDR (10.32.0.0/12) for Weave or a different one? I ask because, judging from the weave-npc logs, the blocked connections come from hosts in different CIDRs: 172.20.34.171, 172.20.47.38, 100.97.0.8, etc.

Also, in 2.6 the rule that drops packets is not added until at least one network policy is applied, so it's unlikely that packets are being dropped when no network policies are in place.

I believe the weave-npc logs could be misleading; weave-npc simply reads from ulogd and logs the blocked connections.

If possible, it would be helpful if you could share iptables and ipset dumps after updating to 2.6.
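
To answer the CIDR question from the logs alone, each logged address can be checked against the Weave allocation range. The following is a minimal Python sketch and is not part of weave: the function name `classify_blocked` and the regular expression are assumptions, written against the "blocked by Weave NPC" lines quoted in this thread.

```python
import ipaddress
import re

# Assumed log shape, taken from the lines quoted in this thread:
# WARN: <date> <time> TCP connection from <ip>:<port> to <ip>:<port> blocked by Weave NPC.
BLOCKED_RE = re.compile(
    r"(?P<proto>TCP|UDP) connection from (?P<src>[\d.]+):(?P<sport>\d+) "
    r"to (?P<dst>[\d.]+):(?P<dport>\d+) blocked by Weave NPC"
)


def classify_blocked(lines, pod_cidr="10.32.0.0/12"):
    """Return (src, src_in_cidr, dst, dst_in_cidr) for each blocked-connection line.

    pod_cidr defaults to the Weave default range mentioned above; pass the
    cluster's own allocation range if it differs.
    """
    net = ipaddress.ip_network(pod_cidr)
    results = []
    for line in lines:
        m = BLOCKED_RE.search(line)
        if m is None:
            continue  # not a "blocked" line
        src = ipaddress.ip_address(m["src"])
        dst = ipaddress.ip_address(m["dst"])
        results.append((m["src"], src in net, m["dst"], dst in net))
    return results
```

An address that falls outside the range passed as `pod_cidr` is probably a node or other non-pod address, which is the distinction the question above is after.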

kostyrev commented 4 years ago

I've got the same problem.
Here are the logs

ubuntu@ip-10-21-157-154:~$ sudo docker ps | grep weave
986975948a24        174e0e8ef23d                                                             "/home/weave/launch.…"   12 days ago         Up 12 days                              k8s_weave_weave-net-qkcgt_kube-system_4721554a-4289-11ea-99e3-0a82cb5b9985_1
2bc401cb2118        weaveworks/weave-npc                                                     "/usr/bin/launch.sh"     12 days ago         Up 12 days                              k8s_weave-npc_weave-net-qkcgt_kube-system_4721554a-4289-11ea-99e3-0a82cb5b9985_0
bffeab27e086        k8s.gcr.io/pause-amd64:3.0                                               "/pause"                 12 days ago         Up 12 days                              k8s_POD_weave-net-qkcgt_kube-system_4721554a-4289-11ea-99e3-0a82cb5b9985_0
ubuntu@ip-10-21-157-154:~$ sudo docker exec -ti 986975948a24 /home/weave/weave --local status

        Version: 2.6.0 (up to date; next check at 2020/02/10 22:06:21)

        Service: router
       Protocol: weave 1..2
           Name: 0e:8b:bc:6c:af:28(ip-10-21-157-154.ec2.internal)
     Encryption: disabled
  PeerDiscovery: enabled
        Targets: 26
    Connections: 27 (27 established)
          Peers: 28 (with 756 established connections)
 TrustedSubnets: none

        Service: ipam
         Status: ready
          Range: 10.32.0.0/12
  DefaultSubnet: 10.32.0.0/12
root@ip-10-21-157-154:~# docker logs --tail 1000 2bc401cb2118 2>/tmp/logs
root@ip-10-21-157-154:~# grep blocked /tmp/logs | tail -10
WARN: 2020/02/10 20:35:11.722769 UDP connection from 10.40.128.4:37587 to 10.34.208.1:53 blocked by Weave NPC.
WARN: 2020/02/10 20:35:11.722812 UDP connection from 10.40.128.4:37587 to 10.34.208.1:53 blocked by Weave NPC.
WARN: 2020/02/10 20:35:11.722828 UDP connection from 10.40.128.4:57715 to 10.34.208.1:53 blocked by Weave NPC.
WARN: 2020/02/10 20:35:11.722840 UDP connection from 10.40.128.4:57715 to 10.34.208.1:53 blocked by Weave NPC.
WARN: 2020/02/10 20:35:11.722854 UDP connection from 10.40.128.4:37950 to 10.34.160.0:53 blocked by Weave NPC.
WARN: 2020/02/10 20:35:11.722868 UDP connection from 10.40.128.4:37950 to 10.34.160.0:53 blocked by Weave NPC.
WARN: 2020/02/10 20:35:11.722906 TCP connection from 10.40.128.4:40068 to 10.39.176.6:8080 blocked by Weave NPC.
WARN: 2020/02/10 20:35:11.722920 TCP connection from 10.40.128.4:50472 to 10.40.208.14:8080 blocked by Weave NPC.
WARN: 2020/02/10 20:35:11.722936 UDP connection from 10.40.128.4:55513 to 10.34.160.0:53 blocked by Weave NPC.
WARN: 2020/02/10 20:35:11.722949 UDP connection from 10.40.128.4:55513 to 10.34.160.0:53 blocked by Weave NPC.
root@ip-10-21-157-154:~# iptables-save | grep DROP
:KUBE-MARK-DROP - [0:0]
-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
-A FORWARD -o weave -j DROP
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
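
For longer dumps, the DROP rules can be grouped by chain rather than read off raw `grep` output. This is a rough Python sketch, not a weave tool: the helper name `drop_rules_by_chain` is hypothetical, and it only looks at `-A <chain> ... -j DROP` lines in `iptables-save` text without assuming any particular Weave chain names.

```python
def drop_rules_by_chain(iptables_save_text):
    """Group rules that end in '-j DROP' by the chain they are appended to."""
    rules = {}
    for line in iptables_save_text.splitlines():
        parts = line.split()
        # Keep only appended rules whose target is DROP, e.g.
        # "-A FORWARD -o weave -j DROP"; chain declarations (":NAME ...")
        # and rules with other targets are skipped.
        if len(parts) >= 4 and parts[0] == "-A" and parts[-2:] == ["-j", "DROP"]:
            rules.setdefault(parts[1], []).append(line.strip())
    return rules
```

Applied to the output above, this reports one DROP rule each in the FORWARD, DOCKER-ISOLATION-STAGE-2 and KUBE-FIREWALL chains.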

murali-reddy commented 4 years ago

@kostyrev please open a separate bug with complete weave-npc logs, dumps of iptables and ipset, and details of any network policies applied.

kostyrev commented 4 years ago

@murali-reddy thanks, but I decided to just disable NPC because I don't use network policies anyway.

Just in case anyone needs a quick fix:

kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')&disable-npc=true"

jim-barber-he commented 4 years ago

This is also an issue for us. I have a v1.17.6 Kubernetes cluster deployed via kops 1.17.0 that installs weave 2.6.2

root 7719 7645 0 Jun08 ? 00:06:24 /home/weave/weaver --port=6783 --datapath=datapath --mtu 8192 --name=7e:4c:54:dc:81:1b --host-root=/host --http-addr=127.0.0.1:6784 --metrics-addr=0.0.0.0:6782 --docker-api= --no-dns --db-prefix=/weavedb/weave-net --ipalloc-range=100.96.0.0/11 --nickname=ip-10-8-35-150.ap-southeast-2.compute.internal --ipalloc-init consensus=30 --conn-limit=200 --no-masq-local 10.8.100.217 10.8.102.100 10.8.105.163 10.8.106.32 10.8.108.112 10.8.110.211 10.8.111.90 10.8.112.71 10.8.118.86 10.8.124.120 10.8.124.43 10.8.39.86 10.8.46.102 10.8.46.90 10.8.48.75 10.8.51.72 10.8.55.11 10.8.57.211 10.8.62.15 10.8.66.92 10.8.81.27 10.8.82.52 10.8.83.91 10.8.87.206 10.8.87.222 10.8.89.62 10.8.90.119 10.8.90.217 10.8.92.176 10.8.96.48


- Relevant parts from the kops network configuration for the cluster with some `<-` comments showing what uses the subnets.

additionalNetworkCIDRs:

Following are random examples of log entries showing blocked connections.

To pods/services in the same Kube cluster...

WARN: 2020/06/08 06:10:20.698359 TCP connection from 100.96.96.8:39446 to 100.116.224.20:8200 blocked by Weave NPC.
WARN: 2020/06/08 06:03:56.195224 TCP connection from 100.105.32.2:53362 to 100.106.0.0:9200 blocked by Weave NPC.

To nodes in the cluster...

WARN: 2020/06/08 06:10:20.698315 TCP connection from 100.96.96.9:44590 to 10.8.53.59:443 blocked by Weave NPC.

To AWS Redis, RDS, & Memcached within the VPC...

WARN: 2020/06/08 06:09:19.329505 TCP connection from 100.98.96.6:53104 to 10.0.1.16:6379 blocked by Weave NPC.
WARN: 2020/06/08 06:07:21.137738 TCP connection from 100.98.32.5:37366 to 10.0.11.188:5432 blocked by Weave NPC.
WARN: 2020/06/08 06:07:17.916156 TCP connection from 100.121.224.2:51984 to 10.0.11.99:11211 blocked by Weave NPC.

To a public address..

WARN: 2020/06/08 06:05:19.486866 TCP connection from 100.122.0.15:50504 to 3.105.216.37:443 blocked by Weave NPC.

Plenty of connections were working too, otherwise we would have had a full outage. These blocks appear to be random, and result in semi-regular timeouts in various parts of our applications.

This is a production cluster so I've removed the NPC container from the daemonset afterwards (and set EXPECT_NPC=0 on the weave container) since we have no network policies anyway, and this stops the traffic from being blocked.

bboreham commented 4 years ago

@jim-barber-he Are the connections really blocked, or are you just seeing log messages saying they are?
It could be a timing thing: the first packet can be dropped before netfilter is updated, but TCP will retry shortly afterward and nothing is really blocked.
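
One rough way to tell these two cases apart from the logs is to group the "blocked" lines by connection and see how long the same connection keeps being logged. The sketch below is Python and only a heuristic: the function name `blocked_spans` and the regular expression are assumptions based on the log format quoted in this thread, and the interpretation of the result is not something weave-npc itself reports.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed line shape: "<YYYY/MM/DD HH:MM:SS.ffffff> TCP connection from A:P to B:Q blocked by Weave NPC"
LINE_RE = re.compile(
    r"(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) "
    r"(?P<proto>TCP|UDP) connection from (?P<src>\S+) to (?P<dst>\S+) blocked by Weave NPC"
)


def blocked_spans(lines):
    """Map (proto, src, dst) to (number of log entries, seconds between first and last)."""
    seen = defaultdict(list)
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            ts = datetime.strptime(m["ts"], "%Y/%m/%d %H:%M:%S.%f")
            seen[(m["proto"], m["src"], m["dst"])].append(ts)
    return {
        key: (len(stamps), (max(stamps) - min(stamps)).total_seconds())
        for key, stamps in seen.items()
    }
```

A connection that is logged once fits the first-packet explanation. A connection logged repeatedly over several seconds, such as 172.20.34.171:38510 to 100.104.0.5:18000 in the original report (three entries across roughly 2.8 seconds), suggests that retries are being dropped as well.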

jim-barber-he commented 4 years ago

We seem to have connection timeouts occurring in our jobs, so it looks like they are really blocked.

bboreham commented 4 years ago

@jim-barber-he please open a new issue and supply logs.

jim-barber-he commented 4 years ago

All I've got is what I supplied above. Production is the only cluster that gets enough traffic to show the problem, and we're not willing to run weave-npc on it because it's causing issues with our production applications.

bcollard commented 4 years ago

Hello, same problem here.

k version --short
Client Version: v1.17.3
Server Version: v1.17.3

Logs:

weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.770207 UDP connection from 10.35.0.0:48243 to 10.32.0.3:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.770225 TCP connection from 10.35.0.0:56854 to 10.42.0.26:8080 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.770240 UDP connection from 10.35.0.2:50157 to 10.32.0.3:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.770252 UDP connection from 10.35.0.2:41309 to 10.32.0.3:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.827087 UDP connection from 10.35.0.2:44588 to 10.32.0.3:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.839587 UDP connection from 10.35.0.4:50070 to 10.32.0.5:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.839610 TCP connection from 10.35.0.1:41222 to 10.126.240.5:3128 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.839889 UDP connection from 10.35.0.4:47313 to 10.32.0.5:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.839909 UDP connection from 10.35.0.4:51281 to 10.32.0.3:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.839920 UDP connection from 10.35.0.4:49717 to 10.32.0.3:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.839933 TCP connection from 10.35.0.4:58080 to 10.126.240.5:3128 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.900280 UDP connection from 10.35.0.2:53244 to 10.32.0.3:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.900320 UDP connection from 10.35.0.2:39520 to 10.32.0.5:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.900334 UDP connection from 10.35.0.2:34770 to 10.32.0.3:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.900345 UDP connection from 10.35.0.2:42984 to 10.32.0.5:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.900362 TCP connection from 10.35.0.2:49844 to 10.126.240.5:3128 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.900373 UDP connection from 10.35.0.7:56910 to 10.32.0.5:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.900383 UDP connection from 10.35.0.7:43250 to 10.32.0.3:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.900397 UDP connection from 10.35.0.7:41626 to 10.32.0.3:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.900415 UDP connection from 10.35.0.7:39687 to 10.32.0.3:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.951314 UDP connection from 10.35.0.7:53369 to 10.32.0.5:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.951332 TCP connection from 10.35.0.7:45138 to 10.126.240.5:3128 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.951343 UDP connection from 10.35.0.0:52342 to 10.32.0.3:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.951355 UDP connection from 10.35.0.0:42649 to 10.32.0.5:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.951365 UDP connection from 10.35.0.0:42744 to 10.32.0.3:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.951386 UDP connection from 10.35.0.0:53798 to 10.32.0.5:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.951405 TCP connection from 10.35.0.0:42222 to 10.40.0.10:5432 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.951437 UDP connection from 10.35.0.0:46769 to 10.32.0.3:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:11.951448 UDP connection from 10.35.0.0:49686 to 10.32.0.3:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:12.035726 UDP connection from 10.35.0.0:38544 to 10.32.0.3:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:12.035746 TCP connection from 10.35.0.0:42224 to 10.40.0.10:5432 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:12.035757 TCP connection from 10.35.0.0:42226 to 10.40.0.10:5432 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:12.035768 TCP connection from 10.35.0.0:42228 to 10.40.0.10:5432 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:12.035780 TCP connection from 10.35.0.0:42232 to 10.40.0.10:5432 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:12.035792 TCP connection from 10.35.0.0:42230 to 10.40.0.10:5432 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:12.035805 UDP connection from 10.35.0.0:42616 to 10.32.0.5:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:12.035817 UDP connection from 10.35.0.0:56818 to 10.32.0.5:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:12.035828 UDP connection from 10.35.0.0:47116 to 10.32.0.3:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:13.038099 UDP connection from 10.35.0.7:57161 to 10.32.0.5:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:13.038112 UDP connection from 10.35.0.7:35883 to 10.32.0.5:53 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:13.038130 TCP connection from 10.35.0.7:36762 to 10.40.0.10:5432 blocked by Weave NPC.
weave-net-z9sqg weave-npc WARN: 2020/06/29 20:30:13.038151 TCP connection from 10.35.0.0:36168 to 50.17.235.25:443 blocked by Weave NPC

ks exec -it weave-net-npt9n -c weave -- /home/weave/weave --local status

        Version: 2.6.0 (failed to check latest version - see logs; next check at 2020/06/30 03:32:26)

        Service: router
       Protocol: weave 1..2
           Name: 44:1f:c6:a6:ed:b3(********)
     Encryption: disabled
  PeerDiscovery: enabled
        Targets: 5
    Connections: 5 (5 established)
          Peers: 6 (with 30 established connections)
 TrustedSubnets: none

        Service: ipam
         Status: ready
          Range: 10.32.0.0/12
  DefaultSubnet: 10.32.0.0/12

It looks like connections (both TCP and UDP) are randomly denied.

Please tell me which commands I can run to help debug this.

bcollard commented 4 years ago

I really like Weave products (net, flux, scope) but, for now, I think I will disable Weave NPC. That's a shame because I was setting up Network Policies. I guess I'll try to do it differently, with Istio.

Here are some more traces of random traffic blocked:

weave-net-g4dpt weave-npc WARN: 2020/07/06 16:10:06.742199 UDP connection from 10.244.32.5:47604 to 10.244.192.1:53 blocked by Weave NPC.
weave-net-sgcc9 weave-npc WARN: 2020/07/06 16:10:07.637841 TCP connection from 10.244.192.14:54080 to 10.40.0.10:5432 blocked by Weave NPC.
weave-net-sgcc9 weave-npc WARN: 2020/07/06 16:10:07.637899 UDP connection from 10.244.192.8:40649 to 10.244.32.1:53 blocked by Weave NPC.
weave-net-sgcc9 weave-npc WARN: 2020/07/06 16:10:07.637915 UDP connection from 10.244.192.8:47895 to 10.244.192.1:53 blocked by Weave NPC.
weave-net-sgcc9 weave-npc WARN: 2020/07/06 16:10:07.637938 UDP connection from 10.244.192.8:34244 to 10.244.32.1:53 blocked by Weave NPC.
weave-net-sgcc9 weave-npc WARN: 2020/07/06 16:10:07.637952 UDP connection from 10.244.192.8:33357 to 10.244.192.1:53 blocked by Weave NPC.
weave-net-sgcc9 weave-npc WARN: 2020/07/06 16:10:07.637963 TCP connection from 10.244.192.8:48736 to 10.40.0.10:5432 blocked by Weave NPC.
weave-net-g4dpt weave-npc WARN: 2020/07/06 16:10:07.737999 TCP connection from 10.244.32.5:42616 to 50.17.235.25:443 blocked by Weave NPC.

All of the network policies I've applied to my cluster have a port definition, and none of the ports in the traces above (53, 5432, 443) are part of them. So I have no clue how or why these connections are denied by Weave NPC.
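
To compare the traces against the ports named in the network policies, the blocked destination ports can be tallied from the logs. This is a small Python sketch under the same assumptions about the log format as the lines above; `blocked_dest_ports` is a hypothetical helper name, not a weave command.

```python
import re
from collections import Counter

# Assumed line shape: "... UDP connection from A:P to B:Q blocked by Weave NPC"
DEST_RE = re.compile(
    r"(?P<proto>TCP|UDP) connection from \S+ to [\d.]+:(?P<dport>\d+) blocked by Weave NPC"
)


def blocked_dest_ports(lines):
    """Count blocked-connection log entries per (protocol, destination port)."""
    counts = Counter()
    for line in lines:
        m = DEST_RE.search(line)
        if m:
            counts[(m["proto"], int(m["dport"]))] += 1
    return counts
```

Calling `blocked_dest_ports(log_lines).most_common()` gives the ports ordered by how often they appear, which can then be checked by hand against the policy definitions.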

FYI, today I bumped version from 2.6.0 to 2.6.5 with the same result :-/

Finally, I know this may be an inappropriate comment, and I'm not sure about the following, but thinking about this problem again, I remember that during my CKA exam I randomly had issues resolving a service name from a busybox pod (port 53/UDP to coredns). After the exam, I could not find out why some of these lookups failed. Now I wonder whether that was related to traffic sometimes being denied by the weave-net CNI plugin. The point is that the raw output of this nslookup command is tested against the expected correct output. That was the only issue I had during the exam (passed with 96%, first attempt). I hope I'm wrong, but that's definitely NOT what you want to hear from k8s users.

bboreham commented 4 years ago

Please open a new issue, follow the issue template to supply requested information, and do not truncate the logs.

Omniscience619 commented 3 years ago

Same issue here. At first, pods became unable to resolve hostnames for Consul pods. After restarting weave and making sure all services were running, some pods weren't able to connect to redis. Checking the logs with --tail=50, I found recent failure entries identical to @bcollard's. I'm using the workaround suggested by @kostyrev for now, and will reset the whole Weave Net installation.

1kaushik1 commented 3 years ago

We are experiencing the same issue on 2 of our clusters (even though all our clusters have the same configuration).

WARN: 2021/06/15 11:57:51.527538 UDP connection from xx.xx.xx.52:33703 to xx.xx.xx.212:8125 blocked by Weave NPC.
WARN: 2021/06/15 11:57:51.527556 UDP connection from xx.xx.xx.52:33703 to xx.xx.xx.212:8125 blocked by Weave NPC.
WARN: 2021/06/15 11:57:51.527570 UDP connection from xx.xx.xx.52:33703 to xx.xx.xx.212:8125 blocked by Weave NPC.
WARN: 2021/06/15 11:57:51.527579 UDP connection from xx.xx.xx.52:33703 to xx.xx.xx.212:8125 blocked by Weave NPC.

The IPs in the logs are not associated with any pods. The only thing that stood out was the destination port: 8125 is a hostPort.

name: monit-agent
ports:
- containerPort: 8125
  hostPort: 8125
  name: monitoringport

However, we have enabled hostPort support in our CNI configuration:

{
    "cniVersion": "0.3.0",
    "name": "weave",
    "plugins": [
        {
            "name": "weave",
            "type": "weave-net",
            "hairpinMode": true
        },
        {
            "type": "portmap",
            "capabilities": {"portMappings": true},
            "snat": true
        }
    ]
}

We are not seeing any impact from these warnings, but we are trying to understand what causes them. And yes, we are not using network policies either.

Any help on this is appreciated :)

Yayg commented 1 year ago

Hello,

I'm running into the same issue on a production cluster with Weave Net.

We applied network policies in dev and they worked well. When we applied them to our production cluster, the network policies did not block egress connections at first. After a few trials and redeployments, the pods moved to another node, and from then on no packets could get out at all.

We even removed the network policy, but the pods are still unable to communicate with other pods, even inside the same namespace.

A working NPC is essential for us, so if this does not work correctly we would have to change CNI :/

Is it related to https://github.com/weaveworks/weave/issues/3586 somehow?

Thanks,