k3s-io / k3s

Lightweight Kubernetes
https://k3s.io
Apache License 2.0

Issue with built in load balancer #1216

Closed nemonik closed 3 years ago

nemonik commented 4 years ago

Version:

All versions since v1.0.0.

Describe the bug

With v0.9.x, I was able to spin up a server (whose IP is 192.168.0.11) on a Vagrant VM, spin up an app behind the built-in K3s LoadBalancer on the server node, add an agent (whose IP is 192.168.0.10) on another Vagrant VM to form a cluster, and then access the app from the server node, the agent node, and the host running the VirtualBox hypervisor on which the server and agent run. Since v1.0.0, as soon as the agent is added, response times for requests sent from the agent to an app orchestrated on the server spike to well over a minute, resulting in clients timing out. If the app is exposed via a NodePort there is no problem, but the higher port numbers present a problem to novice users.

My use case needs the apps to be accessible once ssh'ed into the agent, as I teach a hands-on DevOps class using K3s to host all the applications in a two-node cluster, with apps initially being spun up on the server and the agent being added afterwards. The class is taught in a resource-constrained environment (laptops and desktop PCs), so students also use the agent VM for development. This approach worked perfectly pre-v1.0.0 and broke with the release of v1.0.0.

For example, when the students use the git command-line client on the development vagrant (also an agent), requests sent to GitLab hosted in the cluster fail, because GitLab orchestrated by K3s takes too long to respond. This only happens on the agent. GitLab responds quickly to requests from the host running VirtualBox, as it does to requests from the toolchain vagrant. And if you send the request from the agent vagrant to the development vagrant's own IP and port, the application also responds quickly.

For example, it takes over a minute for GitLab to respond from the development vagrant (the agent node): 1 minute 3.266 seconds, as per

[vagrant@development tmp]$ time wget http://192.168.0.11:10080
--2019-12-18 15:41:27--  http://192.168.0.11:10080/
Connecting to 192.168.0.11:10080... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://192.168.0.11:10080/users/sign_in [following]
--2019-12-18 15:42:31--  http://192.168.0.11:10080/users/sign_in
Reusing existing connection to 192.168.0.11:10080.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘index.html.8’
    [ <=>                                                                                       ] 9,344       --.-K/s   in 0s
2019-12-18 15:42:31 (193 MB/s) - ‘index.html.8’ saved [9344]
real    1m3.266s
user    0m0.000s
sys 0m0.012s

Whereas from the toolchain vagrant (also the server node) the response time is 0.103 seconds, as per

[vagrant@toolchain tmp]$ time wget http://192.168.0.11:10080
--2019-12-18 15:45:11--  http://192.168.0.11:10080/
Connecting to 192.168.0.11:10080... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://192.168.0.11:10080/users/sign_in [following]
--2019-12-18 15:45:11--  http://192.168.0.11:10080/users/sign_in
Reusing existing connection to 192.168.0.11:10080.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘index.html’

    [ <=>                                                                                       ] 9,344       --.-K/s   in 0s      

2019-12-18 15:45:11 (161 MB/s) - ‘index.html’ saved [9344]

real 0m0.103s
user 0m0.006s
sys 0m0.016s

If you send the request to the development vagrant's IP and the application's port, the response time matches the other favorable response times

[vagrant@development tmp]$ time wget http://192.168.0.10:10080
--2019-12-18 16:31:29--  http://192.168.0.10:10080/
Connecting to 192.168.0.10:10080... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://192.168.0.10:10080/users/sign_in [following]
--2019-12-18 16:31:29--  http://192.168.0.10:10080/users/sign_in
Reusing existing connection to 192.168.0.10:10080.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘index.html.9’

    [ <=>                                                                                       ] 9,344       --.-K/s   in 0s      

2019-12-18 16:31:29 (162 MB/s) - ‘index.html.9’ saved [9344]

real 0m0.096s
user 0m0.005s
sys 0m0.014s

If you pop onto the development vagrant as it is being provisioned and loop over a wget http://192.168.0.11:10080 to access GitLab orchestrated on the server node, you will see the same quick response time up until the agent is added to the cluster, after which it takes an inordinate amount of time for GitLab to respond. If you expose GitLab via a NodePort, responses come quickly no matter where you are.
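
Something like this crude loop (just a sketch; the IP and port are from my setup above) makes the change obvious the moment the agent joins:

while true; do
  start=$(date +%s)
  # fetch GitLab's landing page and throw the body away
  wget -q -O /dev/null http://192.168.0.11:10080
  echo "$(date) response took $(( $(date +%s) - start )) seconds"
  sleep 5
done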

To Reproduce

  1. Spin up a K3s Server.

  2. Spin up GitLab like so on the server

---

apiVersion: v1
kind: Namespace
metadata:
  name: gitlab

---

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-local-path-pvc
  namespace: gitlab
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 2Gi

---

apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  namespace: gitlab
  labels:
    app: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: sameersbn/redis:4.0.9-1
        imagePullPolicy: IfNotPresent
        ports:
        - name: redis
          containerPort: 6379
        volumeMounts:
        - mountPath: /var/lib/redis
          name: data
        livenessProbe:
          exec:
            command:
            - redis-cli
            - ping
          initialDelaySeconds: 30
          timeoutSeconds: 5
        readinessProbe:
          exec:
            command:
            - redis-cli
            - ping
          initialDelaySeconds: 5
          timeoutSeconds: 1
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: redis-local-path-pvc

---

apiVersion: v1
kind: Service
metadata:
  name: redis
  namespace: gitlab
spec:
  ports:
    - name: redis
      targetPort: redis
      port: 6379
  selector:
    app: redis

---

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgresql-local-path-pvc
  namespace: gitlab
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 2Gi

---

apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgresql
  namespace: gitlab
  labels:
    app: postgresql
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgresql
  template:
    metadata:
      labels:
        app: postgresql
    spec:
      containers:
      - name: postgresql
        image: sameersbn/postgresql:10-2
        imagePullPolicy: IfNotPresent
        env:
        - name: DB_NAME
          value: gitlabhq_production
        - name: DB_USER
          value: gitlab
        - name: DB_PASS
          value: password
        - name: DB_EXTENSION
          value: pg_trgm
        ports:
        - name: postgres
          containerPort: 5432
        volumeMounts:
        - mountPath: /var/lib/postgresql
          name: data
        livenessProbe:
          exec:
            command:
            - pg_isready
            - -h
            - localhost
            - -U
            - postgres
          initialDelaySeconds: 30
          timeoutSeconds: 5
        readinessProbe:
          exec:
            command:
            - pg_isready
            - -h
            - localhost
            - -U
            - postgres
          initialDelaySeconds: 5
          timeoutSeconds: 1
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: postgresql-local-path-pvc
---

apiVersion: v1
kind: Service
metadata:
  name: postgresql
  namespace: gitlab
spec:
  ports:
    - name: postgres
      port: 5432
      targetPort: postgres
  selector:
    app: postgresql

---

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gitlab-local-path-pvc
  namespace: gitlab
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 2Gi

---

apiVersion: apps/v1
kind: Deployment  
metadata:
  name: gitlab
  namespace: gitlab
  labels:
    app: gitlab
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gitlab
  template:
    metadata:
      labels:
        app: gitlab
    spec:
      containers:
      - name: gitlab
        image: sameersbn/gitlab:12.5.2
        imagePullPolicy: IfNotPresent
        env:
        - name: GITLAB_HOST
          value: 192.168.0.11
        - name: GITLAB_PORT
          value: "10080"
        - name: GITLAB_SSH_PORT
          value: "10022"
        - name: GITLAB_SECRETS_DB_KEY_BASE
          value: long-and-random-alpha-numeric-string
        - name: GITLAB_SECRETS_SECRET_KEY_BASE
          value: long-and-random-alpha-numeric-string
        - name: GITLAB_SECRETS_OTP_KEY_BASE
          value: long-and-random-alpha-numeric-string
        - name: TZ
          value: Eastern Time (US & Canada)
        - name: GITLAB_TIMEZONE
          value: Eastern Time (US & Canada)
        - name: DB_ADAPTER
          value: postgresql
        - name: DB_ENCODING
          value: unicode
        - name: DB_HOST
          value: postgresql
        - name: DB_PORT
          value: "5432"
        - name: DB_NAME
          value: gitlabhq_production
        - name: DB_USER
          value: gitlab
        - name: DB_PASS
          value: password
        - name: REDIS_HOST
          value: redis
        - name: REDIS_PORT
          value: "6379"
        - name: GITLAB_ROOT_PASSWORD
          value: password
        ports:
        - name: http
          containerPort: 80
        - name: ssh
          containerPort: 22
        volumeMounts:
        - mountPath: /home/git/data
          name: data
        livenessProbe:
          httpGet:
            path: /favicon.ico
            port: 80
          initialDelaySeconds: 180
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /favicon.ico
            port: 80
          initialDelaySeconds: 5
          timeoutSeconds: 1
          failureThreshold: 12
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: gitlab-local-path-pvc

---

apiVersion: v1
kind: Service
metadata:
  name: gitlab
  namespace: gitlab
spec:
  ports:
    - name: http
      targetPort: http
      port: 10080
    - name: ssh
      targetPort: ssh
      port: 10022
  selector:
    app: gitlab
  type: LoadBalancer
  externalTrafficPolicy: Cluster

  3. On your host or the server node execute time wget http://<ip of server>:10080 (in my example the IP of the server is 192.168.0.11) and note how quickly GitLab responds.
  4. Spin up an agent VM or vagrant.
  5. Execute time wget http://<ip of server>:10080 on the agent and note how quickly GitLab responds.
  6. Add the agent to the cluster.
  7. Execute time wget http://<ip of server>:10080 on the agent and note how quickly GitLab responds.

The time it takes to receive a response in steps 3 and 5, compared to step 7, should be very different: 3 and 5 are wicked fast, 7 is wicked slow.

Expected behavior

Apps hosted on the cluster behind the K3s built-in LoadBalancer should respond in the same fashion no matter where you access them from.

Actual behavior

If you spin up a server, spin up an app behind the built-in K3s LoadBalancer on the server node, add an agent to form a cluster, and then attempt to access the app from the agent, the app takes far too long to respond compared to the response times from the host running the two VMs (server and agent) and from the server itself.

Additional context

My course material is here https://github.com/nemonik/hands-on-DevOps

erikwilson commented 4 years ago

Thanks for the details @nemonik. Are you launching k3s with the --flannel-iface flag by chance? With Vagrant I think there is a secondary interface that is typically used for inter-node communication. For the Vagrantfile (w/ Alpine) in this repo I typically launch with --flannel-iface=eth1 for multi-node configurations; for other operating systems the interface will probably be different.
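
Roughly what I have in mind is the following shape (a sketch using the standard install script; substitute your own version pin, cluster secret, and whichever interface your private network actually lands on):

# server: bind flannel to the Vagrant private/host-only interface
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--flannel-iface=eth1 --cluster-secret=<secret>" sh -

# agent: join over that same interface (setting K3S_URL makes the script install an agent)
curl -sfL https://get.k3s.io | K3S_URL=https://192.168.0.11:6443 K3S_CLUSTER_SECRET=<secret> INSTALL_K3S_EXEC="--flannel-iface=eth1" sh -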

nemonik commented 4 years ago

Thank you @erikwilson.

Yes, I do launch via the --flannel-iface flag like so

https://github.com/nemonik/hands-on-DevOps/blob/master/ansible/roles/k3s-server/tasks/main.yml#L84

  - name: install K3s
    become: yes
    shell: INSTALL_K3S_VERSION={{ k3s_version }} INSTALL_K3S_EXEC="--flannel-iface={{ k3s_flannel_iface }} --cluster-secret={{ k3s_cluster_secret }} --docker --no-deploy traefik" /home/{{ ansible_user_id }}/k3s/install_k3s.sh

Where k3s_flannel_iface is defined as eth1 as per

https://github.com/nemonik/hands-on-DevOps/blob/master/ansible_extra_vars.rb#L26

On the agent, ip addr returns the following interfaces:

[vagrant@development ~]$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:8a:fe:e6 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global noprefixroute dynamic eth0
       valid_lft 73531sec preferred_lft 73531sec
    inet6 fe80::5054:ff:fe8a:fee6/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:38:45:bd brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.10/24 brd 192.168.0.255 scope global noprefixroute eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe38:45bd/64 scope link
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:11:d7:9e:44 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
5: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether 0e:fa:ed:6d:70:73 brd ff:ff:ff:ff:ff:ff
    inet 10.42.1.0/32 brd 10.42.1.0 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::cfa:edff:fe6d:7073/64 scope link
       valid_lft forever preferred_lft forever
6: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default qlen 1000
    link/ether 72:ae:ba:59:bf:b9 brd ff:ff:ff:ff:ff:ff
    inet 10.42.1.1/24 scope global cni0
       valid_lft forever preferred_lft forever
    inet6 fe80::70ae:baff:fe59:bfb9/64 scope link
       valid_lft forever preferred_lft forever
8: veth7164902d@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP group default
    link/ether 16:5f:20:a4:22:4f brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::145f:20ff:fea4:224f/64 scope link
       valid_lft forever preferred_lft forever
9: veth6281ef87@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP group default
    link/ether 46:bd:4e:12:22:5e brd ff:ff:ff:ff:ff:ff link-netnsid 2
    inet6 fe80::44bd:4eff:fe12:225e/64 scope link
       valid_lft forever preferred_lft forever
11: vethd5f7b5b8@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP group default
    link/ether c2:a5:18:7e:b2:3b brd ff:ff:ff:ff:ff:ff link-netnsid 4
    inet6 fe80::c0a5:18ff:fe7e:b23b/64 scope link
       valid_lft forever preferred_lft forever
12: vethef3d9b3e@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP group default
    link/ether ae:9d:c3:6a:6e:61 brd ff:ff:ff:ff:ff:ff link-netnsid 5
    inet6 fe80::ac9d:c3ff:fe6a:6e61/64 scope link
       valid_lft forever preferred_lft forever
13: veth94f60f08@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP group default
    link/ether e6:0f:47:c8:ac:60 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::e40f:47ff:fec8:ac60/64 scope link
       valid_lft forever preferred_lft forever
14: veth990e47aa@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP group default
    link/ether 0a:46:48:1d:a9:31 brd ff:ff:ff:ff:ff:ff link-netnsid 6
    inet6 fe80::846:48ff:fe1d:a931/64 scope link
       valid_lft forever preferred_lft forever
15: vethcc98fdb2@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP group default
    link/ether a6:b2:d2:f0:10:03 brd ff:ff:ff:ff:ff:ff link-netnsid 3
    inet6 fe80::a4b2:d2ff:fef0:1003/64 scope link
       valid_lft forever preferred_lft forever

The networking between the vagrants is defined as follows in the Vagrantfile:

https://github.com/nemonik/hands-on-DevOps/blob/master/Vagrantfile#L159

  config.vm.define 'toolchain' do |toolchain|
    toolchain.vm.box = 'nemonik/devops'
    toolchain.vm.network :private_network, ip: '192.168.0.11'
    toolchain.vm.hostname = 'toolchain'
    toolchain.vm.synced_folder '.', '/vagrant', type: 'virtualbox', owner: "vagrant", group: "vagrant", mount_options: ["dmode=775,fmode=664"]
    toolchain.vm.provider :virtualbox do |virtualbox|
      virtualbox.name = 'DevOps Class - toolchain'
      virtualbox.customize ['guestproperty', 'set', :id, '/VirtualBox/GuestAdd/VBoxService/--timesync-set-threshold', 10]
      virtualbox.memory = 8192 #6144 #4096
      virtualbox.cpus = 8 #4
      virtualbox.gui = false
    end

and

https://github.com/nemonik/hands-on-DevOps/blob/master/Vagrantfile#L183

  ## Provision development vagrant
  config.vm.define "development" do |development|
    development.vm.box = 'nemonik/devops'
    development.vm.network :private_network, ip: '192.168.0.10'
    development.vm.hostname = 'development'
    development.vm.synced_folder '.', '/vagrant', type: 'virtualbox', owner: "vagrant", group: "vagrant", mount_options: ["dmode=775,fmode=664"]
    development.vm.provider :virtualbox do |virtualbox|
      virtualbox.name = 'DevOps Class - development'
      virtualbox.customize ['guestproperty', 'set', :id, '/VirtualBox/GuestAdd/VBoxService/--timesync-set-threshold', 10]
      virtualbox.memory = 2048
      virtualbox.cpus = 2
      virtualbox.gui = false
    end

nemonik commented 4 years ago

I did try to resolve the problem by separating each app's LoadBalancer Service resource out into its own resource file, so that after the agent is added I can delete, for example, GitLab's Service of type LoadBalancer and re-apply it, in the hope that the LoadBalancer would perform as it did with v0.9.x. Doing so does not solve the problem:

---
- name: restart each app's loadbalancer
  shell: ssh -oStrictHostKeyChecking=no -i /home/{{ ansible_user_id }}/.ssh/id_rsa {{ ansible_user_id }}@{{ hostvars['toolchain']['ansible_host'] }} "KUBECONFIG=/home/{{ ansible_user_id }}/kubeconfig.yml kubectl delete -f /home/{{ ansible_user_id }}/{{ item }} && KUBECONFIG=/home/{{ ansible_user_id }}/kubeconfig.yml kubectl apply -f /home/{{ ansible_user_id }}/{{ item }}"
  with_items:
    - "docker-registry/registry_service.yml"
    - "gitlab/gitlab_service.yml"
    - "plantuml-server/plantuml-server_service.yml"
    - "sonarqube/sonarqube_service.yml"
    - "taiga/taiga_service.yml"

Short of this being a regression in K3s between v0.9.x and v1.0.x (the same problem exists in the next release candidate of K3s), I was going to try spinning up the cluster first and then deploying the apps, instead of spinning up the server, deploying the apps, and then adding the agent. If this works it will lead to me refactoring the course automation.

erikwilson commented 4 years ago

Thanks for the details and help with debugging. I would like to see a very minimal example of the issue, maybe using the built-in CoreDNS service. It is good to know that you are using the --docker flag; does it happen if you use the built-in containerd instead? What OS are you using? Does the issue happen with v0.10.x?
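
A minimal check could even be run from the agent against CoreDNS, something like this (10.43.0.10 is the default cluster DNS address in k3s, so confirm it first):

# confirm the CoreDNS service IP (10.43.0.10 by default)
kubectl -n kube-system get svc kube-dns

# from the agent, time a lookup against that service IP
time nslookup kubernetes.default.svc.cluster.local 10.43.0.10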

nemonik commented 4 years ago

I am using the built-in CoreDNS. I'll try paring the project down to a minimal example.

Is using Ansible to configure things okay? It would allow me to re-use my roles, pared down quite a bit of course.

I will also try using the built-in containerd after trying docker... I selected docker so that students were not confronted with two container runtimes and had a full view into what containers were running, using tools they had a greater chance of being at least remotely familiar with. The project's choice of supporting both runtimes is swell.

I am using CentOS 7.

If the Vagrantfile doesn't find a Vagrant base box named nemonik/devops, it will build the base box as per

https://github.com/nemonik/hands-on-DevOps/tree/master/box

by retrieving the centos/7 base box and then installing some common packages, docker (and configuring it), docker-compose, and a few other tools.

nemonik commented 4 years ago

Let me try v0.10.0... I jumped from v0.9.1 to v1.0.0. That would be an easy try... Lemme try that first and get back to you.

nemonik commented 4 years ago

I'll start with v0.10.2 and walk my way back to v0.9.1 until the problem goes away.

erikwilson commented 4 years ago

Thanks for the info! I was trying to parse through the repo but figured it easier to just ask some questions. The usual suspects for me would be the flags, network config, iptables (as you probably know, it should be legacy), or even the kernel (might be worth trying Ubuntu). Sorry I don't have a better answer; thanks very much for helping to debug this, I would very much like to get to the root cause and find a fix.

nemonik commented 4 years ago

I pared things down to a simple project, but the problem wasn't there in the first run. I did change a few things in the process, though, so I'm folding those changes back into the full project to see if what I changed actually addressed the problem or if this was a fluke.

nemonik commented 4 years ago

So, my success appears to have been either a brain fart or a one-time fluke.

The load balancer is really slow... on the agent. Here is a minimal example as you requested

https://github.com/nemonik/k3s-issue-1216

Run

./test.sh

You will need vagrant and virtualbox installed. Tested on OS X, but should work on Windows and Linux.

nemonik commented 4 years ago

All pushed... Please let me know if you need help. Why CentOS? Why not Alpine or Ubuntu? Well, CentOS tracks RHEL, and for RHEL you have STIGs ("Security Technical Implementation Guides"). For enterprise and government, the pedigree of RHEL, and therefore of CentOS as it tracks RHEL, is far higher and better received than that of others that have no STIGs or only emergent STIGs.

nemonik commented 4 years ago

I also included a role to configure GitLab... but the same problem can be demonstrated via a simple hello-world web app container. Again, the problem is from the agent: if you send a request to the app through the server IP, the response takes far more time than it should, vice sending the request from the server or the host underlying the VMs. Also, if you send the request to the agent while on the agent, the response is immediate.
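
A bare-bones version of that hello-world check can be as small as the following (a sketch, not the exact manifests from my repo; the image and names are arbitrary):

# throwaway httpd deployment exposed through the built-in service load balancer
kubectl create deployment hello --image=httpd:2.4
kubectl expose deployment hello --port=80 --target-port=80 --type=LoadBalancer

# fast from the host and the server, over a minute from the agent once it has joined
time wget -q -O /dev/null http://192.168.0.11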

nemonik commented 4 years ago

If you switch to the deploy_app_from_agent branch you can see the exact same thing happens when you deploy the httpd app from the agent. The httpd app will not respond in a timely fashion (it will respond, but it takes some time) to requests sent from the agent, yet it responds quickly to requests sent from the host underlying the VMs and from the server, so the httpd Ansible role's task

  - name: wait for httpd to be available
    uri:
      url: http://{{ hostvars['server']['ansible_host'] }}
      status_code: 200
    register: result
    until: result.status == 200
    retries: 60
    delay: 5

will ultimately fail when executed on the agent.

erikwilson commented 4 years ago

Thanks for the info. There is still too much going on here; I shouldn't need to check out a repo to understand the problem. What are the server args and what are the agent args? I wasn't asking for a justification for CentOS, just for more data points. Did you try it with containerd?

nemonik commented 4 years ago

Oh, gosh. Knowing that would have saved me some time.

Agent Args:

INSTALL_K3S_VERSION=v1.0.1-rc1 K3S_URL=https://192.168.0.11:6443 K3S_CLUSTER_SECRET=kluster_secret INSTALL_K3S_EXEC="--flannel-iface=eth1  --docker" /home/vagrant/k3s/install_k3s.sh agent

Server Args:

INSTALL_K3S_VERSION=v1.0.1-rc1 INSTALL_K3S_EXEC="--flannel-iface=eth1 --cluster-secret=kluster_secret --docker --no-deploy traefik" /home/vagrant/k3s/install_k3s.sh

I can certainly run the two roles without the --docker flag, but I have not yet. I will later this evening when my bandwidth at home is uncapped, and I'll come back here.

Understood. All I was saying is Ubuntu and Alpine are not possible for the reasons I gave.

erikwilson commented 4 years ago

Thank you very much for the example and all of the details. Apologies, there is just limited bandwidth for me or a QA person to understand what is happening with the repo, and as you can imagine we may not want to blindly run a script unless it is in a throw-away environment.

I am not asking you to change your OS for the lab; I just want to know if the issue exists for you in other environments.

Have you looked into iptables at all? Is it possible that a firewall is causing the problem? If a firewall is running, it would be worth temporarily disabling it on all nodes to see if that alleviates the issue.

nemonik commented 4 years ago

CentOS ships with a firewall called firewalld and it is disabled.

[vagrant@server ~]$ systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)

I'll give containerd a try this evening.

nemonik commented 4 years ago

Running with containerd on both server and agent returns the same result.

from localhost:

--2019-12-22 01:20:41--  http://192.168.0.11/
Connecting to 192.168.0.11:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 112 [text/html]
Saving to: ‘index.html.2’

index.html.2                                                                    100%[=====================================================================================================================================================================================================>]     112  --.-KB/s    in 0s

2019-12-22 01:20:41 (10.7 MB/s) - ‘index.html.2’ saved [112/112]

real    0m0.013s
user    0m0.003s
sys 0m0.005s

from server:

--2019-12-22 06:20:45--  http://192.168.0.11/
Connecting to 192.168.0.11:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 112 [text/html]
Saving to: ‘index.html’

100%[====================================================================================================================================================================================================================================================================================>] 112         --.-K/s   in 0s

2019-12-22 06:20:45 (7.61 MB/s) - ‘index.html’ saved [112/112]

real    0m0.021s
user    0m0.003s
sys 0m0.006s
Connection to 127.0.0.1 closed.

from agent:

--2019-12-22 06:20:49--  http://192.168.0.11/
Connecting to 192.168.0.11:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 112 [text/html]
Saving to: ‘index.html’

100%[====================================================================================================================================================================================================================================================================================>] 112         --.-K/s   in 0s

2019-12-22 06:21:52 (7.29 MB/s) - ‘index.html’ saved [112/112]

real    1m3.131s
user    0m0.001s
sys 0m0.015s
Connection to 127.0.0.1 closed.
nemonik commented 4 years ago

So, setting aside the fact that I can easily ssh from the agent into the server with no delay: if I spin up via docker, on port 8080, the same httpd container that I'm spinning up via kubectl behind the K3s load balancer, I find I can access that container from the host, the server, and the agent with no delay.
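
Roughly like this, to rule out the network path between the VMs itself (assuming the stock httpd image; the container name and port are just what I picked):

# run the same web server straight under docker on the server, bypassing Kubernetes entirely
docker run -d --name httpd-direct -p 8080:80 httpd:2.4

# responds instantly from the host, the server, and the agent
time wget -q -O /dev/null http://192.168.0.11:8080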

nemonik commented 4 years ago

I disabled the K3s LoadBalancer and deployed the latest MetalLB with the following configuration:

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.0.200-192.168.0.250

I redeployed the httpd container using MetalLB and there is no delay hitting the httpd container from the host, the server/master, or the agent/node. I think there is a problem in the K3s LoadBalancer.
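
For anyone wanting to reproduce that swap, the rough shape of it is below (check the MetalLB manifest URLs and version against the MetalLB docs rather than trusting mine; metallb-config.yaml is the ConfigMap above saved to a file):

# start the server with the built-in service load balancer left out
INSTALL_K3S_EXEC="--flannel-iface=eth1 --no-deploy servicelb --no-deploy traefik --docker" /home/vagrant/k3s/install_k3s.sh

# install MetalLB (the 0.9.x releases also need the memberlist secret) and the layer2 pool
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/namespace.yaml
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/metallb.yaml
kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"
kubectl apply -f metallb-config.yaml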

nemonik commented 4 years ago

I've verified this seems to be an issue limited to CentOS 7... the issue doesn't appear on vagrants running Alpine.

BlackTurtle123 commented 4 years ago

I have the same issue. As soon as I add a second node to my system, everything works fine... just not through the load balancer. I run Debian 10.2, the latest K3s version, with Traefik enabled by default.

nemonik commented 4 years ago

I'm not alone. It is nice to be not alone. :)

rafaribe commented 4 years ago

Also in the same boat @nemonik

carlosedp commented 4 years ago

Same behaviour here on Debian 10 running on VirtualBox with two interfaces, where the one that should have the services exposed is eth1.

carlosedp commented 4 years ago

Apparently it is something related to Flannel, because my deployment of Prometheus cannot scrape data from pods with services exposed on cluster IPs.

Also, my ingress routes don't work externally.

I've started K3s with --flannel-iface=eth1 as it's my external interface.

parekhha commented 4 years ago

I have 3 master nodes running CentOS 7. If I try to access a pod running on a different host using the service IP, it takes a long time; however, if I use the pod IP there is no delay. Because of this, the other 2 master nodes get timeout errors while connecting to metrics-server.
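
(The comparison I mean is along these lines, substituting the service and pod addresses of the workload in question:)

# find the service IP and the pod IP for the workload
kubectl get svc -A
kubectl get pod -A -o wide

# time the same request against each; only the service-IP path shows the delay
time curl -sk https://<service-ip>:<port>/ -o /dev/null
time curl -sk https://<pod-ip>:<port>/ -o /dev/null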

Is there any workaround for this on CentOS 7?

seanmwinn commented 4 years ago

I'm not even using flannel and experience the exact same results - with Cilium. I can easily replace the built-in service load balancer with metal-lb and everything works flawlessly. I'm using VirtualBox, my base boxes are ubuntu 19.10 with kernel 5.3.

nemonik commented 4 years ago

I did the same thing earlier, replacing it with MetalLB to get around the problem.

nemonik commented 4 years ago

I think running

sudo ethtool -K flannel.1 tx-checksum-ip-generic off

on each worker node will resolve this issue, but only temporarily, as it won't survive a reboot of the node.

See:

https://github.com/coreos/flannel/issues/1243 https://github.com/rancher/k3s/issues/1638

So, I got around the issue by setting INSTALL_K3S_EXEC="--flannel-backend=host-gw" to use host-gw vice the vxlan backend impacted by the kernel issue. K3s uses vxlan by default.
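
Concretely, against my install commands from earlier that amounts to something like this (a sketch; keep whatever other flags you already pass):

# workaround 1: per boot, on every node using flannel's vxlan backend
sudo ethtool -K flannel.1 tx-checksum-ip-generic off

# workaround 2: avoid vxlan entirely by switching the flannel backend on the server
# (host-gw assumes the nodes share an L2 segment, which the 192.168.0.0/24 host-only network does)
INSTALL_K3S_VERSION=v1.0.1-rc1 INSTALL_K3S_EXEC="--flannel-iface=eth1 --cluster-secret=kluster_secret --flannel-backend=host-gw --docker --no-deploy traefik" /home/vagrant/k3s/install_k3s.sh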

nemonik commented 4 years ago

The issue can be traced back to kubernetes/kubernetes#88986.

A patch was made to flannel to address it: https://github.com/coreos/flannel/pull/1282

erikwilson commented 4 years ago

@nemonik many thanks for keeping this issue updated, and digging in to find a solution.

It sounds like this will be fixed in the CentOS kernel, but we should disable the checksum anyway?

Am guessing we will want to add ethtool to k3s-root and call it during flannel setup, but it might be sufficient to cherry-pick or use a variant of that patch if it is accepted.

nemonik commented 4 years ago

You are welcome. I would disable... CentOS kernel updates are very infrequent.

nemonik commented 4 years ago

Hopefully the flannel patch will be merged.

janeczku commented 3 years ago

The fix for the underlying upstream issue landed in K8s v1.19 and was cherry-picked into 1.18.6. Should be good to re-test / close. https://github.com/kubernetes/kubernetes/pull/92256
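
Re-testing should just be a matter of re-installing with a K3s build based on v1.18.6 or later and repeating the timing check from the agent (the exact release tag below is an assumption; pick whichever current K3s release carries >= 1.18.6):

# on each node, re-run the installer pinned to a fixed release (keep the same flags/env as before)
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.18.6+k3s1" sh -

# then, from the agent, the earlier check should come back in well under a second
time wget -q -O /dev/null http://192.168.0.11:10080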

brandond commented 3 years ago

Correct - this should be fixed on both release and master branches.