openvswitch / ovs-issues

Issue tracker repo for Open vSwitch

OVS compatibility with its kernel module? (my kernel gets broken by systemd when restarting openvswitch?) #340

Closed legitYosal closed 1 month ago

legitYosal commented 1 month ago

I have read this doc: releases, which says that Open vSwitch userspace is not sensitive to the Linux kernel version and should build against almost any kernel, certainly against 2.6.32 and later.

But this doc seems a bit old, and I have hit a problem on my staging and production clusters that clearly started while tweaking Open vSwitch but ended up disturbing the kernel!

I have tested on Ubuntu 20 and 22, kernels 5.8 and 6.8, with OVN/OVS pairs 22.03/2.17, 22.12/3.0.0 and 24.03/3.3.0, and the latest Docker versions, and the issue remains. This is how it shows up:

  1. On an OpenStack compute node we run Nova services, Masakari services, ovn-controller, the Open vSwitch database and vswitchd, and other services; almost all of them are dockerized with host network mode and attached to some volumes.
  2. On an OVN chassis, start repeatedly restarting openvswitch_db and openvswitch_vswitchd (note that restarts like this happen when deploying new versions). After some time (1 - 10 minutes) something weird happens.
  3. Symptoms start with a rise in overhead in the kernel function prepend_path, which can be seen with perf top (see the sketch after this list).
  4. Symptoms continue with an error like "no space left on device" when, after a container restart, Docker wants to unmount the container's volumes. It looks like the cause(?) but it is not; this error can show up after the prepend_path overhead.
  5. Starting almost any process becomes extremely slow; even opening an SSH session becomes impossible.
  6. Traced calls to prepend_path with SystemTap back to the systemd --user process (I don't see the relation here), and also a little bit from PID 1:
    $ stap -e 'probe kernel.function("prepend_path") { printf("Called by PID: %d   --- Exec: %s \n", pid(), execname()); }'
  7. Only a reboot heals the OS once it gets trapped in this state.
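
For anyone trying to reproduce this, here is roughly what I run (a sketch, not exact commands: the loop count and sleep are arbitrary, and the perf flags are just one way to slice the profile):

  # repeatedly restart the OVS containers (names as above) to trigger the state
  $ for i in $(seq 1 50); do docker restart openvswitch_db openvswitch_vswitchd; sleep 10; done
  # watch kernel-side overhead; prepend_path climbs toward the top of the profile
  $ perf top -e cycles:k --sort comm,symbol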

I have also tested this on different versions and it still happens. First I was testing an upgrade on staging, and then found out that the current version on production has the same issue. The overhead while monitoring perf sometimes goes over 80%, and it seems systemd --user is responsible for calling this function, which congests systemd and leaves it unable to start new processes(?). Meanwhile resources on the hypervisor are barely used at all, and restarting any other container multiple times does not cause such a thing.
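
To put a number on "systemd --user is responsible", this is roughly the SystemTap one-liner I use to aggregate prepend_path calls per process (a sketch; the 10-second interval and top-10 limit are arbitrary):

  # count prepend_path calls per process, print the top offenders every 10 seconds
  $ stap -e 'global c; probe kernel.function("prepend_path") { c[execname()]++ }
      probe timer.s(10) { foreach (name in c- limit 10) printf("%s: %d\n", name, c[name]); delete c }'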

As I have no further clue what this is or why it is happening, I thought maybe it is a problem with Open vSwitch's compatibility with its kernel module:

  1. Is there any documentation on which versions of Open vSwitch are tested with which kernels?
  2. Do you have any idea why this would happen, and how to trace it further down to the root cause?

@igsilya @ansisatteka

legitYosal commented 1 month ago

Ok, here we go,

Yesterday I traced the issue to system calls that systemd was making repeatedly, then traced those back to a file descriptor on /proc/*/mountinfo, and then boom: this file contained 30K lines of repeated mount points that Docker had never unmounted. Then I found the issue in moby: https://github.com/moby/moby/issues/48305
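
In case someone else hits this, this is roughly how I spotted it (a sketch from my box; tracing systemd's reads with strace is just one way to see it):

  # mountinfo kept growing: ~30K lines instead of a few hundred
  $ wc -l /proc/1/mountinfo
  # the same mount points show up over and over (field 5 is the mount point)
  $ awk '{print $5}' /proc/1/mountinfo | sort | uniq -c | sort -rn | head
  # systemd --user keeps re-reading mountinfo, which is where prepend_path gets hammered
  $ strace -f -e trace=openat -p "$(pgrep -f 'systemd --user' | head -n1)" 2>&1 | grep mountinfo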

I am closing this issue because it is not an OVS bug.