cockpit-project / cockpit

Cockpit is a web-based graphical interface for servers.
http://www.cockpit-project.org/
GNU Lesser General Public License v2.1

updates: Work around the "whilst offline" in Ubuntu & Debian #16963

Open garrett opened 2 years ago

garrett commented 2 years ago

The Updates page has an "emergent bug" between PackageKit, NetworkManager, and systemd-networkd.

As seen on

As our users will hit this bug on Ubuntu and Debian systems, we should provide a simple workaround.

This is not a bug in Cockpit, but this is a UX bug that affects Cockpit.

The solution is to create a fake network to make all of the subsystems happy:

nmcli con add type dummy con-name fake ifname fake0 ip4 1.2.3.4/24 gw4 1.2.3.1

We should add the workaround as a simple solution with a short description of the problem and the solution on the Software Updates page when this problem occurs.

If upstream fixes their systems to not trigger this issue, then it'll be transparently fixed, even with our workaround. Until then, a workaround like this will save a lot of people headaches.

garrett commented 2 years ago

We could have the workaround in Cockpit's Software Updates page where we augment the error to include the workaround, like this:

[Icon]

Network issues fetching updates

PackageKit believes your computer is offline. Either reload the page or apply a workaround(?) to make PackageKit communicate with NetworkManager.

[[Reload]] [Apply workaround]

The (?) would be a popover that says the following:

The workaround will add a fake network to your system using the following command:

nmcli con add type dummy con-name fake ifname fake0 ip4 1.2.3.4/24 gw4 1.2.3.1

Reload would be the default option; applying a workaround would be a secondary action.
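If an "Apply workaround" action were implemented, it would probably need to avoid adding the dummy connection twice. The following is a minimal sketch under that assumption, not Cockpit's actual implementation; `connection_missing` and `apply_workaround` are hypothetical helper names, while `fake` and `fake0` come from the command above.

```shell
# Sketch only. Reads connection names (one per line) on stdin, e.g. from
#   nmcli -t -f NAME connection show
# and succeeds (exit 0) only if no connection named "$1" is listed.
connection_missing() {
    ! grep -qxF -- "$1"
}

# Hypothetical wrapper for what an "Apply workaround" button might run.
# It is defined here but never called automatically.
apply_workaround() {
    if nmcli -t -f NAME connection show | connection_missing fake; then
        nmcli con add type dummy con-name fake ifname fake0 ip4 1.2.3.4/24 gw4 1.2.3.1
    fi
}
```

Keeping the name check separate from the `nmcli` call means the check can be exercised with canned input, without touching the network configuration.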

barry-luijten commented 2 years ago

Hi, I ran into this issue on Ubuntu 20.04, but creating the fake connection did not resolve the issue. In order to expose VMs on my network, I have created a bridge device. Might this cause the issue?

# nmcli device
DEVICE  TYPE      STATE      CONNECTION
br0     bridge    unmanaged  --
eno1    ethernet  unmanaged  --
lo      loopback  unmanaged  --

# nmcli conn show
NAME  UUID                                  TYPE   DEVICE
fake  36b1c3c4-33df-47d4-bece-cc5366759997  dummy  --

kinghat commented 2 years ago

Hi, I ran into this issue on Ubuntu 20.04, but creating the fake connection did not resolve the issue. In order to expose VMs on my network, I have created a bridge device. Might this cause the issue?

# nmcli device
DEVICE  TYPE      STATE      CONNECTION
br0     bridge    unmanaged  --
eno1    ethernet  unmanaged  --
lo      loopback  unmanaged  --

# nmcli conn show
NAME  UUID                                  TYPE   DEVICE
fake  36b1c3c4-33df-47d4-bece-cc5366759997  dummy  --

same situation for me:

$ nmcli device
DEVICE           TYPE      STATE      CONNECTION 
br-18eb9ea2416d  bridge    unmanaged  --         
...    
br0              bridge    unmanaged  --         
docker0          bridge    unmanaged  --         
lxdbr0           bridge    unmanaged  --         
eno1             ethernet  unmanaged  --         
veth027f88b      ethernet  unmanaged  --         
... 
lo               loopback  unmanaged  --         
tailscale0       tun       unmanaged  --         
$ nmcli connection show
NAME  UUID                                  TYPE   DEVICE 
fake  44b60328-3595-4434-aeb8-8414cfda5aca  dummy  --

garrett commented 2 years ago

@ItzMiracleOwO: BTW, I've opened up this more generic issue to better track this problem.

Others have tried the nmcli work-around with no success, but have more complicated networks. So it seems to work for some people, but not everyone.
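In both listings above, every device is reported as unmanaged and the `fake` connection has no device attached. A small sketch for spotting that situation, assuming the four-column `nmcli device` layout shown in those listings; `managed_devices` is a hypothetical helper name:

```shell
# Reads `nmcli device` output on stdin and prints the names of devices
# whose STATE column is anything other than "unmanaged".
# The header line and lines with fewer than three fields are skipped.
managed_devices() {
    awk 'NR > 1 && NF >= 3 && $3 != "unmanaged" { print $1 }'
}
```

Running `nmcli device | managed_devices` and getting no output would match the reports here: NetworkManager is running but manages none of the interfaces.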

garrett commented 2 years ago

@barry-luijten, @kinghat: Which version of Cockpit are you using? If you're not using backports to get Cockpit, it's a version that's 2 years old (version 215). It is my understanding that there were some fixes in a more recent version which may be needed for the nmcli workaround to work.

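Information on installing from backports:

* on Ubuntu: https://cockpit-project.org/running.html#ubuntu

* on Debian: https://cockpit-project.org/running.html#debian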

(The proper fix, of course, is for Debian and Ubuntu to fix their PackageKit integration with NetworkManager and systemd-networkd. I think Ubuntu inherits this bug from Debian, but either could fix the integration in their own distro. Unfortunately, there's not much we can do about it being broken. Hopefully we can find a workaround meanwhile?)

kinghat commented 2 years ago

@barry-luijten, @kinghat: Which version of Cockpit are you using? If you're not using backports to get Cockpit, it's a version that's 2 years old (version 215). It is my understanding that there were some fixes in a more recent version which may be needed for the nmcli workaround to work.

Information on installing from backports:

* on Ubuntu: https://cockpit-project.org/running.html#ubuntu

* on Debian: https://cockpit-project.org/running.html#debian

(The proper fix, of course, is for Debian and Ubuntu to fix their PackageKit integration with NetworkManager and systemd-networkd. I think Ubuntu inherits this bug from Debian, but either could fix the integration in their own distro. Unfortunately, there's not much we can do about it being broken. Hopefully we can find a workaround meanwhile?)

Cockpit is an interactive Linux server admin interface.
[Project website](https://cockpit-project.org/)

cockpit                 264-1~bpo20.04.1
cockpit-bridge          264-1~bpo20.04.1
cockpit-machines        260-1~bpo20.04.1
cockpit-networkmanager  264-1~bpo20.04.1
cockpit-packagekit      264-1~bpo20.04.1
cockpit-pcp             215-1
cockpit-storaged        264-1~bpo20.04.1
cockpit-system          264-1~bpo20.04.1
cockpit-ws              264-1~bpo20.04.1

garrett commented 2 years ago

BTW, here's the bug reported at Debian: https://bugs.kde.org/show_bug.cgi?id=407807 and here it is upstream at PackageKit: https://github.com/PackageKit/PackageKit/issues/336.

It looks like there's a PR that might fix the issue @ https://github.com/PackageKit/PackageKit/pull/506.

It's in PackageKit 1.2.5, which was released 13 days ago. Of course, it's not so likely this will go into an existing LTS distribution release like Debian Stable or Ubuntu LTS. But it's possible that the Stable/LTS versions might cherry-pick this patch.

However, it's marked as a critical bugfix @ https://bugs.launchpad.net/ubuntu/+source/packagekit/+bug/1961837

The supposed fix (I haven't tested it) is @ https://github.com/PackageKit/PackageKit/commit/f484808a1d0d14ef3aabcc7eb0a001f3e7830e7e.

It appears to add the following line to packagekit.service:

Wants=network-online.target

Please note that this bug can affect everything that uses PackageKit on Debian and Ubuntu — not just Cockpit. (Another example often given is KDE's Discover app.)

As it's pretty high-profile for anyone updating packages using PackageKit and it is a pretty simple fix (if that is the fix), then it's probably relatively easy to get the patch backported to Debian's and Ubuntu's stable versions of PackageKit. (Of course, this is way out of scope of Cockpit; that work would have to be done in the distros.)
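To see whether an installed PackageKit already carries that line, the unit text can be inspected. A minimal sketch; `wants_network_online` is a hypothetical helper name, and a match only shows that the line is present, not that it fixes the problem:

```shell
# Reads unit file text on stdin (for example from
#   systemctl cat packagekit.service
# ) and succeeds if a Wants= line lists network-online.target.
wants_network_online() {
    grep -Eq '^Wants=(.*[[:space:]])?network-online\.target([[:space:]]|$)'
}
```

For example: `systemctl cat packagekit.service | wants_network_online && echo "patch present"`.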

kinghat commented 2 years ago

However, it's marked as a critical bugfix @ https://bugs.launchpad.net/ubuntu/+source/packagekit/+bug/1961837

i might be confused but it looks like its fixed in Jammy, the next ubuntu lts release 22.04 :thinking:

martinpitt commented 2 years ago

I just commented on that packagekit change. I don't see how this would work -- this is either a hacky workaround, or (more likely) a no-op. If this works, I'd be very interested to get some confirmation about that.

garrett commented 2 years ago

i might be confused but it looks like its fixed in Jammy, the next ubuntu lts release 22.04 :thinking:

Yeah, I meant for current Stable / LTS releases, not future ones.

I don't see how this would work -- this is either a hacky workaround, or (more likely) a no-op. If this works, I'd be very interested to get some confirmation about that.

Yes, it seems too simple to me too. I guess we'll see? Hopefully it works (either with this patch or for other reasons). :four_leaf_clover:

divStar commented 2 years ago

So... Me experiencing the problem with Ubuntu 22.04 LTS and PackageKit 1.2.5 means it's not fixed yet? Though as mentioned in my bug report #17377, calling pkcon ... directly seems to always return the desired outcome (though admittedly I only tried a few calls so far).

I'd also like to add, that e.g. wanting to join a domain and needing to install realmd also results in a familiar error message ("Error: Cannot refresh cache whilst offline").

garrett commented 2 years ago

@divStar: Yes, it looks like Ubuntu (and Debian) still have the issue with their network stack and PackageKit. :disappointed:

The various workarounds listed above and in some of the linked pages are way too "hacky" to do something like this by default in Cockpit.

I don't know what a good resolution could be — this is a problem lower in the stack on parts that Cockpit depends upon, and it really needs to be fixed at that level. (However, it hasn't been, and this is causing a bad experience in Cockpit, and there's seemingly really nothing we can do about it, sadly.)

needing to install realmd also results in a familiar error message

Yep, that uses packagekit as well, so it's not much of a surprise.

But this does illustrate that the issue is quite a bit larger than just the updates page (and the hook that the overview page has into the software updates page which provides the update status in the overview page's health card).

divStar commented 2 years ago

@garrett I am still surprised though, because pkcon seems to return proper results. Isn't this - in one way or another - the program used to retrieve the package list?

martinpitt commented 2 years ago

@divStar : How exactly do you call pkcon? Can you try pkcon refresh force? That's using exactly the same API as Cockpit, so it would be a big surprise (and thus well worth looking into) if refresh force works, but the cockpit page fails. Thanks!

divStar commented 2 years ago

I was calling it from the CLI (ssh into the server). Sadly it didn't work, apparently for the same reason as Cockpit. But pkcon get-updates worked.

Here's the CLI output:

strange@kamar-taj:~$ sudo pkcon refresh force
[sudo] password for strange:
Refreshing cache              [=========================]
Loading cache                 [=========================]
Finished                      [=========================]
Fatal error: Cannot refresh cache whilst offline

I also tried it without sudo with the exact same result.
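A small sketch for checking this condition from a script, assuming the error text stays as quoted in this thread ("... whilst offline"); `is_offline_error` is a hypothetical helper name:

```shell
# Reads pkcon output on stdin and succeeds if it contains the
# "whilst offline" error text quoted in this thread.
is_offline_error() {
    grep -q 'whilst offline'
}
```

Usage would look like `pkcon refresh force 2>&1 | is_offline_error && echo "PackageKit believes the system is offline"`.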

Edit: major breakthrough, for me at least: disabling NetworkManager while keeping systemd-networkd enabled and properly configured for my network interface, then rebooting, seems to have fixed it, in that pkcon refresh force is now successful. I followed the hint in https://ubuntuforums.org/showthread.php?t=2463441

Edit2: Cockpit no longer reports a problem.

martinpitt commented 2 years ago

Right, get-updates only looks at the current local cache, so that works fine. The "am I online" check is only done when PK needs to actually download something. This is an impedance mismatch: PackageKit asks NetworkManager about the online status, but NetworkManager is not responsible for the primary interface (that is managed by netplan/networkd) and does not recognize it. IMHO it's a NM bug -- it should recognize that there is a default route, and be satisfied with that. Whether NM itself manages that default route should not matter.
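The mismatch described above can be made visible by comparing what NetworkManager reports with what the routing table contains. The following is only an illustration of that comparison, not how PackageKit performs its check; `classify_online_state` is a hypothetical helper name, and treating only the exact state `connected` as online is an assumption of this sketch.

```shell
# $1 = NetworkManager state, e.g. from: nmcli -t -f STATE general
# $2 = output of: ip route show default   (may be empty)
# Prints one word: consistent, mismatch, or offline.
classify_online_state() {
    nm_state=$1
    default_route=$2
    if [ "$nm_state" = "connected" ]; then
        echo consistent
    elif [ -n "$default_route" ]; then
        # A default route exists, but NetworkManager does not report "connected".
        echo mismatch
    else
        echo offline
    fi
}
```

On an affected machine, `classify_online_state "$(nmcli -t -f STATE general)" "$(ip route show default)"` would be expected to print `mismatch`.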

lfom commented 2 years ago

I have tried all the solutions from #8477, none worked. But the last comment pointed to the right direction. I installed Cockpit on a VPS that uses cloud-init, so all network interfaces are listed as "unmanaged" in nmcli, so I created a file:

$ cat /etc/NetworkManager/conf.d/10-globally-managed-devices.conf                                                                       
[keyfile]
unmanaged-devices=none

restarted NetworkManager and Cockpit worked fine. I checked all other files, and they were back to the initial config, and I also removed the "fake" network entry as well, so the only modification to the network settings was the one above.
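The steps described above could be scripted roughly as follows. This is a sketch under the assumption that the drop-in shown is the only change needed; `write_managed_devices_conf` is a hypothetical helper name, and the target directory is passed in so the function can be tried against a scratch directory first.

```shell
# Writes the drop-in shown above into the directory given as $1.
# On a real system that directory would be /etc/NetworkManager/conf.d
# (root required).
write_managed_devices_conf() {
    conf_dir=$1
    mkdir -p "$conf_dir" &&
    printf '[keyfile]\nunmanaged-devices=none\n' \
        > "$conf_dir/10-globally-managed-devices.conf"
}
```

On the server itself this would be followed by a restart, for example `write_managed_devices_conf /etc/NetworkManager/conf.d && systemctl restart NetworkManager`. Note that this lets NetworkManager manage all devices, which may not be wanted on every system.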

RobertoMaurizzi commented 2 years ago

@lfom's solution worked for me too: as for others, it was a server with a bridged eth0 -> br0 so that the VMs can have real network addresses. Thanks! 👍

Ludo-code commented 1 year ago

I have the same problem... I made one update from the Cockpit web interface, and since then I get the same error, and none of the workarounds described here work...

apt update and apt upgrade don't work anymore...

garrett commented 1 year ago

@Ludo-code: You're going to have to provide some information for anyone to help you out. Which versions of everything are you using? (Which Linux distribution? Which version of it? Which versions of Cockpit packages? What did you try doing? Etc.)

We have a FAQ about this issue on Ubuntu @ https://cockpit-project.org/faq.html#error-message-about-being-offline

Make sure you're running a currently supported version of your OS. If you're using Ubuntu or Debian, it's probably a good idea to try the version in backports instead, which is much more up to date than the default version Ubuntu and Debian ship.

Information about installing Cockpit using backports:

Ludo-code commented 1 year ago

Following your FAQ solved the problem, thanks.

Dale-CDL commented 5 months ago

I am using the systemd network manager. Do you know if the suggested workaround using a fake network manager still works for me?

DmitryYalchik commented 3 months ago

Ubuntu 22.04.4 LTS. Used sudo apt install -t jammy-backports cockpit

The first time, updating packages from Cockpit was okay, but later I began getting "Cannot download packages whilst offline" errors.