Closed apyrgio closed 8 months ago
Looking more into it, we are now sure that when a container sends large amounts of data through its standard streams, faster than the client can consume, conmon
may receive EAGAIN
/ EWOULDBLOCK
. Due to a buggy implementation introduced in v2.0.23 and fixed in v2.0.26, this would make conmon
drop the connection (see containers/conmon#236).
The affected platforms and respective conmon
versions are:
conmon
v2.0.9, released in 2020-02-11)conmon
v2.0.25, released in 2021-01-21)For Ubuntu Focal, things are a bit more complicated. While conmon
is available through the Focal repo, Podman is not. Instead, we ask our uses to install Podman via a repo that OpenSUSE maintains (see https://github.com/freedomofpress/dangerzone/blob/main/INSTALL.md#ubuntu-debian), which installs conmon
v2.1.2. However, this particular combination is bitten by the timeout bug we alluded to at the start of the issue. Basically, if that conmon
version attempts to log data into the system's journal (see --log-driver
option), it will hang.
The timeout issue that affects Ubuntu Focal is easily bypassed. We can just use --log-driver none
when running Podman. This option exists in Docker as well, so it's worth using it in Windows and macOS too. After all, our standard streams will no longer have anything log-worthy.
The abrupt return issue though cannot be bypassed, to the best of my knowledge. The code path that erroneously sets the conmon
sockets to non-blocking is not affected by any meaningful option (see call path below):
The opt_bundle_path
that the code refers to is basically a container layer in the filesystem, so it's a critical option.
Therefore, we have two options:
conmon > 2.0.25
dependency in our .deb package. This will break existing Ubuntu Jammy / Debian Bullseye users, but we can ask them to install Podman through the OpenSUSE repo, same as we do for Ubuntu Focal.conmon
, either from Debian Bookworm, or from the OpenSUSE repo, into our FPF repo. This way, users will install a more recent conmon
version implicitly, without any interaction. On the flip side, we have to monitor the source repo for updates.Thanks a lot for this investigation and findings!
Backport conmon, either from Debian Bookworm, or from the OpenSUSE repo, into our FPF repo. This way, users will install a more recent conmon version implicitly, without any interaction. On the flip side, we have to monitor the source repo for updates.
This option sounds interesting. However, won't the podman be dependent on a specific conmon version already provided by the same repo? Because if this is the case we'd need to provide all the other dependencies in our repo such that our podman "take over" the user's distro one. However if the user uses podman for something else, this could lead to unintended consequences. And we'd also need to re-sign / rebuild these repos, which takes some maintenance work.
No, it's actually dependent on conmon (>= 2.0.18~)
as you can see from apt-cache show podman
. This means that a newer version tramps the one offered by the repos.
But yes, you're right about a) possibly unintended consequences, and b) resigning the packages. I don't like that as well.
There's another option that we can consider: Package conmon
v2.0.26 and include it in our Debian Bullseye / Ubuntu Jammy repo. This has the following benefits:
Sigh :face_exhaling:
While building the package, I just realized that this is a known bug in Debian Bullseye. The maintainers of the package are doing the same thing we are describing in this issue (https://github.com/freedomofpress/dangerzone/issues/685#issuecomment-1908007913). They have backported the fix from v2.0.26 into v2.0.25, and have created a new package called v2.0.25+ds1-1.1+deb11u1. This package has been accepted as a "proposed update" for Debian Bullseye in Sun, 12 Nov 2023.
The proposed package has not landed yet in Debian Bullseye, nor Ubuntu Jammy. We can build a Debian package out of it though and name it 2.0.25+ds1-1.1+deb11u1~fpf1. This way, I believe that the version order will be:
2.0.25+ds1-1.1 < 2.0.25+ds1-1.1+deb11u1~fpf1 < 2.0.25+ds1-1.1+deb11u1
where:
Edit: The original suggestion was to use Debian version 1.1+deb11u1
but this sorts higher than 1.1ubuntu<N>
, which is customary for Ubuntu. Since Debian Bullseye will have a patched conmon
package soon, we can focus on just the version for Ubuntu Jammy. We prefer 2.0.25+ds1-1.1a~fpf1, which sorts higher than the existing package, but lower than anything else:
2.0.25+ds1-1.1 < 2.0.25+ds1-1.1a~fpf1 < 2.0.25+ds1-1.1ubuntu<N> < 2.0.25+ds1-1.1+deb11u1
where:
Oh wow. What a timing... and this is an issue that has existed for more than a year. How convenient! I think you had a good hunch about that curiously named package in our call earlier today.
So this is actually good for us, no? We can use debian's package and just rename it (not sure if it uses the name inside the package or not) and re-sign it.
It adds a bit more confidence in what we're doing, so yeah, it's a good thing. I'll grab their source package, bump the changelog accordingly (the Debian version is taken from the changelog), and build it in an Ubuntu Jammy and Debian Bullseye container.
~Edit: they don't offer a .deb yet, if that's what you're asking @deeplow.~ Ouch, I stand corrected, see below.
I stand corrected: Debian has some repos called proposed-updates. For instance, there is an oldstable-proposed-updates
repo, which users can enable to install just conmon
:
/etc/apt/sources.list.d/oldstable-pu.list
deb http://deb.debian.org/debian/ oldstable-proposed-updates main
/etc/apt/preferences.d/oldstable-pu.pref
Package: *
Pin: release a=proposed-updates
Pin-Priority: 100
Package: conmon
Pin: release a=proposed-updates
Pin-Priority: 500
And if they run apt-cache policy conmon
, they will see that the candidate package to be installed is the new one (2.0.25+ds1-1.1+deb11u1
):
Installed: (none)
Candidate: 2.0.25+ds1-1.1+deb11u1
Version table:
2.0.25+ds1-1.1+deb11u1 500
100 http://deb.debian.org/debian oldstable-proposed-updates/main amd64 Packages
2.0.25+ds1-1.1 500
500 http://deb.debian.org/debian bullseye/main amd64 Packages
whereas for Podman, it's the old one, because we have lowered the priority of the proposed updates repo:
Installed: (none)
Candidate: 3.0.1+dfsg1-3+deb11u4
Version table:
3.0.1+dfsg1-3+deb11u5 100
100 http://deb.debian.org/debian oldstable-proposed-updates/main amd64 Packages
3.0.1+dfsg1-3+deb11u4 500
500 http://deb.debian.org/debian bullseye/main amd64 Packages
Nice. That means we can now see if our CI is fixed on https://github.com/freedomofpress/dangerzone/pull/627.
I've downloaded conmon
from the oldstable-proposed-updates
repo, and tested it in Ubuntu Jammy and Debian Bullseye. It seems to work!
I have a reservation regarding how this package was built. It was built in Debian Bullseye (proposed updates) which has:
libc6=2.31-13+deb11u7
libglib2.0-0=2.66.8-1+deb11u1
libsystemd0=247.3-7+deb11u4
We plan to install it in Ubuntu Jammy and Debian Bullseye though, which have:
Package | Jammy | Bullseye |
---|---|---|
libc6 | 2.35-0ubuntu3.6 | 2.31-13+deb11u7 |
libglib2.0-0 | 2.72.4-0ubuntu2.2 | 2.66.8-1 |
libsystemd0 | 249.11-0ubuntu3.12 | 247.3-7+deb11u4 |
As one can see, for Debian Bullseye, aside for a missing patch in libc6
/ libglib2.0-0
, there doesn't seem to be a compatibility issue. For Ubuntu Jammy, the library versions are higher than the ones the package was build for. For reference, this is the minimum version of the libraries that conmon
depends on:
Depends: libc6 (>= 2.17), libglib2.0-0 (>= 2.35.8), libsystemd0
Since libc6
and libglib2.0-0
are libraries that take backwards compatibility seriously, I believe that using a conmon
package that was linked with a previous version of these libraries should work. However, to be 100% safe, we can build a separate package, just for Ubuntu Jammy, and tag it as 2.0.25+ds1-1.1a~fpf1, for reasons I mentioned above.
@legoktm pointed out that the Debian 11.9 release will happen on February 10th, so it may not be strictly necessary to include the .deb
in our Debian Bullseye repo.
While working on issue #443 and PR #627, we have seen the following weird behavior (https://github.com/freedomofpress/dangerzone/pull/627#issuecomment-1892491968 and https://github.com/freedomofpress/dangerzone/pull/627#issuecomment-1892494187):
The platforms where this behavior is consistent are:
Specifically for the abrupt return bug, it seems that on the client side
read()
gets interrupted abruptly, and thus we raiseConverterProcException()
. Then we try to find the reason for the exception, and the exit code is actually0
! In that case, the Podman process does not exist, and the container is deleted as well.Interestingly, with
ps
we see that the underlying conversion process still runs. Withjournalctl -f
, we see that aconmon
component attempts to print binary data to the journal.