Closed w4tsn closed 3 years ago
OK, after working with this a bit more I think the problem is actually caused by my image creation process. When preparing the OS image I forgot several SELinux labels on /sysroot and especially /sysroot/ostree, which means that all files and folders effectively had root_t as their type. After setting the labels as it is done in the Fedora IoT image, I now get a straight "Error: Permission denied" and an AVC denial on a lock create op.
AVC avc: denied { create } for pid=2568 comm="rpm-ostree" name="lock" scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:usr_t:s0 tclass=file permissive=0
That's as far as I got. I'm now trying to figure out where I forgot to set labels. Unfortunately setfiles reports that there are no default labels for /ostree, so I can't just do restorecon.
Is there any documentation on what the labels should look like?
Apart from that I suppose that this is solely my personal problem. The only thing I could take away from this for rpm-ostree might be that the error messages are very misleading and lack a certain expressiveness. But then again it's debatable if a screwed up system, like mine apparently is, should be considered in error handling and reporting anyway.
EDIT:
After setting SELinux to permissive mode the SSL certificate error returned, so I suspect I just screwed up the file contexts even more and since I did not reboot in the meantime maybe processes had started / run with wrong process contexts or something. So I'm eventually back at the problem that the mTLS remote does not work in rpm-ostree while it does in ostree.
EDIT 1:
I was able to verify that the mTLS remote / configuration works on an older system with rpm-ostree 2020.5. This system started as Fedora IoT and was rebased onto my custom OSTree. Next I'll flash the latest Fedora IoT and try to use the mTLS remote to rebase onto my custom OSTree. If this is indeed rpm-ostree related in any way it should fail.
EDIT 2:
So I have now set up a stock Fedora IoT 33 install based on the raw image and added my mTLS remote to it. rpm-ostree rebase on that remote always returns an error: Problem with the local SSL certificate, while this works at least with rpm-ostree 2020.05. Also, ostree itself has no problem interacting with that remote. I'm able to do a successful rpm-ostree rebase if I "manually" pull the refs with ostree pull first. I'm now pretty sure that this is a bug of some kind in rpm-ostree, introduced somewhere after 2020.05.
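For anyone hitting the same thing, the pull-first workaround described above could be scripted roughly like this. It is only a sketch: the helper names pull_then_rebase and run, and the DRY_RUN switch, are made up for illustration, and the remote and ref are whatever your own setup uses.

```shell
# Hypothetical helper: fetch the ref with plain ostree first, then let
# rpm-ostree rebase onto it. Set DRY_RUN=1 to print the commands only.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

pull_then_rebase() {
    remote="$1"   # name of the configured mTLS remote
    ref="$2"      # ref to rebase onto
    # The rebase is only attempted if the pull succeeded.
    run ostree pull "$remote" "$ref" &&
        run rpm-ostree rebase "$remote:$ref"
}

# Example (both names are placeholders):
# DRY_RUN=1 pull_then_rebase myremote my/custom/ref
```

This only papers over the failure; it does not explain why the daemon cannot use the client certificate on its own.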
This part of the code is pure ostree; rpm-ostree basically defers to libostree for all HTTP requests there. It could be a regression there, but it's much more likely IMO to be libcurl related; or possibly openssl. I'd try using e.g. rpm-ostree usroverlay combined with directly running rpm -ivh --force on different libcurl builds.
Have you looked at what version of TLS is being negotiated? If you can get a packet trace (eliding certificates) that might help.
Looks like we never added unit/CI tests for tls-client-* to libostree =/
Can you try e.g. attaching strace -f -o /tmp/strace-rpmostree.log -s 2048 -p $(systemctl show -p MainPID rpm-ostreed | cut -f 2 -d =) and looking at the end of /tmp/strace-rpmostree.log? (Don't paste the full thing here as it's likely to contain certificate material; but it'd be useful to know if we're e.g. getting EPERM from open() or something else)
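To avoid scrolling through (or pasting) the whole log, the failed open calls can be filtered out of it. A minimal sketch, assuming the log was written with strace -f -o as above so each line starts with a PID; the function name failed_opens is made up here.

```shell
# Print only open()/openat() calls that failed, i.e. returned -1 with an errno.
# This shows paths and error codes, not file contents.
failed_opens() {
    log="$1"
    grep -E '(^|[[:space:]])(open|openat)\(.*\) = -1 E[A-Z0-9]+' "$log"
}

# Example: the last few failures before the error was reported.
# failed_opens /tmp/strace-rpmostree.log | tail -n 5
```

An ENOENT or EACCES on the certificate or key path right before the error would point at file access rather than TLS negotiation.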
It looks like the process has problems accessing the cert file:
openat(AT_FDCWD, "/root/.cert/gateway.crt", O_RDONLY) = -1 ENOENT (No such file or directory)
Directly after this, the above-mentioned error message is emitted.
Apparently the file is there and is correctly picked up by curl and ostree:
# ls -laZ /root/.cert
-rw-r-----. 1 root 1001 system_u:object_r:home_cert_t:s0 1334 Apr 1 16:24 gateway.crt
-rw-r--r--. 1 root root system_u:object_r:home_cert_t:s0 1058 Apr 1 15:12 gateway.csr
-rw-------. 1 root root system_u:object_r:home_cert_t:s0 1675 Apr 1 15:12 gateway.key
Could it be that rpm-ostree and SELinux are the root cause? Could there be some access control problem here?
EDIT: SELinux is unlikely because the audit log is clean and setenforce 0 does not help either.
It's likely https://github.com/coreos/rpm-ostree/commit/341ec7d0446a0505d5a4e1747c2283d40ca4823b
The more correct thing here is to store those keys in /etc, not /root (aka /var/roothome). You could put them in /etc/ostree/keys for example.
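A rough sketch of that move, assuming the file names from the listing above (gateway.crt, gateway.key) and the standard libostree remote options tls-client-cert-path and tls-client-key-path. The function name install_remote_keys is hypothetical, and the printed lines still have to be placed in the remote's config section by hand.

```shell
# Copy the client cert and key out of the home directory and print the
# matching remote options for the new location.
install_remote_keys() {
    src="$1"    # e.g. /root/.cert
    dest="$2"   # e.g. /etc/ostree/keys
    mkdir -p "$dest" || return 1
    chmod 700 "$dest" || return 1
    cp -- "$src/gateway.crt" "$src/gateway.key" "$dest/" || return 1
    # Keep the private key readable by root only.
    chmod 600 "$dest/gateway.key" || return 1
    printf 'tls-client-cert-path=%s/gateway.crt\n' "$dest"
    printf 'tls-client-key-path=%s/gateway.key\n' "$dest"
}

# Example:
# install_remote_keys /root/.cert /etc/ostree/keys
```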
Another short term workaround is to paste this into systemctl edit rpm-ostreed:
[Service]
ProtectHome=no
Now, I do want to avoid regressions...if you argue strongly for it we can consider reverting. But I'd really like not to :smile:
(We could weaken this to ProtectHome=read-only for example)
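On devices provisioned without an interactive editor, the same override can be written as a drop-in file instead. This is a sketch only: it assumes the default path that systemctl edit uses for overrides (/etc/systemd/system/rpm-ostreed.service.d/override.conf), and the function name and its two optional arguments are made up for illustration.

```shell
# Write a ProtectHome override for rpm-ostreed as a systemd drop-in.
#   $1: optional root prefix (empty means the live system)
#   $2: ProtectHome value, "no" by default; "read-only" is the weaker variant
write_protecthome_dropin() {
    root="${1:-}"
    value="${2:-no}"
    dir="$root/etc/systemd/system/rpm-ostreed.service.d"
    mkdir -p "$dir" || return 1
    printf '[Service]\nProtectHome=%s\n' "$value" > "$dir/override.conf"
}

# Example, on the running system:
# write_protecthome_dropin "" read-only &&
#     systemctl daemon-reload && systemctl restart rpm-ostreed
```

Moving the keys out of the home directory remains the cleaner fix; the override just restores the old behaviour.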
Ohhh. Well that makes sense. I suppose it's a good thing to use such protective measures, and no, I don't have strong arguments against it. I'll have to update our firmware and write a migration but that's hardly a strong argument (for anyone else but me at least :D)
Well that's that I suppose. Thanks for helping me figure this out, much appreciated.
So I'm pretty sure now that this is a bug in rpm-ostree. I'm using an mTLS remote which does not work in current releases of rpm-ostree. The error is error: While pulling ref: While fetching https://foo/repo/summary.sig: [58] Problem with the local SSL certificate.
Host system details
Expected vs actual behavior
With an mTLS remote configured:
Expected:
Steps to reproduce it
Example mTLS based remote config:
Possible pitfalls
The problem now could still be related to TLS specific stuff. Maybe the CA cert is malformed (which works in ostree and curl but not in rpm-ostree; unlikely but possible). Maybe the client cert is malformed. Maybe a specific algorithm or TLS version causes issues. Those are also things to explore.
System info - non-working vs. working
Non-working:
Working:
Would you like to work on the issue?
I'm not confident enough to work on this just yet, if that's indeed a bug in rpm-ostree.