Open ratijas opened 3 years ago
Because head
works for you (thanks for that test), this is different from #1267.
By providing --entrypoint, you're skipping the step of changing user from root
to znc
. Perhaps znc
user doesn't have enough permissions to read one of directories in path to /etc/letsencrypt/live/ratijas.tk/fullchain.pem
?
You are goddamn right! (And I should have had some sleep before posting issues)
From now on, this issue will be split in two.
The first thing is, there are different permissions for different types of files under /etc/letsencrypt/archive/domain/
. The private key is the only one with most restrictive mode ower: rw, others: nothing.
$ # Inside znc:1.8.2 docker container
$ cd /etc/letsencrypt/archive/ratijas.tk
$ ls -lh *11*
-rw-r--r-- 1 root root 1.9K Nov 29 07:12 cert11.pem
-rw-r--r-- 1 root root 1.7K Nov 29 07:12 chain11.pem
-rw-r--r-- 1 root root 3.5K Nov 29 07:12 fullchain11.pem
-rw------- 1 root root 1.7K Nov 29 07:12 privkey11.pem
$ su-exec znc:znc head -n1 cert11.pem
-----BEGIN CERTIFICATE-----
$ su-exec znc:znc head -n1 privkey11.pem
head: privkey11.pem: Permission denied
Now, I wonder about two things:
nginx:nginx
. But it's an option, meaning it's there for our convenience, but not enabled by default.nginx:nginx
user.
me@server$ docker run --rm -it \
-v /etc/letsencrypt:/etc/letsencrypt:ro \
--workdir /etc/letsencrypt/archive/ratijas.tk \
--entrypoint=/bin/bash nginx:latest
root@docker$ head -n1 privkey11.pem
-----BEGIN PRIVATE KEY-----
root@docker$ su -s /bin/bash -g nginx nginx
nginx@docker$ id
uid=101(nginx) gid=101(nginx) groups=101(nginx)
nginx@docker$ head -n1 privkey11.pem
head: cannot open 'privkey11.pem' for reading: Permission denied
Here lies a problem. The code is so busy checking for file's mere "existence" that it forgot what it was supposed do.
Checking for file status before making an attempt to open it leads to inherent race conditions and poor control flow with redundant error checking. There are plenty of articles on this topic (just google for TOCTOU), I don't have to repeat myself, but the short of it is this:
Checking for file status before making an attempt to open it leads to inherent race conditions and poor control flow with redundant error checking.
Well, we can just drop the check for that file's existence. This replaces the error from "ZNC refuses to start since it cannot access the SSL certificate" with "ZNC closes all incoming SSL connections without answering the handshake and no clear error message anywhere". The socket library that ZNC uses loads the certificate for each incoming connection.
(Well, if you run ZNC in --debug
mode, one of these messages become visible: https://github.com/jimloco/Csocket/blob/e8d9e0bb248c521c2c7fa01e1c6a116d929c41b4/Csocket.cc#L1858-L1884)
(Oh and while we are talking about "do not use access
": Feel free to open an issue on Csocket: https://github.com/jimloco/Csocket/blob/e8d9e0bb248c521c2c7fa01e1c6a116d929c41b4/Csocket.cc#L1842-L1854)
Also, this is not a security check. We are not checking the file's owner or doing magic. This is just a convenience feature / quality of life thing. If you read the wikipedia article for TOCTOU, it all sounds quite security-centric.
This replaces the error from "ZNC refuses to start since it cannot access the SSL certificate" with "ZNC closes all incoming SSL connections without answering the handshake and no clear error message anywhere".
What are you talking about? In this case instead of reporting the real underlying problem of opening a file, it prints custom misleading error message about file purportedly being not found.
This check does nothing right. If the point is to verify that https/SSL is set up correctly — how about doing self-test GET request? It would prove much more than simply file being found or not.
If you read the wikipedia article for TOCTOU, it all sounds quite security-centric.
"Sounds"? I am not officially qualified to make comments on security either. Are you 100% sure it can not be misused (apart from already masking real errors, that is)?
Let's see. When the code takes on "Unable to locate pem file" branch, it then may attempt to prompt user with message "Would you like to create a new pem file?". If the answer is "yes", program would attempt to write to that same pem file as returned by the same CZNC::GetPemLocation()
function. Let's say, what would happen if at that exact moment certbot on the host triggered by a periodic systemd timer updates certificates and re-creates links? I don't know how certbot works — it may replace soft links atomically, it may not. What it ZNC creates it's PEM file at exact moment between deletion and creation? I totally don't know. Sounds like an undefined behavior to me.
What would be the downsides of running ZNC as root in docker?
That's a security risk? It's easier to escape the container as root than as non-root.
it prints custom misleading error message about file purportedly being not found.
I agree, the message could be improved to tell what exactly went wrong.
This check does nothing right.
PRs with fixes are welcome.
how about doing self-test GET request?
That could work. I'm not sure it worth though. Does nginx do anything like that?
Again, we could just remove the check. But currently ZNC reads the key upon accepting a connection, and the current error message is better than failing to start handshake, because that's harder to debug. The proper fix would be to read the file on startup, and provide a way for certbot to trigger re-reading it.
Are you 100% sure it can not be misused (apart from already masking real errors, that is)?
Correct.
If the answer is "yes", program would attempt to write to that same pem file as returned by the same CZNC::GetPemLocation() function.
If that file doesn't... khem... exist due to its location being inaccessible, then ZNC won't be able to write to it either.
That could work. I'm not sure it worth though. Does nginx do anything like that?
Nginx reads certificate & keys once. And only reloads them on configuration change (signal nginx -s reload
). It doesn't just check for file existence, but actually loads the keys. So if that went smoothly, I'm sure it will be alright from that point onward.
Again, we could just remove the check. But currently ZNC reads the key upon accepting a connection, and the current error message is better than failing to start handshake, because that's harder to debug. The proper fix would be to read the file on startup, and provide a way for certbot to trigger re-reading it.
Yeah, that's kinda wasteful. Definitely something to improve on.
Are you 100% sure it can not be misused (apart from already masking real errors, that is)?
Correct.
How may I verify this statement for correctness?
If that file doesn't... khem... exist due to its location being inaccessible, then ZNC won't be able to write to it either.
Think carefully, think twice.
Think carefully, think twice.
Could you be less aggressive, please?
Step 3.5 in your scenario: user clicks Yes on that prompt. Does the user not know that he configured certbot before? If user doesn't know what he's doing, we can't protect from that. User has access to run ZNC and/or to overwrite any files he pleases.
Anyway, this is dupe of #1215 - when ZNC gains ability to read the cert once, on startup, it won't need such separate check on startup.
Yeah, sorry. I'm just feeling terribly annoyed, because the first step toward a solution is acknowledging the problem at hands, which is like... not happening here.
Does the user not know that he configured certbot before?
First, I don't ever assume that users knows what they are doing. Second, they might know, and might not; they might even not be administrators (znc is designed to be run as non-root anyway). I, for example, share my server with friends to run some web stuff like znc, and I am the one doing DNS and certificates management — I certainly wouldn't wanna let them mess it up even by pure accident, i.e. without malicious intentions.
Anyway, this is dupe of #1215 - when ZNC gains ability to read the cert once, on startup, it won't need such separate check on startup.
Not much of a duplicate, as I've raised totally different concerns here. Although a good solution might resolve both issues at once. Thanks for pointing out, I'll take a look at what can be done there as well.
Closing this means that (1) an error message became adequate, (2) race conditions are eliminated. Possibly, (3) an option to run ZNC as root or otherwise integrate nicely with letsencrypt/certbot out-of-the-box. For option (3) updating some documentation on ZNC wiki wouldn't hurt.
Thus, please, consider reopening.
This issue seems to only describe half of what could be the problem. Under OpenBSD 6.8 I was experiencing the following scenario:
I placed the cert into the .znc
directory and added it to the config:
SSLCertFile = ~/.znc/cert1.pem
znc kept nagging that it wasn't able to locate the pem file, even though the same user it was running under could cat
the file. At some point I didn't know what to do and simply pressed y
, to let znc create the pem. This was what it resulted in:
Turns out that in my config I had a whitespace at the end of the certificate name. I kid you not. My config was SSLCertFile = ~/.znc/cert1.pem
instead of SSLCertFile = ~/.znc/cert1.pem
. Hence znc wasn't unable to find the pem and so it created a file that had a whitespace at the end of its name.
Summary
After splitting config with single pem into separate key/dhparam/certificate files (or rather, leaving them as is provided by letsencrypt/certbot), mounting them on a docker volume (local bind to
/etc/letsencrypt), and upgrading to 1.8.2 (presumably from 1.7.x — something tagged
latest` from 2 years earlier)... official Docker image of znc failed to start.And when running manually with equivalent set of options, znc presents interactive dialog:
However, the pem file sure exists. It can be opened and read, as demonstrated below.
Version
Docker:
ZNC:
znc:1.8.2
Related
Issue #1267 looks exactly like mine, but it was closed for reasons either not explained, or the ones I gon't understand.