jitsi / docker-jitsi-meet

Jitsi Meet on Docker
https://hub.docker.com/u/jitsi/
Apache License 2.0
3.06k stars 1.36k forks source link

LE logic bug on startup #1140

Open Cyborgscode opened 3 years ago

Cyborgscode commented 3 years ago

The following patch solves:

Additional Requirement: ( that is the main reason, why it's not a pull request)

openssl inside dockercontainer

File:

docker-jitsi-meet/web/rootfs/etc/cont-init.d/10-config

Diff:

jitsi-meet-10-config.diff.gz

Cyborgscode commented 3 years ago

Main Punch line:

EXPIRE="Certificate will expire"
if [ -x /config/acme-certs/$LETSENCRYPT_DOMAIN/fullchain.pem ];then 
    EXPIRE=$(openssl x509 -noout -in /config/acme-certs/$LETSENCRYPT_DOMAIN/fullchain.pem -checkend 0)
fi
if [ "$EXPIRE" == "Certificate will expire" ]; then 
          ... normal LE startupcode ...
saghul commented 3 years ago

Can you please make this an actual PR so we can discuss the code instead of pasting diff chunks here? It will be much more productive.

saghul commented 3 years ago

Now that I'm fully awake... not sure I understand why you need this. acme.sh already checks if a renewal is necessary.

Cyborgscode commented 3 years ago

There were two problems.

a) i had LE enabled for a subdomain and it worked for a longer periode in 2020->2021, then one day, LE counted activities on other subdomains of the same sld as an activity which included this jitsi subdomain as well and rate-limited it. From that point on, the docker container could no longer start. On every start it tried LE and failed and stopped. This would not happen, if the once created Certs had been checked. That docker container was regulary updated, but may no represent the code from today.

We had to switch and overwrite the files in /config/web/keys/cert.* with wildcard certs from the maindomain. (...and later reinstalled a new docker image.)

if you can ensure, that acme.sh really handles the cert validity check correctly, this patch is in deed not necessary. In this case, i would suggest to add a comment line for a third returncode of acme.sh , which explains the "third state" of valid, already present certs, because the code does not imply this to the naked eye.

b) the entire concept of checking those le certs or static cert.* does not work, if the config is freshly regenerated. The supplied fix does not handle this part, the admin's updatescript needs to handle this.

The LE rate limit section in the actual selfhosting guide gives a good impression on the problem, but should mention this explicit as a problem for the update process. Example for static keys:

docker-compose down git pull rm -fr ~/.jitsi-meet-cfg/ mkdir -p ~/.jitsi-meet-cfg/{web/letsencrypt,web/keys,transcripts,prosody,jicofo,jvb,jigasi,jibri} cp ~/jitsi-meet-certs/cert.* ~/.jitsi-meet-cfg/web/keys/ # example docker-compose up -d

saghul commented 3 years ago

a) i had LE enabled for a subdomain and it worked for a longer periode in 2020->2021, then one day, LE counted activities on other subdomains of the same sld as an activity which included this jitsi subdomain as well and rate-limited it. From that point on, the docker container could no longer start. On every start it tried LE and failed and stopped. This would not happen, if the once created Certs had been checked. That docker container was regulary updated, but may no represent the code from today.

We handle acme errors here: https://github.com/jitsi/docker-jitsi-meet/blob/76a16a8b78b309aa71ee295756fc39ca3fe80f25/web/rootfs/etc/cont-init.d/10-config#L32-L53 as you can see we only handle return codes of 0 or 1, if a cert is not due for renewal it will return 2: https://github.com/acmesh-official/acme.sh/blob/5b0d6a13759987bada1715c17f014b60dfd3c358/acme.sh#L87

There is, however, a corner case I think: let's say a cert is bound for renewal, but due to rate limiting we fail to do so (maybe the user was testing with some other subdomain that now counts towards that?). The operation will fail and the container won't start. This could be improved with your check, by checking that the cert is still valid, and the error can be skipped, since the renewal cron job will eventually run and properly renew it.

the entire concept of checking those le certs or static cert.* does not work, if the config is freshly regenerated. The supplied fix does not handle this part, the admin's updatescript needs to handle this.

The LE rate limit section in the actual selfhosting guide gives a good impression on the problem, but should mention this explicit as a problem for the update process. Example for static keys:

docker-compose down git pull rm -fr ~/.jitsi-meet-cfg/ mkdir -p ~/.jitsi-meet-cfg/{web/letsencrypt,web/keys,transcripts,prosody,jicofo,jvb,jigasi,jibri} cp ~/jitsi-meet-certs/cert.* ~/.jitsi-meet-cfg/web/keys/ # example docker-compose up -d

Not sure I understand the problem. When manual keys are used (something not really documented, on purpose) the user is responsible for managing such keys. There is not much we can do.

If you are already creating your keys elsewhere the recommendation has always been to do your own proxy and TLS offloading, and reach this setup via HTTP.

Cyborgscode commented 3 years ago

Your right, that it's not your concern if someone uses a static cert, because he wants to skip LE or has a commercial cert, but mentioning problems one will run into if one uses a feature "you" implemented, is always a good idea ;)

The LE auto-create feature was really a cool way to start with jitsi-meet, until the LE problems started. I just thought about the LE update process and maybe "0" seconds until recreating the cert maybe a bit short.

"-checkend 86400" may be more suited. On the other hand, if the patch is combined with the return of acme.sh rc 0/1, it would be possible to skip the "creation fail" exit, if the "previouse certs" have been copied aside, are still valid and can be reused again. In this, or the failure at all, case, a mail could be created to inform the admin ( LETSENCRYPT_EMAIL ) about the auto-create failure . This would really improve the situation.

From my experience yesterday, I also think, it's a good idea to log which certs are used on startup, so it's easier to debug things.

I would like to test and rewrite the scripts, but i have my doubts, that I'm able to recreate the docker image ( never done it ) and pimp it, as my base os is Fedora, not Debian/Ubuntu. So installing new apps into this image could be a problem. Any suggestions?