Closed: Bronislawsky closed this issue 2 years ago
_From a fresh install again, here is what I notice.
I have this error:
2020-12-11 20:40:00 (170 MB/s) - ‘/tmp/default’ saved [22330/22330]
chown: cannot access '/etc/nginx/sites-available/default': No such file or directory
Restarting nginx (via systemctl): nginx.serviceJob for nginx.service failed because the control process exited with error code.
See "systemctl status nginx.service" and "journalctl -xe" for details.
failed!
Installing (or renewing) free SSL certs from OpenSSL and Certbot...
Generating a RSA private key
.............................+++++
...................+++++
writing new private key to '/etc/ssl/nginx.key'
The cert fails because nginx isn't started.
After a reboot, when I ran ./ss-encrypt, I got this error again:
IMPORTANT NOTES:
The following errors were reported by the server:
Domain: DUMMY.xyz Type: connection Detail: Fetching https://DUMMY.xyz/.well-known/acme-challenge/jliZhyclnSxt9jY2Sut32ucawAebOE-ZNsF7v-1ra9w: Too many redirects
Domain: www.DUMMY.xyz Type: connection Detail: Fetching https://DUMMYxyz/.well-known/acme-challenge/K64u7vQypY1-KEkDLc3frkdpqDJ_wXR5YRUIAodo2kI: Too many redirects
I will run the script line by line and check where it fails....after I get some sleep._
I have found out why it failed.
Certbot prefers IPv6, so when AAAA records are set it uses them, and I guess something is wrong in the nginx IPv6 redirect that creates an endless loop, which leads to the failure.
Anyway, deleting the AAAA records fixed it (I believe; I haven't been able to try a real request because of too many requests, but --dry-run works).
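For anyone wanting to check whether a domain publishes AAAA records before requesting a certificate, a minimal sketch follows. It assumes `dig` is installed; the function name `has_aaaa_record` is made up for illustration and is not part of SlickStack.

```shell
#!/bin/sh
# Hypothetical helper (not part of SlickStack): succeeds when the given
# domain publishes at least one AAAA (IPv6) record. Assumes `dig` is installed.
has_aaaa_record() {
    # `dig +short AAAA` prints one record per line, or nothing at all.
    # Matching on ':' means CNAME targets in the output don't count as addresses.
    dig +short AAAA "$1" | grep -q ':'
}
```

Usage might look like `has_aaaa_record DUMMY.xyz && echo "AAAA record present"`. If a record is present, validation will probably arrive over IPv6, so it seems worth testing with `certbot certonly --dry-run --webroot -w /var/www/html -d DUMMY.xyz` before making a real request.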
In /etc/nginx/sites-available/DOMAIN.xyz, uncommenting this line
listen [::]:443 ssl http2 ipv6only=on;
makes IPv6 respond without HTTP code 301.
I don't know why it was commented out, but that seems to solve the cert issuing problem with IPv6.
Using the webroot path /var/www/html for all unmatched domains.
Waiting for verification...
Cleaning up challenges
IMPORTANT NOTES:
Thanks for your research and reporting @Bronislawsky
Several days ago we changed from using the default server block file to using explicit server block names, e.g. example.com, staging.example.com and dev.example.com. Immediately afterward, I saw fatal errors with IPv6 and proceeded to comment out those same lines from our Nginx server block boilerplates that you noticed were problematic.
Certbot prefers IPv6, so when AAAA records are set it uses them, and I guess something is wrong in the nginx IPv6 redirect that creates an endless loop, which leads to the failure. Anyway, deleting the AAAA records fixed it (I believe; I haven't been able to try a real request because of too many requests, but --dry-run works).
Interesting, I didn't know that Certbot prefers IPv6, that seems strange but great job discovering this!
Anyway, I don't know why IPv6 was causing fatal errors on the server that I tested. It was an "old" SS installation that was recently updated via ss-update and then ss-install-nginx using the new boilerplates.
Maybe for fresh installs, that conflict can't be replicated, I'm not sure. It also could have been a fluke case...
Also, on most sites I turn off dev & staging:
STAGING_SITE="false" DEV_SITE="false"
Would it be possible not to run Certbot on staging and dev when they are set to false? I believe that if DNS isn't pointing to the box, this will fail and count as 2 failures, which reduces the weekly limit for that particular domain, which I believe is 50 per week.
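A rough sketch of what that could look like, not taken from the SlickStack scripts: `STAGING_SITE` and `DEV_SITE` are the options quoted above, while `SITE_DOMAIN`, the `staging.`/`dev.` subdomain pattern, and the function name `certbot_domains` are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: print the hostnames Certbot should be asked to cover.
# STAGING_SITE and DEV_SITE are the options quoted above; SITE_DOMAIN and the
# staging./dev. subdomain pattern are assumptions about the setup.
certbot_domains() {
    # Production hostnames are always included.
    domains="${SITE_DOMAIN} www.${SITE_DOMAIN}"
    # Only add staging/dev when they are explicitly enabled.
    if [ "${STAGING_SITE}" = "true" ]; then
        domains="${domains} staging.${SITE_DOMAIN}"
    fi
    if [ "${DEV_SITE}" = "true" ]; then
        domains="${domains} dev.${SITE_DOMAIN}"
    fi
    echo "${domains}"
}
```

The resulting list could then be turned into `-d` arguments for Certbot, so that disabled subdomains never trigger a validation attempt.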
I think there are 2 different issues here. The first is that Nginx had fatal errors with IPv6 enabled, but this is possibly related to the fact it was labeled a "default" server, but the old "default" block hadn't been deleted yet from my test server.
The other issue seems to be Certbot verifying the server properly over IPv6. I'm not sure if this is actually broken or not yet as I haven't had time to test... perhaps it is working fine if the former Nginx issue is addressed...
Alright, so after some quick tests, it was ipv6only=on on the new staging and dev server blocks that was causing Nginx to fail, even though the production server block seemed to work fine with that setting.
I might be a few years behind on that feature as it seems from my few minutes of research that Nginx might have changed the functionality of that snippet in the past few years or something...
For now I've removed ipv6only=on from all SlickStack server blocks, so the conflict should be resolved for now. I'll research more when I have time and post back on this thread.
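To confirm that no server block still sets the option after an update, something like the sketch below might help. The function name `find_ipv6only` is hypothetical, and the default directory is simply the sites-available path mentioned earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: list every line in a directory of server block files
# that still sets ipv6only. The default path is the sites-available directory
# mentioned earlier in this thread; pass another directory to override it.
find_ipv6only() {
    dir="${1:-/etc/nginx/sites-available}"
    # -r: recurse, -n: show line numbers; exit status is 1 when nothing matches.
    grep -rn 'ipv6only' "$dir"
}
```

For example, `find_ipv6only || echo "no ipv6only settings left"`, followed by `nginx -t` to validate the configuration before restarting Nginx.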
Future readers can pretty much ignore my earlier comments on this Issue, here is what matters:
ipv6only was enabled by default if you are using single-line syntax in your listen directive
[...] ipv6only whatsoever
On top of all this, I found that we should add default_server to the IPv6 listen directives in our "catch-all" server block, since previously we only defined that on the IPv4 listen directives, now it's like this:
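server {
    # IPv4 catch-all listeners
    listen 80 default_server;
    listen 443 ssl default_server;
    # IPv6 catch-all listeners (default_server now set here too)
    listen [::]:80 default_server;
    listen [::]:443 ssl default_server;
    server_name _;
    # redirect any unmatched hostname to the main site
    return 301 https://@SITE_DOMAIN$request_uri;
}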
Ref: https://github.com/littlebizzy/slickstack/commit/ebcfe0f96c041e9b1f4dc169737ec374e0e238b7
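One way to sanity-check a catch-all block like this is to compare the status code returned over IPv4 and IPv6. The sketch below assumes `curl` is installed; the function name `http_status` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the HTTP status code a URL returns over a given
# IP family ("4" or "6"). Assumes curl is installed; redirects are not followed.
http_status() {
    family="$1"
    url="$2"
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code.
    curl "-${family}" -s -o /dev/null -w '%{http_code}' "$url"
}
```

Requesting the server by an unmatched hostname or its bare IP address, e.g. `http_status 4 http://SERVER_IP/` and the IPv6 equivalent with `http_status 6`, should presumably print 301 in both cases if the catch-all is answering on both address families.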
Even until this day I see tons of questions and blog posts about these settings, so apparently the entire world has been sufficiently confused by them, not just us! That makes me feel slightly better, but we were behind the times...
Ref: https://serverfault.com/questions/512054/globally-setting-ipv6only-off
Ref: https://serverfault.com/questions/578648/properly-setting-up-a-default-nginx-server-for-https
On a side note, SlickStack has still been having trouble with Certbot on fresh installations, requiring users to run ss-install twice instead of once... I'm not sure if this is related.
Certbot does prefer IPv6 if those DNS records exist, but if they don't exist, this problem still seems to happen. I'm wondering if their software scans the server for IPv6 or something which caused our Nginx "catch-all" to fail previously or something, but even still the OpenSSL cert should already exist and be active before that... worth mentioning though.
From a fresh install:
There is something wrong with the Certbot / Let's Encrypt symlink.
I manually set SSL_TYPE to certbot because the wizard doesn't set it, then I re-ran ss-install.
Within nginx.conf it also doesn't seem right, as it doesn't point to any letsencrypt files.