Just wanted to leave a quick bug report (which might be more of a documentation issue than a code issue). I noticed that by default the Caddyfile generated here uses Let's Encrypt / ACME to fetch an SSL certificate. However, ACME requires Caddy to complete a challenge to verify domain ownership. If the reverse proxy is not on a publicly accessible server, both of the main challenge types will fail: HTTP-01 fails because Let's Encrypt can't reach the server, and DNS-01 fails because the DNS record is internal-only.
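For anyone wanting to check this before deploying, here is a rough Python sketch of the HTTP-01 precondition: it tests whether the addresses a hostname resolves to are globally routable. The function names are my own and are not part of nextPYP or Caddy. It resolves names from the local machine's point of view, and a public address does not guarantee that a firewall lets Let's Encrypt through, so treat it only as a first-pass indicator.

```python
import ipaddress
import socket


def is_publicly_routable(address):
    """Return True if an IP address literal is globally routable."""
    # Drop any IPv6 zone suffix such as "%eth0" before parsing.
    return ipaddress.ip_address(address.split("%")[0]).is_global


def resolved_addresses(hostname):
    """Return the sorted set of addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def http01_looks_feasible(hostname):
    """Rough check (an assumption, not an ACME rule): every address is public."""
    addresses = resolved_addresses(hostname)
    return bool(addresses) and all(is_publicly_routable(a) for a in addresses)
```

For example, `is_publicly_routable("10.0.0.5")` returns `False`, which is the situation described above: the ACME server has no route to the reverse proxy.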
One workaround is to set the `tls` environment variable to `tls internal` so that Caddy uses a local CA, either by adding `export tls="tls internal"` to `/usr/bin/nextpyp-startrprox` or by using a systemd unit file drop-in to alter the environment variables. One could also modify the Caddyfile to use `tls <cert_file> <key_file>`, although this would only work if you already had a certificate issued by a recognized CA (e.g., InCommon).
With `tls internal`, however, pyp jobs will generally fail with errors like this:
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/envs/pyp/lib/python3.8/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/usr/local/envs/pyp/lib/python3.8/site-packages/urllib3/connectionpool.py", line 798, in urlopen
    retries = retries.increment(
  File "/usr/local/envs/pyp/lib/python3.8/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='nextpyp.semc.nysbc.org', port=443): Max retries exceeded with url: /pyp (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)')))
In this case, the jobs will only work if you use an insecure connection by updating `config.toml` to bypass the reverse proxy:
# URL of the web server, from a SLURM compute node's point of view
webhost = 'http://nextpyp.myinternaldomain.com:8080'
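To confirm from a compute node whether the proxy's certificate would be trusted, something like the following Python sketch may help. It attempts a verified TLS handshake using the system trust store and reports why it failed. `probe_tls` is a hypothetical helper of my own, not part of pyp. Note that `requests` normally verifies against the `certifi` bundle rather than the system store, so the result is only an approximation of what pyp itself will see.

```python
import socket
import ssl


def probe_tls(host, port=443, timeout=5.0):
    """Try a verified TLS handshake and return (ok, detail)."""
    # Default context: system trust store, hostname checking enabled.
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, tls.version()
    except ssl.SSLCertVerificationError as exc:
        # Raised when the issuer is unknown, e.g. Caddy's local CA.
        return False, "certificate not trusted: " + exc.verify_message
    except OSError as exc:
        # Covers refused connections, timeouts, DNS and other TLS errors.
        return False, "connection failed: " + str(exc)


if __name__ == "__main__":
    print(probe_tls("nextpyp.myinternaldomain.com"))
```

With `tls internal` and no extra trust configuration on the node, I would expect the "certificate not trusted" branch, matching the traceback above.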
My apologies, this issue appears to be documented here. I had read the 0.5.0 documentation earlier, which did not seem to discuss this particular scenario.
--John