Open wijnanjo opened 3 months ago
Thanks for the report!
Is this in a custom image, or publicly available image?
Hopefully this should be resolvable by managing the system cert bundle in the container. Will keep an eye out if this seems to become a common issue.
It's a custom image based on almalinux:8. If there is anything I could try in order to reproduce it, just let me know. More details below:
Inside container, install pants and trigger the error:
[root@945709884f0a ~]# curl --proto '=https' --tlsv1.2 -fsSL https://static.pantsbuild.org/setup/get-pants.sh > get-pants.sh
[root@945709884f0a ~]# chmod +x get-pants.sh
[root@945709884f0a ~]# ./get-pants.sh --bin-dir /usr/bin
Downloading and installing the pants launcher ...
Installed the pants launcher from https://github.com/pantsbuild/scie-pants/releases/latest/download/scie-pants-linux-x86_64 to /usr/bin/pants
Running `pants` in a Pants-enabled repo will use the version of Pants configured for that repo.
In a repo not yet Pants-enabled, it will prompt you to set up Pants for that repo.
[root@945709884f0a ~]# PANTS_BOOTSTRAP_TOOLS=2 pants bootstrap-cache-key
No Pants configuration was found at or above /root.
Would you like to configure /root as a Pants project? (Y/n): y
Fetching latest stable Pants version since none is configured
Failed to determine release URL for Pants: 2.21.0: pants.2.21.0-cp39-linux_x86_64.pex: URL check failed: https://github.com/pantsbuild/pants/releases/download/release_2.21.0/pants.2.21.0-cp39-linux_x86_64.pex: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1129)>
If this is unexpected (you are using a known good Pants version), try upgrading scie-pants first.
It may also be that the platform linux_x86_64 isn't supported for this version of Pants, or some other intermittent network/service issue.
To get help, please visit: https://www.pantsbuild.org/community/getting-help
Error: Failed to establish atomic directory /root/.cache/nce/ab1acf935c4cc43338c604ae7d0f6aa2419f2415d94eb9cae381601dbba70a61/locks/configure-0ba7130ce931127c3acb9de7b10af42bc36f4aa17a1a92c577efd4252dbe6b1e. Population of work directory failed: Boot binding command failed: exit status: 1
Isolates your Pants from the elements.
Please select from the following boot commands:
<default> (when SCIE_BOOT is not set in the environment) Detects the current Pants installation and launches it.
bootstrap-tools Introspection tools for the Pants bootstrap process.
update Update scie-pants.
You can select a boot command by setting the SCIE_BOOT environment variable.
Cannot reproduce in REPL
[root@945709884f0a ~]# python3
Python 3.9.19 (main, May 30 2024, 13:03:52)
[GCC 8.5.0 20210514 (Red Hat 8.5.0-22)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib.request
>>> pex_url="https://github.com/pantsbuild/pants/releases/download/release_2.21.0/pants.2.21.0-cp39-linux_x86_64.pex"
>>> req = urllib.request.Request(pex_url, method="HEAD")
>>> TIMEOUT=10
>>> resp = urllib.request.urlopen(req, timeout=TIMEOUT)
>>> resp.code
200
Hmm, I wonder if this hinges on certain env vars being set? What happens if you try the above using env -i python3?
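One way to check the env-var theory is to print what each interpreter actually sees. The sketch below is only a diagnostic aid and not part of scie-pants: the function name ssl_env_report is made up, and it takes the environment variable names from ssl.get_default_verify_paths() instead of hard-coding them.

```python
import os
import ssl


def ssl_env_report() -> dict:
    """Collect the CA-related env vars and the paths this interpreter resolves."""
    paths = ssl.get_default_verify_paths()
    return {
        # Names of the override variables as reported by the ssl module itself.
        paths.openssl_cafile_env: os.environ.get(paths.openssl_cafile_env),
        paths.openssl_capath_env: os.environ.get(paths.openssl_capath_env),
        # Resolved locations; each is None when the path does not exist on disk.
        "cafile": paths.cafile,
        "capath": paths.capath,
    }


if __name__ == "__main__":
    for key, value in ssl_env_report().items():
        print(f"{key}={value!r}")
```

Running it once with python3, once with env -i python3, and once with the interpreter the launcher uses should show whether any difference comes from the environment or from the interpreter's compiled-in paths.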
Alright I did some investigation and you can reproduce the error in the public almalinux:8 image. The pants launcher runs the problematic python code in a PEX venv and it is in this environment that the certificate error pops up.
Steps to reproduce: run get-pants.sh and PANTS_BOOTSTRAP_TOOLS=2 pants bootstrap-cache-key. This triggers the error (we already know that), but we now have that PEX venv somewhere under ~/.cache/nce/...../bindings/pex_root/venvs/...
[root@1b63170a1b5f 327f457f25e419eeb572d4624fc6cd8ccb017a05]# pwd
/root/.cache/nce/ab1acf935c4cc43338c604ae7d0f6aa2419f2415d94eb9cae381601dbba70a61/bindings/pex_root/venvs/10065e08ffb3b880c0f2fc1126cc528a64b9e10b/327f457f25e419eeb572d4624fc6cd8ccb017a05
[root@1b63170a1b5f 327f457f25e419eeb572d4624fc6cd8ccb017a05]# source bin/activate
(tools.pex) [root@1b63170a1b5f 327f457f25e419eeb572d4624fc6cd8ccb017a05]# cat test.py
#!/bin/python
from urllib.request import urlopen
urlopen("https://github.com/pantsbuild/pants/releases/download/release_2.21.0/pants.2.21.0-cp39-linux_x86_64.pex")
(tools.pex) [root@1b63170a1b5f 327f457f25e419eeb572d4624fc6cd8ccb017a05]# python test.py
BOOM
a possible fix (note we're still in the venv)
python -m ensurepip --upgrade
pip3 install certifi
cat test2.py
#!/bin/python
from urllib.request import urlopen
import certifi
import ssl
certifi_context = ssl.create_default_context(cafile=certifi.where())
urlopen("https://github.com/pantsbuild/pants/releases/download/release_2.21.0/pants.2.21.0-cp39-linux_x86_64.pex", context=certifi_context)
# and now it works!
python test2.py
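To compare the default context against the certifi one without editing a script each time, the check can be wrapped in a small helper. This is a debugging sketch only: head_status is a made-up name, and I'm assuming a HEAD request like the one tried in the REPL earlier is close enough to what the launcher does.

```python
import ssl
import urllib.error
import urllib.request
from typing import Optional


def head_status(url: str, context: Optional[ssl.SSLContext] = None,
                timeout: float = 10.0) -> str:
    """Issue a HEAD request and describe the outcome as a short string."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout, context=context) as resp:
            return f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404.
        return f"HTTP {exc.code}"
    except urllib.error.URLError as exc:
        # Connection-level failures, including certificate verification errors.
        return f"failed: {exc.reason}"
```

Inside the venv, head_status(url) next to head_status(url, ssl.create_default_context(cafile=certifi.where())) should show the failure and the success side by side.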
Thanks for the follow up with repro and possible fix! 💯 This makes it much more actionable for us 👍🏽
Almalinux is in the Fedora family, and stores its CA certs in /etc/pki/tls. The distro python is patched/configured for this:
>>> ssl.get_default_verify_paths()
DefaultVerifyPaths(cafile='/etc/pki/tls/cert.pem', capath='/etc/pki/tls/certs', openssl_cafile_env='SSL_CERT_FILE', openssl_cafile='/etc/pki/tls/cert.pem', openssl_capath_env='SSL_CERT_DIR', openssl_capath='/etc/pki/tls/certs')
but the python interpreter we fetch is not:
>>> ssl.get_default_verify_paths()
DefaultVerifyPaths(cafile=None, capath='/etc/ssl/certs', openssl_cafile_env='SSL_CERT_FILE', openssl_cafile='/etc/ssl/cert.pem', openssl_capath_env='SSL_CERT_DIR', openssl_capath='/etc/ssl/certs')
/etc/ssl/certs contains /etc/ssl/certs/ca-bundle.crt (which is valid as a cafile) but doesn't have the individual certs that a capath would expect.
Manually setting the cafile for python looks to work, either with an envvar (SSL_CERT_FILE=/etc/ssl/certs/ca-bundle.crt ~/.local/bin/pants) or with the ssl context. I'm not sure where/when we would detect that this needs to be added.
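On the where/when question, one option would be to decide right before the SSL context is built. A rough sketch under assumptions: pick_ca_bundle, make_context and the CANDIDATE_BUNDLES list are hypothetical names, the first two candidate paths are the ones seen in this thread, and the Debian-style path is an extra guess added for illustration. None of this is existing scie-pants behavior.

```python
import os
import ssl
from typing import Optional, Sequence

# Hypothetical candidate list, checked in order.
CANDIDATE_BUNDLES = (
    "/etc/pki/tls/cert.pem",               # cafile reported by the distro python
    "/etc/ssl/certs/ca-bundle.crt",        # bundle that worked via SSL_CERT_FILE
    "/etc/ssl/certs/ca-certificates.crt",  # assumption: Debian/Ubuntu bundle
)


def pick_ca_bundle(default_cafile: Optional[str],
                   candidates: Sequence[str] = CANDIDATE_BUNDLES) -> Optional[str]:
    """Return a bundle to load explicitly, or None to keep interpreter defaults."""
    if default_cafile is not None:
        # The interpreter (or SSL_CERT_FILE) already points at an existing file.
        return None
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


def make_context() -> ssl.SSLContext:
    """Build a verifying context, falling back to a known bundle if needed."""
    # DefaultVerifyPaths.cafile is None when the resolved file does not exist.
    bundle = pick_ca_bundle(ssl.get_default_verify_paths().cafile)
    if bundle is None:
        return ssl.create_default_context()
    return ssl.create_default_context(cafile=bundle)
```

Because the check only triggers when the resolved cafile is missing, an explicit SSL_CERT_FILE pointing at a real file would still win. On systems where the default capath already holds hashed certs, the fallback is redundant but should be harmless.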
Our CI suddenly started failing with this error:
The same CI setup has been running successfully for a few months now. We run this in a container with python 3.9.18
CI starts by running get-pants.sh and then calculates a cache key using PANTS_BOOTSTRAP_TOOLS=2 pants bootstrap-cache-key, and that's where it fails, I guess on this line.
I tried to manually reproduce the error (via Python urllib, doing a HEAD request on the pex URL) and that just works.
My current workaround is to revert to the previous scie-pants version (get-pants --version 0.11.0).