Open vingar opened 2 weeks ago
do you run fetch-crl?
Vincent:
$ cat rest.py
#!/usr/bin/env python3
import os
import requests
id = "32485037-df6d-4c96-ab79-c409e0e2f238"
url = f'https://cmsdcatape.fnal.gov:3880/api/v1/tape/stage/{id}'
headers = {'Content-Type': 'application/json'}
uid = os.getuid()
cert_path = f'/tmp/x509up_u{uid}'
response = requests.get(url, headers=headers, cert=(cert_path, cert_path), verify='/etc/grid-security/certificates')
print(response.text)
print(response)
running without voms proxy:
$ python3 rest.py
Traceback (most recent call last):
File "/home/litvinse/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 710, in urlopen
chunked=chunked,
File "/home/litvinse/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 386, in _make_request
self._validate_conn(conn)
File "/home/litvinse/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
conn.connect()
File "/home/litvinse/.local/lib/python3.6/site-packages/urllib3/connection.py", line 424, in connect
tls_in_tls=tls_in_tls,
File "/home/litvinse/.local/lib/python3.6/site-packages/urllib3/util/ssl_.py", line 450, in ssl_wrap_socket
sock, context, tls_in_tls, server_hostname=server_hostname
File "/home/litvinse/.local/lib/python3.6/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
File "/usr/lib64/python3.6/ssl.py", line 365, in wrap_socket
_context=self, _session=session)
File "/usr/lib64/python3.6/ssl.py", line 776, in __init__
self.do_handshake()
File "/usr/lib64/python3.6/ssl.py", line 1036, in do_handshake
self._sslobj.do_handshake()
File "/usr/lib64/python3.6/ssl.py", line 648, in do_handshake
self._sslobj.do_handshake()
ssl.SSLError: [SSL: SSLV3_ALERT_CERTIFICATE_UNKNOWN] sslv3 alert certificate unknown (_ssl.c:877)
running with voms proxy:
$ voms-proxy-info
subject : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Dmitry Litvintsev/CN=UID:litvinse/CN=4175574056
issuer : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Dmitry Litvintsev/CN=UID:litvinse
identity : /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Dmitry Litvintsev/CN=UID:litvinse
type : RFC compliant proxy
strength : 2048 bits
path : /tmp/x509up_u8637
timeleft : 119:58:03
$ python3 rest.py
{"detail":"request 32485037-df6d-4c96-ab79-c409e0e2f238 not found","title":"Not Found","status":"404"}
<Response [404]>
do you run fetch-crl?
Every 6 hours.
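Since the same SSL alert can also come from a missing or wrong client credential, a small preflight check in the script can help separate client-side causes from server-side ones. The sketch below is only an illustration: the helper name `locate_proxy` is hypothetical, and it assumes the usual grid convention that `X509_USER_PROXY` overrides the default `/tmp/x509up_u<uid>` location.

```python
import os


def locate_proxy(environ=None, uid=None):
    """Return the path of the proxy file, or raise a clear error if it is absent.

    Assumption: X509_USER_PROXY, when set, takes precedence over the
    default /tmp/x509up_u<uid> path used in rest.py.
    """
    environ = os.environ if environ is None else environ
    uid = os.getuid() if uid is None else uid
    path = environ.get("X509_USER_PROXY") or f"/tmp/x509up_u{uid}"
    if not os.path.isfile(path):
        # Fail before the TLS handshake so the cause is unambiguous.
        raise FileNotFoundError(f"no proxy file at {path}; create a proxy first")
    return path


if __name__ == "__main__":
    try:
        print("using proxy:", locate_proxy())
    except FileNotFoundError as exc:
        print("client-side problem:", exc)
```

With this in place, a failure inside the handshake is less likely to be caused by a missing proxy file, although an expired or otherwise invalid proxy would still need to be checked separately (for example with voms-proxy-info).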
@DmitryLitvintsev
ssl.SSLError: [SSL: SSLV3_ALERT_CERTIFICATE_UNKNOWN] sslv3 alert certificate unknown (_ssl.c:877)
Good catch. The error might also indicate a problem on the client side, which makes it harder to debug.
What would the corresponding error messages for such SSL errors look like in the dCache domain logs or access logs on the door and frontend?
As we understand from the discussion at the Tier-1 support meeting, the certificate directory temporarily becomes empty. Can you confirm that?
Yes, also, may I ask you how you update certificates? On our system we have never seen any issues.
# ls -al /etc/grid-security/
total 7616
drwxr-xr-x 5 root root 4096 Jun 20 12:14 .
drwxr-xr-x. 141 root root 12288 Jun 25 08:01 ..
lrwxrwxrwx 1 root root 21 Jun 20 11:44 certificates -> certificates-1.119NEW
drwxr-xr-x 2 root root 40960 Jun 25 11:45 certificates-1.119NEW
...
/etc/grid-security/certificates is a soft link to /etc/grid-security/certificates-1.119NEW.
The CRLs are updated by cron:
10 * * * * root [ ! -f /var/lock/subsys/osg-update-certs-cron ] || /usr/sbin/osg-update-certs --random-sleep 2700 --called-from-cron > /dev/null 2>&1
provided by the osg-ca-scripts package. It works like so: it creates a new directory, fills it, and then moves the symbolic link to it; then it removes the old directory, which is no longer visible to applications. It has never failed to work with dCache.
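The update pattern described above can be sketched roughly as follows. This is not the osg-update-certs implementation, only an illustration of the general technique: the function name `swap_symlink`, the `.swap` suffix, and the directory names in the demo are all made up. The key step is creating the new link under a temporary name and renaming it over the old one, since rename is atomic on POSIX filesystems.

```python
import os
import shutil
import tempfile


def swap_symlink(link_path, new_dir):
    """Repoint link_path at new_dir atomically; return the old target dir or None."""
    parent = os.path.dirname(os.path.abspath(link_path))
    old_dir = None
    if os.path.islink(link_path):
        old_dir = os.path.normpath(os.path.join(parent, os.readlink(link_path)))
    tmp_link = link_path + ".swap"  # hypothetical temporary name
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    # Create the new link under a temporary name, then rename it over the
    # old link so readers always see either the old or the new directory.
    os.symlink(os.path.relpath(new_dir, parent), tmp_link)
    os.replace(tmp_link, link_path)
    return old_dir


def demo():
    """Build two fake certificate directories, swap the link, drop the old one."""
    base = tempfile.mkdtemp()
    link = os.path.join(base, "certificates")
    for name in ("certificates-OLD", "certificates-NEW"):
        os.mkdir(os.path.join(base, name))
        with open(os.path.join(base, name, name + ".marker"), "w") as fh:
            fh.write(name)
    os.symlink("certificates-OLD", link)
    old = swap_symlink(link, os.path.join(base, "certificates-NEW"))
    shutil.rmtree(old)  # the old directory is no longer reachable via the link
    result = (os.readlink(link), sorted(os.listdir(link)))
    shutil.rmtree(base)
    return result


if __name__ == "__main__":
    print(demo())
```

With this approach an application that opens /etc/grid-security/certificates should never find the path missing or half-filled, which is different from a tool that rewrites files in place inside the live directory.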
We have some updates:
As we understand from the discussion at the Tier-1 support meeting, the certificate directory temporarily becomes empty. Can you confirm that?
To clarify: the cert directory has not been observed empty at the time of the issue or afterwards. In the meeting we were discussing possible scenarios in which CRL availability is compromised.
Yes, also, may I ask you how you update certificates? On our system we have never seen any issues.
# ls -al /etc/grid-security/
total 7616
drwxr-xr-x 5 root root 4096 Jun 20 12:14 .
drwxr-xr-x. 141 root root 12288 Jun 25 08:01 ..
lrwxrwxrwx 1 root root 21 Jun 20 11:44 certificates -> certificates-1.119NEW
drwxr-xr-x 2 root root 40960 Jun 25 11:45 certificates-1.119NEW
...
/etc/grid-security/certificates is a soft link to /etc/grid-security/certificates-1.119NEW.
The CRLs are updated by cron:
10 * * * * root [ ! -f /var/lock/subsys/osg-update-certs-cron ] || /usr/sbin/osg-update-certs --random-sleep 2700 --called-from-cron > /dev/null 2>&1
provided by the osg-ca-scripts package. It works like so: it creates a new directory, fills it, and then moves the symbolic link to it; then it removes the old directory, which is no longer visible to applications. It has never failed to work with dCache.
We have no symlink for /etc/grid-security/certificates; fetch-crl runs directly against that directory. Thanks for sharing your configuration.
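To see whether the directory is ever caught empty or without CRLs while it is being updated in place, a small check along these lines could be run periodically on the frontend. It is only a sketch: the function names are hypothetical, and it assumes the usual OpenSSL hashed-directory naming, where CA certificates are stored as `<hash>.0` and CRLs as `<hash>.r0`.

```python
import glob
import os


def summarize_ca_dir(ca_dir):
    """Count hashed CA files (*.0) and CRL files (*.r0) in a certificate directory."""
    if not os.path.isdir(ca_dir):
        return {"exists": False, "ca_files": 0, "crl_files": 0}
    return {
        "exists": True,
        "ca_files": len(glob.glob(os.path.join(ca_dir, "*.0"))),
        "crl_files": len(glob.glob(os.path.join(ca_dir, "*.r0"))),
    }


def looks_unhealthy(summary):
    """Flag a missing directory, or one with no CA files or no CRL files."""
    return (
        not summary["exists"]
        or summary["ca_files"] == 0
        or summary["crl_files"] == 0
    )


if __name__ == "__main__":
    summary = summarize_ca_dir("/etc/grid-security/certificates")
    print(summary, "UNHEALTHY" if looks_unhealthy(summary) else "ok")
```

Logging the output with a timestamp every few seconds around the fetch-crl run would show whether the counts ever drop, which could then be compared with the times of the SSL errors.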
Hello,
We observed some mass transient SSL errors when FTS queries the status of staging requests against the frontend servers:
See https://fts.usatlas.bnl.gov:8449/fts3/ftsmon/#/job/fd576cd4-2c56-11ef-8623-00163e1051a4
It seems to correspond to an error when the server fails to validate the client certificate and its certification authority. These servers also host gPlazma, and we did not observe any authentication failures at that time. We are thinking that it might correspond to an issue when the CRLs are renewed and reloaded on the frontends.
This error message could be reproduced by having an empty /etc/grid-security/certificates directory on the frontend, using the Python code below. Any help appreciated.