cnr-ibf-pa / hbp-bsp-issues

Ticketing system for developers/testers and power users of the Brain Simulation Platform of the Human Brain Project

Posting Job to Piz-Daint SSLError (bad handshake) using Python #452

Closed antonelepfl closed 4 years ago

antonelepfl commented 4 years ago

Hi @BerndSchuller, with @alex4200 we are running some tests using Python and the UNICORE API, and we are getting:

requests.exceptions.SSLError: HTTPSConnectionPool(host='brissago.cscs.ch', port=8080): Max retries exceeded with url: /DAINT-CSCS/rest/core/jobs (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:852)'),))

or:

HTTPSConnectionPool(host='brissago.cscs.ch', port=8080): Max retries exceeded with url: /DAINT-CSCS/rest/core/jobs (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')],)",),))

To test we are doing something like:

import requests

url = 'https://brissago.cscs.ch:8080/DAINT-CSCS/rest/core/jobs'

FROM_PIZDAINT = 'BFT:https://zam2125.zam.kfa-juelich.de:9112/FZJ_JURECA/services/StorageManagement?res=0f37bb0d-6b13-4629-a873-39b44ab78109-uspace#/'

data = {
    "Name": "Test Analysis PizDaint",
    "Executable": "/bin/bash input.sh",
    "haveClientStageIn": "true",
    "Resources": {"Nodes": 2, "CPUsPerNode": 36, "Runtime": 500, "NodeConstraints": "mc", "Memory": "64000M", "CPUs": 72},
    "Tags": ["analysis"],
    "Imports": [
        {"From": FROM_PIZDAINT + "mc2_Column_report_0.bbp", "To": "mc2_Column_report_0.bbp"},
        {"From": FROM_PIZDAINT + "out.dat", "To": "out.dat"},
        {"From": FROM_PIZDAINT + "BlueConfig", "To": "BlueConfig"},
    ],
}

headers = {
    'Authorization': 'Bearer eyJh...w1LgEcsa18',
}

res = requests.post(url, json=data, headers=headers)

Using the Simulation GUI we don't see any of these errors. Is there something that we should worry about? I see that maybe there is an issue on the TSI part:

(tls_process_server_certificate, certificate verify failed)

What do you think?

BerndSchuller commented 4 years ago

hi,

This is pretty easy to explain and fix. The certification authority used by CSCS is not trusted by the Python code. The UNICORE API avoids this by adding verify=False to the requests call, i.e.

res = requests.post(url, json=data, headers=headers, verify=False)

should do the trick.

alex4200 commented 4 years ago

@BerndSchuller Yes, that works, but isn't disabling the certificate check bad practice? Shouldn't the untrusted authority be added to the Python code somehow? Would that be the better solution?

BerndSchuller commented 4 years ago

Is this code running in a Jupyter notebook in the Collab? The configured truststore should indeed contain the CA used by CSCS.

In the Collab notebooks, accessing "https://hbp-unic.fz-juelich.de:9112" works fine, but "https://brissago.cscs.ch:8080" fails to validate.

It might be a good idea to open a ticket, so the containers running the notebooks can be updated.

Personally I consider the risk of not validating the certificates pretty low, but in general, switching on validation is indeed best practice.

alex4200 commented 4 years ago

No, the code does not usually run from inside a Jupyter notebook. It runs on my local Linux machine, or from within a Jenkins instance.

BerndSchuller commented 4 years ago

Then you're on your own, I guess. To enable validation, you'd need to collect the CAs that are used by the servers you want to access and either put them in the system truststore, or configure requests so that they are found.

You can see all certificate info using:

openssl s_client -connect brissago.cscs.ch:8080

That server uses the "QuoVadis Global SSL ICA G2" certification authority, and you can get the certificate here: https://www.quovadisglobal.com/QVRepository/DownloadRootsAndCRL/QuoVadisGlobalSSLICAG2-PEM.aspx
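If you do want validation switched on, a minimal sketch of the "configure requests" route could look like the following. It assumes you have saved the downloaded CA certificate as a PEM file (the name `quovadis_ica_g2.pem` below is only a placeholder), and it appends that file to the default bundle shipped with certifi, on the assumption that the default bundle already contains the matching root. The helper names `build_ca_bundle` and `post_job` are made up for this example.

```python
from pathlib import Path


def build_ca_bundle(extra_ca_pem, out_path, base_bundle=None):
    """Write a CA bundle made of base_bundle followed by extra_ca_pem.

    If base_bundle is None, fall back to the bundle shipped with certifi,
    which is what requests uses by default.
    """
    if base_bundle is None:
        import certifi  # third-party, installed alongside requests
        base_bundle = certifi.where()

    parts = [Path(base_bundle).read_bytes(), Path(extra_ca_pem).read_bytes()]
    # Make sure every part ends with a newline so PEM blocks do not run together.
    bundle = b"".join(p if p.endswith(b"\n") else p + b"\n" for p in parts)

    out = Path(out_path)
    out.write_bytes(bundle)
    return out


def post_job(url, data, headers, ca_bundle):
    """POST a job description, validating the server against ca_bundle."""
    import requests  # imported lazily so build_ca_bundle has no third-party deps
    return requests.post(url, json=data, headers=headers, verify=str(ca_bundle))


# Example usage (placeholder file names, not run here):
# bundle = build_ca_bundle("quovadis_ica_g2.pem", "my_ca_bundle.pem")
# res = post_job(url, data, headers, bundle)
```

Instead of passing verify=... on every call, requests also honours the REQUESTS_CA_BUNDLE environment variable, so pointing that at the combined file should have the same effect for scripts you cannot easily change.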

Probably not really worth the effort, especially in a testing / local scenario.

alex4200 commented 4 years ago

Thanks for the info. I might look into that a bit more, or just use your suggested solution.

I will close this ticket for now.