certifi / certifi.io

Public website for the Certifi Project.
http://certifi.io

certifi.where() fails in PySpark #14

Closed by Downchuck 7 years ago

Downchuck commented 7 years ago

The technique certifi uses to locate its PEM file with where() fails in PySpark when the package is shipped inside a zip file.

certifi.where() returns the following path, which points inside the zip file and so cannot be opened for reading: /tmp/spark-7b667869-6fa4-4a2d-af58-87f6218ce59f/userFiles-bc90eb3c-0796-4c83-844f-14333c8090b0/google-cloud-storage-py27-none-any.all.zip/certifi/cacert.pem

This cascades into breaking requests. I found it via the google-cloud-storage library.
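
To illustrate why a path like the one above is unreadable, here is a minimal sketch that builds a throwaway zip containing a `certifi/cacert.pem` member and then treats `archive.zip/certifi/cacert.pem` as a filesystem path. The helper `split_zip_path` and the archive name `deps.zip` are hypothetical names made up for this example; they are not part of certifi or PySpark. The file can only be reached by opening the archive with `zipfile`, not with a plain `open()`.

```python
import os
import tempfile
import zipfile


def split_zip_path(path):
    """Split 'archive.zip/inner/member' into ('archive.zip', 'inner/member').

    Hypothetical helper for illustration only. Returns (path, None) when no
    prefix of the path is an existing zip archive.
    """
    parts = path.replace(os.sep, "/").split("/")
    for i in range(1, len(parts)):
        candidate = "/".join(parts[:i])
        if candidate and zipfile.is_zipfile(candidate):
            return candidate, "/".join(parts[i:])
    return path, None


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for the zip that Spark ships to the workers.
        archive = os.path.join(tmp, "deps.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("certifi/cacert.pem", "-----BEGIN CERTIFICATE-----\n")

        # The kind of path where() ends up reporting: a file "inside" the zip.
        inner = archive + "/certifi/cacert.pem"
        readable = os.path.isfile(inner)  # the OS does not look inside archives

        # Reading the member requires going through zipfile instead.
        zip_part, member = split_zip_path(inner)
        with zipfile.ZipFile(zip_part) as zf:
            data = zf.read(member).decode("ascii")
        return readable, member, data


if __name__ == "__main__":
    readable, member, data = demo()
    print("plain file access works:", readable)
    print("member inside archive:", member)
```

Any library that passes the reported path straight to the operating system (as an SSL stack loading a CA bundle would) hits the same problem, which is presumably why the failure surfaces in requests.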

Lukasa commented 7 years ago

I’m not sure how we’re supposed to deal with this: this is a resource file we need and cannot go without. How would you propose we handle the problem?

Downchuck commented 7 years ago

Looks like it can be handled upstream: the requests library I was using honors the REQUESTS_CA_BUNDLE environment variable, so I can set it in the environment to a path obtained from SparkFiles.get().
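
A minimal sketch of that workaround, under some assumptions not stated in this thread: that a standalone copy of the PEM file is distributed to the workers (for example with `SparkContext.addFile`), and that the environment variable is set in the worker process before requests makes its first call. The function name `use_ca_bundle` and the file name `cacert.pem` are hypothetical and chosen for illustration.

```python
import os


def use_ca_bundle(pem_path):
    """Point requests at an on-disk CA bundle via REQUESTS_CA_BUNDLE.

    Hypothetical helper: checks that pem_path is a real file (not a path
    inside a zip archive) before exporting it.
    """
    if not os.path.isfile(pem_path):
        raise FileNotFoundError("CA bundle is not a regular file: %s" % pem_path)
    os.environ["REQUESTS_CA_BUNDLE"] = pem_path
    return pem_path


# Assumed usage inside a Spark job (requires pyspark, so left as comments):
#
#   from pyspark import SparkFiles
#
#   # On the driver: ship a standalone copy of the bundle to every worker.
#   sc.addFile("/path/to/cacert.pem")
#
#   # On the worker, before any requests call:
#   use_ca_bundle(SparkFiles.get("cacert.pem"))
```

Because the bundle is shipped as its own file rather than read out of the zipped package, the path handed to requests refers to something the operating system can open directly.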