Closed: Downchuck closed this issue 7 years ago
I’m not sure how we’re supposed to deal with this: this is a resource file we need and cannot go without. How would you propose we handle the problem?
Looks like this can be handled upstream in the requests library I was using: set the REQUESTS_CA_BUNDLE environment variable (via env) to the path returned by SparkFiles.get().
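For anyone landing here later, a minimal sketch of that workaround follows. It assumes the CA file was shipped to the executors with SparkContext.addFile() under the name cacert.pem; the helper name and its `resolve`/`environ` parameters are hypothetical, added only so the function can be tried without a Spark install.

```python
import os


def point_requests_at_bundle(filename="cacert.pem", resolve=None, environ=None):
    """Set REQUESTS_CA_BUNDLE to a CA file distributed with SparkContext.addFile().

    `resolve` maps a file name to a local path and defaults to SparkFiles.get.
    `environ` defaults to os.environ. Both parameters are illustrative only.
    """
    if resolve is None:
        # Imported lazily so the helper can be exercised without pyspark installed.
        from pyspark import SparkFiles
        resolve = SparkFiles.get
    if environ is None:
        environ = os.environ
    path = resolve(filename)
    # requests consults this variable when no explicit verify= path is given.
    environ["REQUESTS_CA_BUNDLE"] = path
    return path
```

On the driver this would be paired with something like sc.addFile("/path/to/cacert.pem"), and the helper would be called inside the function handed to mapPartitions(), before the first HTTP call on the executor.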
The technique certifi uses to find the PEM file with where() fails in PySpark when using zip files
certifi.where() returns the path below, which points inside the zip file and so cannot be opened for reading: /tmp/spark-7b667869-6fa4-4a2d-af58-87f6218ce59f/userFiles-bc90eb3c-0796-4c83-844f-14333c8090b0/google-cloud-storage-py27-none-any.all.zip/certifi/cacert.pem
This in turn breaks requests. Found via the google-cloud-storage library.
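To illustrate the failure and one possible way around it, here is a rough sketch that takes a path of the form <archive>.zip/<member>, reads the member with the standard zipfile module, and writes it to a real temporary file. This is not what certifi does; the function name is hypothetical, and it assumes POSIX-style separators and that the first ".zip/" in the path marks the archive boundary.

```python
import os
import tempfile
import zipfile


def extract_from_zip_path(path):
    """Copy a file addressed as <archive>.zip/<member> to a real temp file.

    Returns the original path unchanged when it does not point inside a zip.
    """
    head, sep, member = path.partition(".zip/")
    if not sep:
        return path
    archive = head + ".zip"
    # Read the member's bytes straight out of the archive.
    with zipfile.ZipFile(archive) as zf:
        data = zf.read(member)
    # Write them to a regular file that open() and requests can read.
    fd, out = tempfile.mkstemp(suffix=".pem")
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
    return out
```

The returned path could then be assigned to REQUESTS_CA_BUNDLE. The temporary file is not cleaned up here, so a long-running job may want to remove it when done.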