Several attempts to download files from the EIA.gov website have failed due to a verification failure of SSL certificates on Linux OS running Python 3.8.10. I'm not able to see a clear solution on the user's end, but I'm happy to try any recommended methods.
In looking through the electricityLCI code, I noticed that one of your required packages is requests; however, in eia_trans_dist_grid_loss.py, the native Python urllib.request module is being used.
Would it be possible to implement one of the two following solutions in the event that a URLError or SSLCertVerificationError is caught during the try-except block of eia_trans_dist_download_extract?
Option 1
You can keep using urllib.request by also importing ssl and creating a context that bypasses SSL verification (see example code below).
# Source: 2018-06-25 by L Ma; https://datumorphism.leima.is/til/data/python-urllib-ssl/
# Import modules
import urllib.request
import ssl
import os
# Test URL
my_url = "https://www.eia.gov/electricity/state/archive/2016/ohio/xls/oh.xlsx"
# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
# Retrieve data
# Passing context=ctx skips certificate verification for this request
data = urllib.request.urlopen(my_url, context=ctx).read()
my_file = os.path.basename(my_url)
with open(my_file, 'wb') as f:
    f.write(data)
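If turning verification off is a concern, a variation on Option 1 keeps verification on and points the context at the CA bundle shipped with certifi (the package that requests relies on for its certificates). This is only a sketch. It assumes the failure comes from a missing or outdated system CA store, which has not been confirmed here, and that certifi is installed. The helper names make_verified_context and fetch are hypothetical.

```python
import ssl
import urllib.request

import certifi  # assumption: installed (normally pulled in alongside requests)


def make_verified_context():
    """Return an SSL context that verifies against certifi's CA bundle."""
    return ssl.create_default_context(cafile=certifi.where())


def fetch(url, filename, timeout=30):
    """Download url to filename with certificate verification still enabled."""
    ctx = make_verified_context()
    with urllib.request.urlopen(url, context=ctx, timeout=timeout) as resp:
        data = resp.read()
    with open(filename, "wb") as f:
        f.write(data)
```

Unlike the CERT_NONE context above, this one still rejects invalid certificates.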
Option 2
Use the requests package that is already a dependency of electricityLCI. Note that I didn't receive any errors when using it.
import os
import requests
# Test URL
my_url = "https://www.eia.gov/electricity/state/archive/2016/ohio/xls/oh.xlsx"
my_file = os.path.basename(my_url)
r = requests.get(my_url)
with open(my_file, 'wb') as f:
    f.write(r.content)
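Combining the two, the fallback I'm asking about could look roughly like this: try urllib first, and switch to requests only when a URLError is caught (urllib wraps the SSLCertVerificationError in a URLError, as the traceback below shows). This is a minimal sketch, not the electricityLCI implementation; the helper name download_with_fallback, its return value, and the timeout default are assumptions for illustration.

```python
import urllib.request
from urllib.error import URLError

import requests


def download_with_fallback(url, filename, timeout=30):
    """Save url to filename; return the name of the method that succeeded."""
    try:
        # First attempt: the standard-library call used today
        urllib.request.urlretrieve(url, filename)
        return "urllib"
    except URLError:
        # Fallback: requests, which verifies with its own CA bundle
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()  # surface HTTP errors instead of saving an error page
        with open(filename, "wb") as f:
            f.write(r.content)
        return "requests"
```

Example call, using the test URL from above:

```python
download_with_fallback(
    "https://www.eia.gov/electricity/state/archive/2016/ohio/xls/oh.xlsx",
    "oh.xlsx",
)
```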
Error traceback
Tested with:
from electricitylci.aggregation_selector import subregion_col
from electricitylci import model_config as config
config.model_specs = config.build_model_class(elci_model)
from electricitylci import get_distribution_mix_df
subregion = 'BA'
sub_col = subregion_col(subregion)
sub_col_name = sub_col[0]
loss_data = get_distribution_mix_df(None, subregion)
and received these error statements:
Loading 2016 EIA-860 plant data from csv file
Downloading data for al
---------------------------------------------------------------------------
SSLCertVerificationError Traceback (most recent call last)
File /usr/lib/python3.8/urllib/request.py:1354, in AbstractHTTPHandler.do_open(self, http_class, req, **http_conn_args)
1353 try:
-> 1354 h.request(req.get_method(), req.selector, req.data, headers,
1355 encode_chunked=req.has_header('Transfer-encoding'))
1356 except OSError as err: # timeout error
File /usr/lib/python3.8/http/client.py:1256, in HTTPConnection.request(self, method, url, body, headers, encode_chunked)
1255 """Send a complete request to the server."""
-> 1256 self._send_request(method, url, body, headers, encode_chunked)
File /usr/lib/python3.8/http/client.py:1302, in HTTPConnection._send_request(self, method, url, body, headers, encode_chunked)
1301 body = _encode(body, 'body')
-> 1302 self.endheaders(body, encode_chunked=encode_chunked)
File /usr/lib/python3.8/http/client.py:1251, in HTTPConnection.endheaders(self, message_body, encode_chunked)
1250 raise CannotSendHeader()
-> 1251 self._send_output(message_body, encode_chunked=encode_chunked)
File /usr/lib/python3.8/http/client.py:1011, in HTTPConnection._send_output(self, message_body, encode_chunked)
1010 del self._buffer[:]
-> 1011 self.send(msg)
1013 if message_body is not None:
1014
1015 # create a consistent interface to message_body
File /usr/lib/python3.8/http/client.py:951, in HTTPConnection.send(self, data)
950 if self.auto_open:
--> 951 self.connect()
952 else:
File /usr/lib/python3.8/http/client.py:1425, in HTTPSConnection.connect(self)
1423 server_hostname = self.host
-> 1425 self.sock = self._context.wrap_socket(self.sock,
1426 server_hostname=server_hostname)
File /usr/lib/python3.8/ssl.py:500, in SSLContext.wrap_socket(self, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname, session)
494 def wrap_socket(self, sock, server_side=False,
495 do_handshake_on_connect=True,
496 suppress_ragged_eofs=True,
497 server_hostname=None, session=None):
498 # SSLSocket class handles server_hostname encoding before it calls
499 # ctx._wrap_socket()
--> 500 return self.sslsocket_class._create(
501 sock=sock,
502 server_side=server_side,
503 do_handshake_on_connect=do_handshake_on_connect,
504 suppress_ragged_eofs=suppress_ragged_eofs,
505 server_hostname=server_hostname,
506 context=self,
507 session=session
508 )
File /usr/lib/python3.8/ssl.py:1040, in SSLSocket._create(cls, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname, context, session)
1039 raise ValueError("do_handshake_on_connect should not be specified for non-blocking sockets")
-> 1040 self.do_handshake()
1041 except (OSError, ValueError):
File /usr/lib/python3.8/ssl.py:1309, in SSLSocket.do_handshake(self, block)
1308 self.settimeout(None)
-> 1309 self._sslobj.do_handshake()
1310 finally:
SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)
During handling of the above exception, another exception occurred:
URLError Traceback (most recent call last)
Input In [12], in <cell line: 1>()
----> 1 loss_data = get_distribution_mix_df(df, subregion)
File ~/Envs/test/lib/python3.8/site-packages/electricitylci/__init__.py:526, in get_distribution_mix_df(combined_df, subregion)
523 if subregion is None:
524 subregion = config.model_specs.regional_aggregation
--> 526 td_loss_df = tnd.generate_regional_grid_loss(
527 combined_df, config.model_specs.eia_gen_year, subregion=subregion
528 )
529 return td_loss_df
File ~/Envs/test/lib/python3.8/site-packages/electricitylci/eia_trans_dist_grid_loss.py:222, in generate_regional_grid_loss(final_database, year, subregion)
220 plant_generation["FERC_Region"]=plant_generation["Balancing Authority Code"].map(ba_codes["FERC_Region"])
221 plant_generation["EIA_Region"]=plant_generation["Balancing Authority Code"].map(ba_codes["EIA_Region"])
--> 222 td_rates = eia_trans_dist_download_extract(f"{year}")
223 td_by_plant = pd.merge(
224 left=plant_generation,
225 right=td_rates,
(...)
228 how="left",
229 )
230 td_by_plant.dropna(subset=["t_d_losses"], inplace=True)
File ~/Envs/test/lib/python3.8/site-packages/electricitylci/eia_trans_dist_grid_loss.py:114, in eia_trans_dist_download_extract(year)
111 print(f"Downloading data for {state_abbrev[key]}")
113 try:
--> 114 urllib.request.urlretrieve(url, filename)
115 df = pd.read_excel(
116 filename,
117 sheet_name="10. Source-Disposition",
118 header=3,
119 index_col=0,
120 )
121 except XLRDError:
122 # The most current year has a different url - no "archive/year"
File /usr/lib/python3.8/urllib/request.py:247, in urlretrieve(url, filename, reporthook, data)
230 """
231 Retrieve a URL into a temporary location on disk.
232
(...)
243 data file as well as the resulting HTTPMessage object.
244 """
245 url_type, path = _splittype(url)
--> 247 with contextlib.closing(urlopen(url, data)) as fp:
248 headers = fp.info()
250 # Just return the local path and the "headers" for file://
251 # URLs. No sense in performing a copy unless requested.
File /usr/lib/python3.8/urllib/request.py:222, in urlopen(url, data, timeout, cafile, capath, cadefault, context)
220 else:
221 opener = _opener
--> 222 return opener.open(url, data, timeout)
File /usr/lib/python3.8/urllib/request.py:525, in OpenerDirector.open(self, fullurl, data, timeout)
522 req = meth(req)
524 sys.audit('urllib.Request', req.full_url, req.data, req.headers, req.get_method())
--> 525 response = self._open(req, data)
527 # post-process response
528 meth_name = protocol+"_response"
File /usr/lib/python3.8/urllib/request.py:542, in OpenerDirector._open(self, req, data)
539 return result
541 protocol = req.type
--> 542 result = self._call_chain(self.handle_open, protocol, protocol +
543 '_open', req)
544 if result:
545 return result
File /usr/lib/python3.8/urllib/request.py:502, in OpenerDirector._call_chain(self, chain, kind, meth_name, *args)
500 for handler in handlers:
501 func = getattr(handler, meth_name)
--> 502 result = func(*args)
503 if result is not None:
504 return result
File /usr/lib/python3.8/urllib/request.py:1397, in HTTPSHandler.https_open(self, req)
1396 def https_open(self, req):
-> 1397 return self.do_open(http.client.HTTPSConnection, req,
1398 context=self._context, check_hostname=self._check_hostname)
File /usr/lib/python3.8/urllib/request.py:1357, in AbstractHTTPHandler.do_open(self, http_class, req, **http_conn_args)
1354 h.request(req.get_method(), req.selector, req.data, headers,
1355 encode_chunked=req.has_header('Transfer-encoding'))
1356 except OSError as err: # timeout error
-> 1357 raise URLError(err)
1358 r = h.getresponse()
1359 except:
URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)>
This has been pulled into the development branch. If you get a chance to test it before I do (likely, since I don't have a Linux machine ready for testing), let me know how it works out.