USEPA / standardizedinventories

Standardized Release and Waste Inventories
MIT License

Server not compatible with RFC 5746 secure renegotiation #119

Closed by vlahm 1 year ago

vlahm commented 1 year ago

I realize this isn't the ideal place to raise such an issue, but maybe you can pass it on to the relevant parties. Modern TLS clients expect servers to support RFC 5746 (the TLS Renegotiation Indication Extension), a proposed standard that prevents a specific type of man-in-the-middle attack during renegotiation. Attempting to use facilitymatcher with OpenSSL 3 results in the following error:

SSLError: HTTPSConnectionPool(host='ofmext.epa.gov', port=443): Max retries exceeded with url: /FLA/www3/state_files/national_combined.zip (Caused by SSLError(SSLError(1, '[SSL: UNSAFE_LEGACY_RENEGOTIATION_DISABLED] unsafe legacy renegotiation disabled (_ssl.c:997)

I received this error by running facilitymatches = facilitymatcher.get_matches_for_inventories(["TRI"]) on Ubuntu 20.04, under Python 3.10.5. More information can be found here.
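For anyone who needs the file before the server is fixed, a client-side workaround can be sketched as below. This is a minimal sketch, not StEWI's own code: it uses the standard library rather than the requests stack shown in the traceback, and the helper names `make_legacy_context` and `fetch` are hypothetical. It sets OpenSSL's `SSL_OP_LEGACY_SERVER_CONNECT` option (value `0x4`), which lets the client talk to a server that lacks RFC 5746 support. Doing so gives up the protection that the extension provides, so it should only be used against hosts you trust.

```python
import ssl
import urllib.request

# ssl.OP_LEGACY_SERVER_CONNECT only exists in newer Python versions;
# 0x4 is the value of OpenSSL's SSL_OP_LEGACY_SERVER_CONNECT flag.
OP_LEGACY_SERVER_CONNECT = getattr(ssl, "OP_LEGACY_SERVER_CONNECT", 0x4)


def make_legacy_context():
    """Return a default TLS context that also accepts servers without RFC 5746.

    Hypothetical helper: certificate and hostname checks stay enabled,
    only the legacy-server restriction is relaxed.
    """
    context = ssl.create_default_context()
    context.options |= OP_LEGACY_SERVER_CONNECT
    return context


def fetch(url, timeout=60):
    """Download url with the relaxed context and return the raw bytes."""
    with urllib.request.urlopen(
        url, context=make_legacy_context(), timeout=timeout
    ) as response:
        return response.read()
```

Calling `fetch(...)` with the zip URL from the error message would be the intended use; whether the download then succeeds depends on the server.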

WesIngwersen commented 1 year ago

@vlahm The facilitymatcher repository and code need to be archived. That work was incorporated into the facilitymatcher within StEWI: https://github.com/USEPA/standardizedinventories/

The same function is present there and is actually failing for the same reason.

WesIngwersen commented 1 year ago

The URL is specified here: https://github.com/USEPA/standardizedinventories/blob/cc026650f46fcd6a3c783b1d6798502fc92d2d40/facilitymatcher/config.yaml#L3

That URL is valid.

WesIngwersen commented 1 year ago

I'm getting a zipfile.BadZipFile: File is not a zip file error, likely because the request is not returning a valid zip archive.
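One way to make this failure easier to diagnose is to check the downloaded bytes before handing them to `zipfile`. The sketch below is an assumption about how that could look, not the code in StEWI; `open_zip_from_bytes` is a hypothetical helper name. It reports the first bytes of the payload, which usually shows whether the server returned an HTML error page instead of the archive.

```python
import io
import zipfile


def open_zip_from_bytes(payload, source="<unknown>"):
    """Return a ZipFile for payload, or raise a descriptive ValueError.

    Hypothetical helper: payload is the raw response body and source is
    only used to label the error message.
    """
    buffer = io.BytesIO(payload)
    if not zipfile.is_zipfile(buffer):
        # Show a short preview so an HTML error page is easy to spot.
        preview = payload[:80]
        raise ValueError(
            f"Response from {source} is not a zip archive; "
            f"first bytes: {preview!r}"
        )
    buffer.seek(0)
    return zipfile.ZipFile(buffer)
```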

a-w-beck commented 1 year ago

Hi @vlahm,

Upon importing facilitymatcher and executing the same call facilitymatches = facilitymatcher.get_matches_for_inventories(["TRI"]) I get the following output:

>>> import facilitymatcher
>>> facilitymatches = facilitymatcher.get_matches_for_inventories(["TRI"])
INFO FacilityMatchList_forStEWI not found in C:\Users\ABeck\AppData\Local\facilitymatcher, writing facility matches to file
INFO initiating url request from https://ordsext.epa.gov/FLA/www3/state_files/national_combined.zip
INFO extracting NATIONAL_ENVIRONMENTAL_INTEREST_FILE.CSV from https://ordsext.epa.gov/FLA/www3/state_files/national_combined.zip
INFO loading NATIONAL_ENVIRONMENTAL_INTEREST_FILE.CSV from C:\Users\ABeck\AppData\Local\facilitymatcher/FRS Data Files
INFO saving FacilityMatchList_forStEWI to C:\Users\ABeck\AppData\Local\facilitymatcher/

Calling facilitymatcher.WriteFacilityMatchesforStEWI.write_facility_matches() also successfully retrieved and stored the local facility files (after deleting the existing ones first).

I'm on a machine running Windows-10-10.0.19044-SP0, and I installed StEWI and its dependencies via conda using stewi_env.txt (Py-3.9.7), which is actually a YAML file (but GH doesn't support them as attachments). I also generated a Python 3.10.4 build (3.10.5 is not yet available on the conda default channel), specified in stewi_env_Py-3-10-4.txt, and got the same behavior.

Since downgrading cryptography to 36.0.2 is the accepted solution in the thread you linked, I also checked which packages depend on cryptography (37.0.1 in both of my envs). Both pyopenssl (used by urllib3) and urllib3 (used by requests and selenium) rely on cryptography. I'd suggest comparing the versions of these packages within your Python 3.10.5 environment to those pinned in the YAML files.
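A quick way to collect those versions for comparison is sketched below. It is only a sketch: `report_versions` is a hypothetical helper, and the package list is my assumption based on the packages named above. It also prints the OpenSSL build that Python's own `ssl` module is linked against, which may differ from the one bundled with cryptography.

```python
import ssl
from importlib import metadata

# Distribution names as published on PyPI (assumed list, adjust as needed).
PACKAGES = ("cryptography", "pyOpenSSL", "urllib3", "requests")


def report_versions(packages=PACKAGES):
    """Return a dict mapping each package name to its installed version."""
    versions = {"OpenSSL (ssl module)": ssl.OPENSSL_VERSION}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version}")
```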

Please also feel free to share a conda env.yaml, requirements.txt, and/or other lock file that you used to create your environment.

a-w-beck commented 1 year ago

Also just found this thread, which I think sums up why it fails on Ubuntu 20.04.

@WesIngwersen probably worth asking the server admin if there's a path to incorporating the aforementioned TLS standard.

WesIngwersen commented 1 year ago

@a-w-beck Yes, please try to follow up with them by email (cc me).

bl-young commented 1 year ago

EPA alerted on 8/29.