NREL / foundational-industry-energy-data

The Foundational Industry Energy Dataset (FIED) is a unit-level characterization of energy use in the U.S. industrial sector.
https://nrel.github.io/foundational-industry-energy-data/

Debugging and updates #4

Closed: calmc closed this 3 months ago

calmc commented 1 year ago

Addressed bug in downloading and unzipping GHGRP unit data. Also fixed a few other problems associated with reconciling energy estimates from GHGRP and NEI data. All of the calculations and dataset assembly specified in fied_compilation.py should now run, given the environment and manually-downloaded data specified in the repo's README.

@dthierry are you able to review this and compare what changes you made in your branch? Or, maybe switch to this branch and try to run fied_compilation.py with the manually-downloaded datasets to see if there are any errors?

dthierry commented 12 months ago

@calmc I tried the manual download with this branch but I've experienced issues with the FRS part and then the GHGRP.


INFO:root:Combined zip file does not exist. Downloading...
INFO:root:Combined file unzipped.
Traceback (most recent call last):
  File "/Users/dthierry/Projects/cm_debug_test/fied_compilation.py", line 1315, in <module>
    frs_data = pd.read_csv(
               ^^^^^^^^^^^^
  File "/Users/dthierry/mambaforge/envs/found/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 948, in read_csv
    return _read(filepath_or_buffer, kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/mambaforge/envs/found/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 611, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/mambaforge/envs/found/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 1448, in __init__
    self._engine = self._make_engine(f, self.engine)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/mambaforge/envs/found/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 1705, in _make_engine
    self.handles = get_handle(
                   ^^^^^^^^^^^
  File "/Users/dthierry/mambaforge/envs/found/lib/python3.11/site-packages/pandas/io/common.py", line 863, in get_handle
    handle = open(
             ^^^^^
FileNotFoundError: [Errno 2] No such file or directory: './data/FRS/frs_data_formatted.csv'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/dthierry/mambaforge/envs/found/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3790, in get_loc
    return self._engine.get_loc(casted_key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "index.pyx", line 152, in pandas._libs.index.IndexEngine.get_loc
  File "index.pyx", line 158, in pandas._libs.index.IndexEngine.get_loc
TypeError: 'Index([51], dtype='int64')' is an invalid key

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/dthierry/mambaforge/envs/found/lib/python3.11/site-packages/pandas/core/frame.py", line 4338, in _set_value
    iindex = self.index.get_loc(index)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/mambaforge/envs/found/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3802, in get_loc
    self._check_indexing_error(key)
  File "/Users/dthierry/mambaforge/envs/found/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 5974, in _check_indexing_error
    raise InvalidIndexError(key)
pandas.errors.InvalidIndexError: Index([51], dtype='int64')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/dthierry/Projects/cm_debug_test/fied_compilation.py", line 1323, in <module>
    frs_data = frs_methods.import_format_frs(combined=True)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/Projects/cm_debug_test/frs/frs_extraction.py", line 503, in import_format_frs
    pgm_data = self.read_frs_csv(
               ^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/Projects/cm_debug_test/frs/frs_extraction.py", line 378, in read_frs_csv
    data = self.format_program_csv(data, programs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/Projects/cm_debug_test/frs/frs_extraction.py", line 255, in format_program_csv
    data_dict[a].at[use_index, f'PGM_SYS_ID_{a}_additional'] =\
    ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/mambaforge/envs/found/lib/python3.11/site-packages/pandas/core/indexing.py", line 2499, in __setitem__
    return super().__setitem__(key, value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/mambaforge/envs/found/lib/python3.11/site-packages/pandas/core/indexing.py", line 2455, in __setitem__
    self.obj._set_value(*key, value=value, takeable=self._takeable)
  File "/Users/dthierry/mambaforge/envs/found/lib/python3.11/site-packages/pandas/core/frame.py", line 4357, in _set_value
    raise InvalidIndexError(
pandas.errors.InvalidIndexError: You can only assign a scalar value not a <class 'str'>
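
Note on the last frame above: `.at` only accepts a scalar row label, and the earlier `TypeError` shows that `use_index` was a one-element `Index([51])`. Below is a minimal sketch of one way around that, assuming the intent is to write a single string into one or more rows. The helper `set_additional_ids` and the column name used with it are hypothetical and are not taken from frs_extraction.py.

```python
import pandas as pd


def set_additional_ids(df, row_labels, column, value):
    """Write one string into `column` for one or more row labels (hypothetical helper)."""
    labels = pd.Index(row_labels)
    if column not in df.columns:
        # Create the target column as object dtype so it can hold strings.
        df[column] = pd.Series(pd.NA, index=df.index, dtype="object")
    if len(labels) == 1:
        # .at needs a scalar label, so unwrap the single-element Index.
        df.at[labels[0], column] = value
    else:
        # .loc accepts list-like labels and broadcasts the scalar value.
        df.loc[labels, column] = value
    return df
```

Calling it with `pd.Index([51])` writes to row 51 only. Passing the `Index` object straight to `.at` is what appears to trigger the `InvalidIndexError` in the traceback.
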
calmc commented 12 months ago

Thanks! Looks like there's still an issue with formatting downloaded FRS data. Could you also add the error you're getting for the GHGRP data?

dthierry commented 12 months ago

I think it is a problem downloading a file from the EPA website.


/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'furnace' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'kiln' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'dryer' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'heater' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'oven' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'calciner' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'stove' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'htr' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'furn' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'cupola' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'boiler' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'turbine' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'building heat' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'space heater' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'engine' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'compressor' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'pump' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'rice' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'generator' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'hot water' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'crane' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'water heater' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'comfort heater' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:363: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'oxidizer' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:398: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'boiler' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  ocs_units.loc[i, 'unit_type_iden'] = \
/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py:406: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '['boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'
 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'
 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'
 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'
 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'
 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'
 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'
 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler']' has dtype incompatible with bool, please explicitly cast to a compatible dtype first.
  mult_types.unit_type_iden.update(
403 Client Error: Forbidden for url: https://www.epa.gov/system/files/other-files/2022-10/emissions_by_unit_and_fuel_type_c_d_aa_10_2022.zip
Try downloading zipfile from https://www.epa.gov/system/files/other-files/2022-10/emissions_by_unit_and_fuel_type_c_d_aa_10_2022.zip and saving to /Users/dthierry/Projects/ftest/data/GHGRP
Traceback (most recent call last):
  File "/Users/dthierry/Projects/ftest/fied_compilation.py", line 1331, in <module>
    ghgrp_unit_data = GHGRP_unit_char(ghgrp_energy_file, year).main()  # format ghgrp energy calculations to fit frs_json schema
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py", line 425, in main
    ghgrp_df = self.get_unit_capacity(ghgrp_df)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py", line 175, in get_unit_capacity
    unit_data_file_path = self.download_unit_data()
                          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/Projects/ftest/ghgrp/ghgrp_fac_unit.py", line 147, in download_unit_data
    with zipfile.ZipFile(BytesIO(r.content)) as zf:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/mambaforge/envs/found/lib/python3.11/zipfile.py", line 1302, in __init__
    self._RealGetContents()
  File "/Users/dthierry/mambaforge/envs/found/lib/python3.11/zipfile.py", line 1369, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
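
Note on the traceback above: the log shows a 403 response, and the response body (presumably an error page, not an archive) is then handed to `zipfile.ZipFile`, which raises `BadZipFile`. Here is a hedged sketch of a guard that checks the bytes first and falls back to a manually saved copy. `open_unit_zip` and its arguments are hypothetical names, not the function in ghgrp_fac_unit.py.

```python
import zipfile
from io import BytesIO
from pathlib import Path


def open_unit_zip(content, local_copy):
    """Open downloaded bytes as a zip, else fall back to a manually saved file."""
    buffer = BytesIO(content)
    if zipfile.is_zipfile(buffer):
        buffer.seek(0)
        return zipfile.ZipFile(buffer)
    # The download did not return a zip (for example, a 403 error page).
    local_copy = Path(local_copy)
    if local_copy.is_file() and zipfile.is_zipfile(local_copy):
        return zipfile.ZipFile(local_copy)
    raise FileNotFoundError(
        f"Download was not a zip file and no valid zip was found at {local_copy}"
    )
```

With `requests`, calling `r.raise_for_status()` before touching `r.content` would also surface the 403 at the point of the download, before any unzipping is attempted.
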
calmc commented 12 months ago

Thanks, @dthierry! I think most of these bugs can be linked to downloading these datasets. For the GHGRP zip/xlsb file, the error handling I added suggests downloading the zip file manually and saving it. I deleted my local version, manually downloaded it, and the code was able to unzip and run everything without errors.

Related to the FRS data, I was getting a different error that didn't make sense to me (the error handling for your FileNotFoundError should call the methods that create the missing csv file). It would appear only when I deleted all of my FRS data files and ran fied_compilation.py for the first time. That would download the FRS data and create the csv, but none of the subsequent code would run. If I ran fied_compilation.py again, I wouldn't get the error and the final data set would be created.

I ended up just running frs_extraction.py first and then fied_compilation.py, which seems to avoid the errors. I'm going to commit these changes, as well as updates to the readme.
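
A rough sketch of the "create the csv if it is missing, then read it" pattern described above. It checks for the file up front instead of building inside an `except FileNotFoundError` block, so a failure in the build step is reported on its own and not as a chained exception. `load_or_build_csv` and the `build` callable are hypothetical stand-ins, not the methods in frs_extraction.py.

```python
from pathlib import Path

import pandas as pd


def load_or_build_csv(csv_path, build):
    """Read csv_path, first calling build(path) if the file does not exist yet."""
    path = Path(csv_path)
    if not path.is_file():
        # Assumption: build(path) downloads/formats the data and writes the csv.
        build(path)
    return pd.read_csv(path, low_memory=False)
```

This does not explain why the first run stopped after creating the csv; it only shows the fallback in a form where the two steps fail independently.
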

calmc commented 11 months ago

@dthierry did these changes (including running the code in the 2-step process described now in the readme) work for you?

dthierry commented 11 months ago

Hi Colin, I've tried the second step and it fails, probably because of pandas. This is the error:

INFO:root:Reading NEI data from zipfiles; writing nei_ind_data.csv
Traceback (most recent call last):
  File "/Users/dthierry/Projects/cm_debug_test/nei/nei_EF_calculations.py", line 1338, in <module>
    nei_char = NEI().main()
               ^^^^^^^^^^^^
  File "/Users/dthierry/Projects/cm_debug_test/nei/nei_EF_calculations.py", line 1301, in main
    nei_data = nei.load_nei_data()
               ^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/Projects/cm_debug_test/nei/nei_EF_calculations.py", line 418, in load_nei_data
    nei_data = nei_data.append(data, sort=False)
               ^^^^^^^^^^^^^^^
  File "/Users/dthierry/mambaforge/envs/found/lib/python3.11/site-packages/pandas/core/generic.py", line 6204, in __getattr__
    return object.__getattribute__(self, name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'DataFrame' object has no attribute 'append'. Did you mean: '_append'?

Unless I am wrong, nei_EF_calculations.py has to be run from the repository root directory to work. Also, for reference, this is my pandas version:

pandas                    2.1.4           py311h6e08293_0    conda-forge

I've also tried the 3rd step, copying both the zip and the unzipped file into the GHGRP folder (both separately and simultaneously), and this is the error message:

INFO:root:There are 5004 units or 45.9% labelled as OCS
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'furnace' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'kiln' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'dryer' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'heater' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'oven' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'calciner' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'stove' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'htr' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'furn' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'cupola' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'boiler' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'turbine' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'building heat' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'space heater' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'engine' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'compressor' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'pump' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'rice' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'generator' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'hot water' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'crane' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'water heater' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'comfort heater' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'oxidizer' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  named_units[c].fillna(c, inplace=True)
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:362: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'boiler' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  ocs_units.loc[i, 'unit_type_iden'] = \
/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:370: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '['boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'
 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'
 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'
 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'
 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'
 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'
 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'
 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler']' has dtype incompatible with bool, please explicitly cast to a compatible dtype first.
  mult_types.unit_type_iden.update(
Traceback (most recent call last):
  File "/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py", line 405, in <module>
    ghgrp_df = GHGRP_unit_char(ghgrp_energy_file, reporting_year).main()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py", line 389, in main
    ghgrp_df = self.get_unit_capacity(ghgrp_df)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py", line 150, in get_unit_capacity
    unit_data_file_path = self.download_unit_data()
                          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py", line 119, in download_unit_data
    with zipfile.ZipFile(BytesIO(r.content)) as zf:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/mambaforge/envs/found/lib/python3.11/zipfile.py", line 1302, in __init__
    self._RealGetContents()
  File "/Users/dthierry/mambaforge/envs/found/lib/python3.11/zipfile.py", line 1369, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
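
Note on the repeated `FutureWarning` lines above: they come from filling a `float64` column with a string via `fillna(c, inplace=True)`. One possible fix, sketched under the assumption that each column `c` should end up holding its own name wherever it is missing, is to cast to `object` before filling. `fill_with_label` is a hypothetical helper, not the code in ghgrp_fac_unit.py.

```python
import numpy as np
import pandas as pd


def fill_with_label(df, columns):
    """Return a copy where missing values in each listed column become the column name."""
    out = df.copy()
    for c in columns:
        # Cast first so the string fill value is compatible with the column dtype.
        out[c] = out[c].astype("object").fillna(c)
    return out
```

Assigning the result back to the column, without `inplace=True` on a column selection, also sidesteps the chained-assignment pitfalls of the original line.
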
calmc commented 11 months ago

Thanks! Hmmm... did you create an environment using the .yaml in the repo? If not, what version of pandas are you using?
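
The pandas version matters here because `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0, which matches the `AttributeError` under pandas 2.1.4. A minimal sketch of the usual replacement with `pd.concat` follows. `combine_frames` is a hypothetical helper, and collecting the per-file frames in a list first is an assumption about how `load_nei_data` could be restructured.

```python
import pandas as pd


def combine_frames(frames):
    """Concatenate DataFrames row-wise, as repeated DataFrame.append calls once did."""
    frames = list(frames)
    if not frames:
        return pd.DataFrame()
    # sort=False keeps the column order of first appearance, like append(sort=False).
    return pd.concat(frames, sort=False)
```

Building the list and concatenating once is also cheaper than appending inside a loop, since each append copied the accumulated data. Pinning the pandas version in the environment .yaml would be the other route.
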


named_units[c].fillna(c, inplace=True)

/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'turbine' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.

named_units[c].fillna(c, inplace=True)

/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'building heat' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.

named_units[c].fillna(c, inplace=True)

/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'space heater' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.

named_units[c].fillna(c, inplace=True)

/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'engine' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.

named_units[c].fillna(c, inplace=True)

/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'compressor' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.

named_units[c].fillna(c, inplace=True)

/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'pump' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.

named_units[c].fillna(c, inplace=True)

/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'rice' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.

named_units[c].fillna(c, inplace=True)

/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'generator' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.

named_units[c].fillna(c, inplace=True)

/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'hot water' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.

named_units[c].fillna(c, inplace=True)

/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'crane' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.

named_units[c].fillna(c, inplace=True)

/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'water heater' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.

named_units[c].fillna(c, inplace=True)

/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'comfort heater' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.

named_units[c].fillna(c, inplace=True)

/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:327: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'oxidizer' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.

named_units[c].fillna(c, inplace=True)

/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:362: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'boiler' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.

ocs_units.loc[i, 'unit_type_iden'] = \

/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py:370: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '['boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'

'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'

'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'

'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'

'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'

'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'

'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler'

'boiler' 'boiler' 'boiler' 'boiler' 'boiler' 'boiler']' has dtype incompatible with bool, please explicitly cast to a compatible dtype first.

mult_types.unit_type_iden.update(

Traceback (most recent call last):
  File "/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py", line 405, in <module>
    ghgrp_df = GHGRP_unit_char(ghgrp_energy_file, reporting_year).main()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py", line 389, in main
    ghgrp_df = self.get_unit_capacity(ghgrp_df)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py", line 150, in get_unit_capacity
    unit_data_file_path = self.download_unit_data()
                          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/Projects/cm_debug_test/ghgrp/ghgrp_fac_unit.py", line 119, in download_unit_data
    with zipfile.ZipFile(BytesIO(r.content)) as zf:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dthierry/mambaforge/envs/found/lib/python3.11/zipfile.py", line 1302, in __init__
    self._RealGetContents()
  File "/Users/dthierry/mambaforge/envs/found/lib/python3.11/zipfile.py", line 1369, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
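
The `BadZipFile` error means the bytes handed to `zipfile.ZipFile` were not a zip archive. This commonly happens when a download returns an HTML error page instead of the file, although that is not confirmed here. A minimal sketch of checking the payload before unzipping follows; `extract_zip_bytes` is a hypothetical helper, not a function from `ghgrp_fac_unit.py`.

```python
import zipfile
from io import BytesIO
from pathlib import Path


def extract_zip_bytes(content: bytes, dest_dir: str) -> list:
    """Extract a downloaded zip payload, failing early with a clearer message."""
    buffer = BytesIO(content)
    if not zipfile.is_zipfile(buffer):
        # Show the first bytes; an HTML error page typically starts with "<".
        raise ValueError(
            f"Download is not a zip archive; starts with {content[:40]!r}"
        )
    buffer.seek(0)
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(buffer) as zf:
        zf.extractall(dest_dir)
        return zf.namelist()
```

In the script this would be called with `r.content`, so a bad response surfaces with its leading bytes rather than a bare `BadZipFile`.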

dthierry commented 11 months ago

Hi Colin,

I was using 2.1.4, so I tried downgrading my pandas version. The earliest version I could install on my operating system is 1.1.3, alongside Python 3.9.18. After doing that, I get the following message:

INFO:root:Getting NEI data...
INFO:root:Reading NEI data from csv
Traceback (most recent call last):
  File "/Users/dthierry/Projects/cm_debug_test/./nei/nei_EF_calculations.py", line 1339, in <module>
    nei_char = NEI().main()
  File "/Users/dthierry/Projects/cm_debug_test/./nei/nei_EF_calculations.py", line 1302, in main
    nei_data = nei.load_nei_data()
  File "/Users/dthierry/Projects/cm_debug_test/./nei/nei_EF_calculations.py", line 365, in load_nei_data
    nei_data = pd.read_csv(self._nei_data_path, low_memory=False,
  File "/Users/dthierry/mambaforge/envs/f2/lib/python3.9/site-packages/pandas/io/parsers.py", line 686, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/Users/dthierry/mambaforge/envs/f2/lib/python3.9/site-packages/pandas/io/parsers.py", line 458, in _read
    data = parser.read(nrows)
  File "/Users/dthierry/mambaforge/envs/f2/lib/python3.9/site-packages/pandas/io/parsers.py", line 1196, in read
    ret = self._engine.read(nrows)
  File "/Users/dthierry/mambaforge/envs/f2/lib/python3.9/site-packages/pandas/io/parsers.py", line 2231, in read
    index, names = self._make_index(data, alldata, names)
  File "/Users/dthierry/mambaforge/envs/f2/lib/python3.9/site-packages/pandas/io/parsers.py", line 1677, in _make_index
    index = self._agg_index(index)
  File "/Users/dthierry/mambaforge/envs/f2/lib/python3.9/site-packages/pandas/io/parsers.py", line 1770, in _agg_index
    arr, _ = self._infer_types(arr, col_na_values | col_na_fvalues)
  File "/Users/dthierry/mambaforge/envs/f2/lib/python3.9/site-packages/pandas/io/parsers.py", line 1871, in _infer_types
    mask = algorithms.isin(values, list(na_values))
  File "/Users/dthierry/mambaforge/envs/f2/lib/python3.9/site-packages/pandas/core/algorithms.py", line 443, in isin
    if np.isnan(values).any():
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

calmc commented 11 months ago

@dthierry I've made a lot more changes, including updating the environment to match those earliest versions of python and pandas that work on your machine. So, you'll need to create a new environment using fied_environment.yml. When you have the chance, could you please pull these updates and try running the code within the new fied environment? Everything is now working locally for me (code should also run faster now that it's not using an API to find census blocks).

dthierry commented 10 months ago

Hi Colin, I've tried following the instructions for the yaml file, and it seems that most of the packages are missing. For reference, I am running this with conda 23.11.0 on an M1 macOS computer, so I have used slightly different versions of some of your main dependencies. These are the versions I have:

python                    3.9.18               hb885b13_0  
numpy                     1.26.2           py39h3b2db8e_0  
pandas                    1.3.1            py39h9197a36_0  
geopandas                 0.13.2             pyhd8ed1ab_1    conda-forge
openpyxl                  3.0.10           py39h1a28f6b_0  

Regarding the data set-up stage, I got stuck running the NEI calculation scripts. After downloading and unzipping as instructed, running nei_EF_calculations.py gives me the following output:

INFO:root:Getting NEI data...
INFO:root:Reading NEI data from zipfiles
INFO:root:Downloading WebFire data; writing webfirefactors.csv
Traceback (most recent call last):
  File "/Users/dthierry/miniconda3/envs/fied/lib/python3.9/site-packages/urllib3/connectionpool.py", line 715, in urlopen
    httplib_response = self._make_request(
  File "/Users/dthierry/miniconda3/envs/fied/lib/python3.9/site-packages/urllib3/connectionpool.py", line 404, in _make_request
    self._validate_conn(conn)
  File "/Users/dthierry/miniconda3/envs/fied/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1058, in _validate_conn
    conn.connect()
  File "/Users/dthierry/miniconda3/envs/fied/lib/python3.9/site-packages/urllib3/connection.py", line 419, in connect
    self.sock = ssl_wrap_socket(
  File "/Users/dthierry/miniconda3/envs/fied/lib/python3.9/site-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(
  File "/Users/dthierry/miniconda3/envs/fied/lib/python3.9/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
  File "/Users/dthierry/miniconda3/envs/fied/lib/python3.9/ssl.py", line 501, in wrap_socket
    return self.sslsocket_class._create(
  File "/Users/dthierry/miniconda3/envs/fied/lib/python3.9/ssl.py", line 1074, in _create
    self.do_handshake()
  File "/Users/dthierry/miniconda3/envs/fied/lib/python3.9/ssl.py", line 1343, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: UNSAFE_LEGACY_RENEGOTIATION_DISABLED] unsafe legacy renegotiation disabled (_ssl.c:1129)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/dthierry/miniconda3/envs/fied/lib/python3.9/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/Users/dthierry/miniconda3/envs/fied/lib/python3.9/site-packages/urllib3/connectionpool.py", line 799, in urlopen
    retries = retries.increment(
  File "/Users/dthierry/miniconda3/envs/fied/lib/python3.9/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='cfpub.epa.gov', port=443): Max retries exceeded with url: /webfire/download/webfirefactors.zip (Caused by SSLError(SSLError(1, '[SSL: UNSAFE_LEGACY_RENEGOTIATION_DISABLED] unsafe legacy renegotiation disabled (_ssl.c:1129)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/dthierry/Projects/foundational-industry-energy-data/nei/nei_EF_calculations.py", line 1335, in <module>
    nei_char = NEI().main()
  File "/Users/dthierry/Projects/foundational-industry-energy-data/nei/nei_EF_calculations.py", line 1301, in main
    webfr = nei.load_webfires()
  File "/Users/dthierry/Projects/foundational-industry-energy-data/nei/nei_EF_calculations.py", line 590, in load_webfires
    r = requests.get(
  File "/Users/dthierry/miniconda3/envs/fied/lib/python3.9/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "/Users/dthierry/miniconda3/envs/fied/lib/python3.9/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/Users/dthierry/miniconda3/envs/fied/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/dthierry/miniconda3/envs/fied/lib/python3.9/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/Users/dthierry/miniconda3/envs/fied/lib/python3.9/site-packages/requests/adapters.py", line 517, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='cfpub.epa.gov', port=443): Max retries exceeded with url: /webfire/download/webfirefactors.zip (Caused by SSLError(SSLError(1, '[SSL: UNSAFE_LEGACY_RENEGOTIATION_DISABLED] unsafe legacy renegotiation disabled (_ssl.c:1129)')))
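
The `UNSAFE_LEGACY_RENEGOTIATION_DISABLED` error is raised by OpenSSL 3 when a server does not support secure renegotiation. One commonly used workaround, sketched below under the assumption that this is what the EPA server needs, is to enable the legacy-server-connect option on the SSL context. This relaxes a security check, so it is best limited to this one download. `legacy_ssl_context` and `download` are hypothetical helpers that use only the standard library rather than the `requests` call in `load_webfires`.

```python
import ssl
import urllib.request

# ssl.OP_LEGACY_SERVER_CONNECT exists from Python 3.12; 0x4 is the
# underlying OpenSSL flag value, used here as a fallback for older versions.
LEGACY_SERVER_CONNECT = getattr(ssl, "OP_LEGACY_SERVER_CONNECT", 0x4)


def legacy_ssl_context() -> ssl.SSLContext:
    """Default verified context that also tolerates legacy renegotiation."""
    ctx = ssl.create_default_context()
    ctx.options |= LEGACY_SERVER_CONNECT
    return ctx


def download(url: str, timeout: float = 60.0) -> bytes:
    """Fetch a URL with the relaxed context and return the raw bytes."""
    with urllib.request.urlopen(
        url, context=legacy_ssl_context(), timeout=timeout
    ) as resp:
        return resp.read()
```

Certificate verification and hostname checking stay enabled in this sketch; only the renegotiation check is relaxed.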

Moving on to the GHGRP stage of the data, running ghgrp_fac_unit.py gives me the following error output:

Traceback (most recent call last):
  File "/Users/dthierry/Projects/foundational-industry-energy-data/ghgrp/ghgrp_fac_unit.py", line 440, in <module>
    ghgrp_df = GHGRP_unit_char(ghgrp_energy_file, reporting_year).main()
  File "/Users/dthierry/Projects/foundational-industry-energy-data/ghgrp/ghgrp_fac_unit.py", line 50, in __init__
    self._data_schema = import_data_schema(self._data_source)
  File "/Users/dthierry/Projects/foundational-industry-energy-data/ghgrp/ghgrp_fac_unit.py", line 44, in import_data_schema
    with open('./nei/extracted_data_schema.json') as file:
FileNotFoundError: [Errno 2] No such file or directory: './nei/extracted_data_schema.json'
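
The path `./nei/extracted_data_schema.json` is resolved against the current working directory, so one possible cause of this error is launching the script from somewhere other than the repository root. A minimal sketch of anchoring the path to the script's own location instead follows; `repo_path` is a hypothetical helper, and the layout (script one folder below the root) is assumed from the traceback paths.

```python
from pathlib import Path


def repo_path(script_file: str, *parts: str) -> Path:
    """Build a path under the repo root, assuming the script sits one level below it."""
    repo_root = Path(script_file).resolve().parent.parent
    return repo_root.joinpath(*parts)


# Inside ghgrp/ghgrp_fac_unit.py this would be called with __file__, e.g.
#   schema_path = repo_path(__file__, "nei", "extracted_data_schema.json")
schema_path = repo_path(
    "/tmp/repo/ghgrp/ghgrp_fac_unit.py", "nei", "extracted_data_schema.json"
)
```

If the file is simply absent from the checkout, this change would not help; it only removes the dependence on the working directory.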

The compilation stage fails as well, but I think that is because the previous steps weren't successful. Please let me know what else must be done.