USEPA / ElectricityLCI

Creative Commons Zero v1.0 Universal

Manual edits #212

Open dt-woods opened 7 months ago

dt-woods commented 7 months ago

The methods in manual_edits.py seem to be failing. A simple way to reproduce this is to run create_generation_process_df in generation.py:

>>> import logging
>>> import electricitylci.model_config as config
>>> root_logger = logging.getLogger()
>>> root_handler = logging.StreamHandler()
>>> rec_format = (
...     "%(asctime)s.%(msecs)03d:%(levelname)s:%(module)s:%(funcName)s:"
...     "%(message)s")
>>> formatter = logging.Formatter(rec_format, datefmt='%Y-%m-%d %H:%M:%S')
>>> root_handler.setFormatter(formatter)
>>> root_logger.addHandler(root_handler)
>>> root_logger.setLevel("DEBUG")
>>> config.model_specs = config.build_model_class()
Select a model number to use:
    1: ELCI_1
    2: ELCI_2
    3: ELCI_3
1
>>> from electricitylci.generation import create_generation_process_df
>>> df = create_generation_process_df()

The warning messages include the following:

2023-11-29 14:50:31.173:INFO:manual_edits:check_for_edits:Edits found for generation.py.create_generation_process_df
2023-11-29 14:50:31.173:INFO:manual_edits:reassign:Re-assigning using data from yaml
2023-11-29 14:50:31.173:DEBUG:manual_edits:reassign:FuelCategory
2023-11-29 14:50:31.173:DEBUG:manual_edits:reassign:SOLAR
2023-11-29 14:50:31.173:DEBUG:manual_edits:reassign:GAS
2023-11-29 14:50:31.177:DEBUG:manual_edits:reassign:Filters for FacilityID are [56938]
2023-11-29 14:50:31.177:WARNING:manual_edits:reassign:Problem found with manual edit - reassign
2023-11-29 14:50:31.177:WARNING:manual_edits:reassign:'FacilityID'
2023-11-29 14:50:31.177:INFO:manual_edits:check_for_edits:Edits found for generation.py.create_generation_process_df
2023-11-29 14:50:31.177:INFO:manual_edits:reassign:Re-assigning using data from yaml
2023-11-29 14:50:31.177:DEBUG:manual_edits:reassign:FuelCategory
2023-11-29 14:50:31.177:DEBUG:manual_edits:reassign:SOLAR
2023-11-29 14:50:31.177:DEBUG:manual_edits:reassign:GAS
2023-11-29 14:50:31.182:DEBUG:manual_edits:reassign:Filters for FacilityID are [58697]
2023-11-29 14:50:31.182:WARNING:manual_edits:reassign:Problem found with manual edit - reassign
2023-11-29 14:50:31.182:WARNING:manual_edits:reassign:'FacilityID'
2023-11-29 14:50:31.182:INFO:manual_edits:check_for_edits:Edits found for generation.py.create_generation_process_df
2023-11-29 14:50:31.182:INFO:manual_edits:reassign:Re-assigning using data from yaml
2023-11-29 14:50:31.182:DEBUG:manual_edits:reassign:eGRID_ID
2023-11-29 14:50:31.182:DEBUG:manual_edits:reassign:56938
2023-11-29 14:50:31.182:DEBUG:manual_edits:reassign:58697
2023-11-29 14:50:31.182:DEBUG:manual_edits:reassign:Filters for Source are ['NEI']
2023-11-29 14:50:31.184:DEBUG:manual_edits:reassign:Filters for Year are [2016]
2023-11-29 14:50:31.186:DEBUG:manual_edits:reassign:Reassigning 0 rows
2023-11-29 14:50:31.186:INFO:manual_edits:check_for_edits:Edits found for generation.py.create_generation_process_df
2023-11-29 14:50:31.186:INFO:manual_edits:reassign:Re-assigning using data from yaml
2023-11-29 14:50:31.186:DEBUG:manual_edits:reassign:eGRID_ID
2023-11-29 14:50:31.186:DEBUG:manual_edits:reassign:56944
2023-11-29 14:50:31.186:DEBUG:manual_edits:reassign:55077
2023-11-29 14:50:31.186:DEBUG:manual_edits:reassign:Filters for Source are ['NEI', 'eGRID', 'RCRA', 'TRI']
2023-11-29 14:50:31.189:DEBUG:manual_edits:reassign:Filters for Year are [2016, 2015]
2023-11-29 14:50:31.191:DEBUG:manual_edits:reassign:Reassigning 0 rows
2023-11-29 14:50:31.191:INFO:manual_edits:check_for_edits:Edits found for generation.py.create_generation_process_df
2023-11-29 14:50:31.191:INFO:manual_edits:reassign:Re-assigning using data from yaml
2023-11-29 14:50:31.191:DEBUG:manual_edits:reassign:FacilityID
2023-11-29 14:50:31.191:DEBUG:manual_edits:reassign:56938
2023-11-29 14:50:31.191:DEBUG:manual_edits:reassign:58697
2023-11-29 14:50:31.191:WARNING:manual_edits:reassign:Problem found with manual edit - reassign
2023-11-29 14:50:31.191:WARNING:manual_edits:reassign:'FacilityID'
2023-11-29 14:50:31.192:INFO:manual_edits:check_for_edits:Edits found for generation.py.create_generation_process_df
2023-11-29 14:50:31.192:INFO:manual_edits:reassign:Re-assigning using data from yaml
2023-11-29 14:50:31.192:DEBUG:manual_edits:reassign:FacilityID
2023-11-29 14:50:31.192:DEBUG:manual_edits:reassign:56944
2023-11-29 14:50:31.192:DEBUG:manual_edits:reassign:55077
2023-11-29 14:50:31.192:WARNING:manual_edits:reassign:Problem found with manual edit - reassign
2023-11-29 14:50:31.192:WARNING:manual_edits:reassign:'FacilityID'
2023-11-29 14:50:31.192:INFO:manual_edits:check_for_edits:Edits found for generation.py.create_generation_process_df
2023-11-29 14:50:31.192:INFO:manual_edits:remove:Removing using data from yaml
2023-11-29 14:50:31.192:DEBUG:manual_edits:remove:Filters for FacilityID are [60880]
2023-11-29 14:50:31.192:WARNING:manual_edits:remove:Problem found with manual edit - remove
2023-11-29 14:50:31.192:WARNING:manual_edits:remove:'FacilityID'

A quick look at the final_database data frame just before the method call shows that the 'FacilityID' column has already been renamed to 'eGRID_ID' and that the 'Year' column holds strings (object dtype), not integers. As a result, none of the manual edits is actually applied.
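
Both failure modes can be reproduced on a toy data frame. The sketch below is only an illustration: the rows are made up, and it assumes the filters are applied with something like `Series.isin`, which may not be exactly what manual_edits.py does.

```python
import pandas as pd

# Toy stand-in for final_database; column names and dtypes follow the
# description above, the values are made up for illustration.
final_database = pd.DataFrame({
    "eGRID_ID": [56938, 58697, 60880],
    "Year": ["2016", "2016", "2016"],   # strings (object dtype), not ints
    "FuelCategory": ["SOLAR", "SOLAR", "GAS"],
})

# 1) Filtering on the old column name raises KeyError('FacilityID'),
#    which is the message that shows up in the WARNING lines.
try:
    final_database["FacilityID"].isin([56938])
    key_error = None
except KeyError as err:
    key_error = str(err)

# 2) Integer years never match string years, so the filter selects nothing.
int_matches = int(final_database["Year"].isin([2016]).sum())
str_matches = int(final_database["Year"].isin(["2016"]).sum())

print(key_error)     # 'FacilityID'
print(int_matches)   # 0
print(str_matches)   # 3
```

The first case matches the `'FacilityID'` warnings in the log; the second matches the "Reassigning 0 rows" lines for the entries that filter on Year.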

dt-woods commented 7 months ago

Here's a potential fix:

generation.py:
    create_generation_process_df:
        entry_1:
          edit_type: "reassign"
          data_source: "yaml"
          column_to_reassign: "FuelCategory"
          incoming_value: "SOLAR"
          outgoing_value: "GAS"
          filters:
              eGRID_ID:
                - 56938
              Source:
                - "NEI"
              Year:
                - "2016"
        entry_2:
          edit_type: "reassign"
          data_source: "yaml"
          column_to_reassign: "FuelCategory"
          incoming_value: "SOLAR"
          outgoing_value: "GAS"
          filters:
              eGRID_ID:
                - 58697
              Source:
                - "NEI"
                - "eGRID"
                - "RCRA"
                - "TRI"
              Year:
                - "2016"
        entry_3:
          edit_type: "reassign"
          data_source: "yaml"
          column_to_reassign: "eGRID_ID"
          incoming_value: 56938
          outgoing_value: 58697
          filters:
              Source:
                - "NEI"
              Year:
                - "2016"
        entry_4:
          edit_type: "reassign"
          data_source: "yaml"
          column_to_reassign: "eGRID_ID"
          incoming_value: 56944
          outgoing_value: 55077
          filters:
              Source:
                - "NEI"
                - "eGRID"
                - "RCRA"
                - "TRI"
              Year:
                - "2016"
                - "2015"
        entry_5:
          edit_type: "reassign"
          data_source: "yaml"
          column_to_reassign: "eGRID_ID"
          incoming_value: 56938
          outgoing_value: 58697
          filters:
              Source:
                - "NEI"
              Year:
                - "2016"
        entry_6:
          edit_type: "reassign"
          data_source: "yaml"
          column_to_reassign: "eGRID_ID"
          incoming_value: 56944
          outgoing_value: 55077
          filters:
              Source:
                - "NEI"
                - "eGRID"
                - "RCRA"
                - "TRI"
              Year:
                - "2016"
                - "2015"
        entry_7:
          edit_type: "remove"
          data_source: "yaml"
          filters:
            eGRID_ID:
              - 60880
            Year:
              - "2016"
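
For reference, here is a minimal sketch of how one of these entries could be applied to a data frame. `apply_reassign` is a hypothetical helper, not the actual manual_edits.reassign; it assumes that incoming_value is the value currently in the data and outgoing_value is its replacement, and that every filter must match.

```python
import pandas as pd

def apply_reassign(df, entry):
    """Hypothetical helper: apply one 'reassign' entry to a copy of df.

    Assumes incoming_value is the value currently in the data and
    outgoing_value is its replacement; the real manual_edits.reassign
    may differ.
    """
    col = entry["column_to_reassign"]
    # Start with rows that hold the value to be replaced ...
    mask = df[col] == entry["incoming_value"]
    # ... then narrow down with every filter column (all must match).
    for filter_col, allowed in entry.get("filters", {}).items():
        mask &= df[filter_col].isin(allowed)
    out = df.copy()
    out.loc[mask, col] = entry["outgoing_value"]
    return out, int(mask.sum())

# entry_1 from the YAML above, written as a plain dict.
entry_1 = {
    "edit_type": "reassign",
    "column_to_reassign": "FuelCategory",
    "incoming_value": "SOLAR",
    "outgoing_value": "GAS",
    "filters": {"eGRID_ID": [56938], "Source": ["NEI"], "Year": ["2016"]},
}

# Made-up rows for illustration only.
df = pd.DataFrame({
    "eGRID_ID": [56938, 56938, 58697],
    "Source": ["NEI", "eGRID", "NEI"],
    "Year": ["2016", "2016", "2016"],
    "FuelCategory": ["SOLAR", "SOLAR", "SOLAR"],
})

edited, n_rows = apply_reassign(df, entry_1)
print(n_rows)                            # 1
print(edited["FuelCategory"].tolist())   # ['GAS', 'SOLAR', 'SOLAR']
```

Only the first row satisfies all three filters, so only that row is reassigned. Note that the Year filter matches here because both the YAML value and the column are strings.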

dt-woods commented 7 months ago

Even after the correction, only 4/7 manual edits are actually changing anything (see below where 0 rows are edited).

2023-11-29 16:30:10.930:INFO:manual_edits:check_for_edits:Edits found for generation.py.create_generation_process_df
2023-11-29 16:30:10.930:INFO:manual_edits:reassign:Re-assigning using data from yaml
2023-11-29 16:30:10.940:INFO:manual_edits:reassign:Reassigning 25 rows
2023-11-29 16:30:10.940:INFO:manual_edits:check_for_edits:Edits found for generation.py.create_generation_process_df
2023-11-29 16:30:10.940:INFO:manual_edits:reassign:Re-assigning using data from yaml
2023-11-29 16:30:10.950:INFO:manual_edits:reassign:Reassigning 0 rows
2023-11-29 16:30:10.950:INFO:manual_edits:check_for_edits:Edits found for generation.py.create_generation_process_df
2023-11-29 16:30:10.950:INFO:manual_edits:reassign:Re-assigning using data from yaml
2023-11-29 16:30:10.955:INFO:manual_edits:reassign:Reassigning 25 rows
2023-11-29 16:30:10.955:INFO:manual_edits:check_for_edits:Edits found for generation.py.create_generation_process_df
2023-11-29 16:30:10.955:INFO:manual_edits:reassign:Re-assigning using data from yaml
2023-11-29 16:30:10.961:INFO:manual_edits:reassign:Reassigning 15 rows
2023-11-29 16:30:10.961:INFO:manual_edits:check_for_edits:Edits found for generation.py.create_generation_process_df
2023-11-29 16:30:10.961:INFO:manual_edits:reassign:Re-assigning using data from yaml
2023-11-29 16:30:10.966:INFO:manual_edits:reassign:Reassigning 0 rows
2023-11-29 16:30:10.966:INFO:manual_edits:check_for_edits:Edits found for generation.py.create_generation_process_df
2023-11-29 16:30:10.966:INFO:manual_edits:reassign:Re-assigning using data from yaml
2023-11-29 16:30:10.971:INFO:manual_edits:reassign:Reassigning 0 rows
2023-11-29 16:30:10.971:INFO:manual_edits:check_for_edits:Edits found for generation.py.create_generation_process_df
2023-11-29 16:30:10.971:INFO:manual_edits:remove:Removing using data from yaml
2023-11-29 16:30:10.974:INFO:manual_edits:remove:Removing 1 rows

m-jamieson commented 7 months ago

To be fair, the idea, whether or not it was correctly implemented, was that the "Year" field would limit which years these replacements are applied to. If this is happening while generating 2020 results, I would expect the edits not to apply. If this is happening with an existing configuration for 2016, then the cases where nothing changes may reflect that the source data was fixed.
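
One way to tell these cases apart would be to count how many rows each filter matches on its own. The sketch below is only a suggestion: `explain_filters` is a hypothetical diagnostic that does not exist in manual_edits.py, and the data are made up.

```python
import pandas as pd

def explain_filters(df, filters):
    """Hypothetical diagnostic: rows matched by each filter on its own,
    plus rows matched by all filters together."""
    combined = pd.Series(True, index=df.index)
    report = {}
    for col, allowed in filters.items():
        hits = df[col].isin(allowed)
        report[col] = int(hits.sum())
        combined &= hits
    report["all_filters"] = int(combined.sum())
    return report

# Made-up 2020 data: the plant is present, but there are no 2016 rows.
df_2020 = pd.DataFrame({
    "eGRID_ID": [56938, 56938],
    "Source": ["NEI", "eGRID"],
    "Year": ["2020", "2020"],
})

report = explain_filters(
    df_2020,
    {"eGRID_ID": [56938], "Source": ["NEI"], "Year": ["2016"]},
)
print(report)  # {'eGRID_ID': 2, 'Source': 1, 'Year': 0, 'all_filters': 0}
```

A zero for Year alongside non-zero counts for the other filters would point to the year restriction doing its job. If every filter matches but nothing is reassigned, the value to be replaced is probably no longer in the source data.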