Error in reduced schedule CSV file

dt-woods commented 1 month ago

In the read_eia923_fuel_receipts method in coal_upstream.py, Page 5 of the EIA923 Excel workbook is saved to CSV. The column headers include a newline character ("\n") for the following coal mine columns:

Coalmine\nType
Coalmine\nState
Coalmine\nCounty
Coalmine\nMsha Id

When written to CSV, these headers are all truncated to "Coalmine", dropping the text after the newline. The resulting CSV file has four columns with the same name, which causes errors when merging.

https://github.com/USEPA/ElectricityLCI/blob/e56268132f7607ead58a33bb5bdd525563a784f5/electricitylci/coal_upstream.py#L90

To fix, consider running the data frame through the _clean_columns method before writing to CSV.
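A minimal sketch of the idea, in case it helps anyone reproduce it. Note that normalize_headers is a hypothetical stand-in written for this comment, not the project's _clean_columns method, whose exact output (for example, casing or underscores) may differ. It only shows that collapsing the embedded newline keeps the four headers distinct:

```python
import re


def normalize_headers(headers):
    """Collapse newlines and runs of whitespace in each header to one space.

    Hypothetical helper for illustration; not the project's _clean_columns.
    """
    return [re.sub(r"\s+", " ", str(h)).strip() for h in headers]


# The four headers as they appear in the Page 5 worksheet.
raw = [
    "Coalmine\nType",
    "Coalmine\nState",
    "Coalmine\nCounty",
    "Coalmine\nMsha Id",
]

cleaned = normalize_headers(raw)
print(cleaned)
# ['Coalmine Type', 'Coalmine State', 'Coalmine County', 'Coalmine Msha Id']

# Every cleaned header is still unique, so no duplicate column names.
assert len(set(cleaned)) == len(cleaned)
```

With a pandas data frame, the same helper could be applied as df.columns = normalize_headers(df.columns) before calling df.to_csv.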

A symptom of this is a KeyError on the key 'fuel_group', raised in generate_upstream_coal_map when it accesses the data frame returned by read_eia923_fuel_receipts.

dt-woods commented 1 month ago

The worksheet:

And the reduced CSV:

dt-woods commented 1 month ago

Note: for this fix to take effect on your machine, you need to delete any existing CSV files in the f923_YEAR folders in your data directory so that they are regenerated.
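A rough sketch of that cleanup, for anyone who prefers to script it. The function names, the dry_run flag, and the assumed layout (f923_YEAR folders directly under the data directory, with the CSV files directly inside them) are assumptions made for illustration. The data directory path is not given here, so pass in whatever path your installation uses:

```python
from pathlib import Path


def find_cached_f923_csvs(data_dir):
    """Return CSV files located directly inside f923_* folders under data_dir.

    Assumes the f923_YEAR folders sit at the top level of data_dir.
    """
    return sorted(Path(data_dir).glob("f923_*/*.csv"))


def delete_cached_f923_csvs(data_dir, dry_run=True):
    """Delete the cached CSV files; with dry_run=True, only report them."""
    targets = find_cached_f923_csvs(data_dir)
    for path in targets:
        if not dry_run:
            path.unlink()
    return targets
```

Calling delete_cached_f923_csvs(your_data_dir) first with the default dry_run=True lists the files that would be removed. Calling it again with dry_run=False deletes them, leaving the Excel workbooks untouched.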

USEPA / ElectricityLCI

Error in reduced schedule CSV file #259