Update inventories | Coal

dt-woods commented 3 months ago

The final report is now available (see link below). Update the inventory and metadata in eLCI.

https://netl.doe.gov/energy-analysis/details?id=34eb4fb9-12ef-49b3-977d-895fbf16f437

In the past, the openLCA version of the model was used to generate Monte-Carlo results inventory and median values extracted for eLCI.

IN PROGRESS

[x] Generate new coal mining LCI
[x] Read new coal mining LCI into eLCI (on its own branch)
[x] Generate new coal mining transportation LCI
[ ] Update transportation data
[ ] Generate new coal power plant construction LCI

dt-woods commented 2 months ago

The openLCA model that accompanies the report is also available on NETL's website.

https://www.netl.doe.gov/energy-analysis/details?id=0c0dde04-0d3c-4c7f-bd33-2745e061b8e0

dt-woods commented 2 months ago

The 22 mining scenarios (coal basin - coal type - mine type) were each calculated using the default parameters as designated in their respective product system, with functional unit set to 1 kg of coal processed at mine, and the lazy/on-demand calculation method with the TRACI 2.1 (NETL) AR6 (100 yr) impact assessment method in order to generate the results inventories. Results were exported to Excel (using the Export to Excel button).

The input and output flows were read from the "Inventory" worksheet from each of the 22 Excel workbooks, and were combined into a single CSV.

dt-woods commented 2 months ago

# Purpose: Generate the coal_mining_lci.csv from openLCA results exports.

# Required packages
import glob
import os
import re

import pandas as pd

# Set folder path and find Excel workbooks
working_folder = "coal-results_2024-08"
file_name_list = glob.glob(os.path.join(working_folder, "*_*_*.xlsx"))
if len(file_name_list) == 0:
    print("no files found!")
else:
    print("found %d files" % len(file_name_list))

# Regular expression for product flow types (based on the category name)
p = re.compile("^[\dIC].*")

# List that will store the data frame for each file
dfs = []

# Iterate over each workbook
for file_name in file_name_list:
    # Extract the coal code from the file name
    scenario = os.path.splitext(os.path.basename(file_name))[0]
    coal_code = scenario.replace("_","-")

    # Lists that will store the input and output of each file
    results = []
    code = []
    uuid = []
    cat = []
    subcat = []
    compartment = []
    fname = []
    ftype = []
    unit = []

    # Read Excel sheet for the inputs of the coal inventory
    df_inputs = pd.read_excel(file_name, "Inventory", usecols="B:G", header=2)

    # Extract the flow UUID values from column B and add it to a list
    for index, row in df_inputs.iterrows():
        ID_values = row["Flow UUID"]
        if not pd.isna(ID_values):
            uuid.append(ID_values)

    # Extract the flow names from column C and add it to a list
    for index, row in df_inputs.iterrows():
        ID_values = row["Flow"]
        if not pd.isna(ID_values):
            fname.append(ID_values)

    # Extract the category values from column D and add it to a list
    for index, row in df_inputs.iterrows():
        ID_values = row["Category"]
        if not pd.isna(ID_values):
            cat.append(ID_values)

    # Extract the sub-category values from column E and add it to a list
    for index, row in df_inputs.iterrows():
        ID_values = row["Sub-category"]
        if not pd.isna(ID_values):
            subcat.append(ID_values)

    # Extract the unit values from column F and add it to a list
    for index, row in df_inputs.iterrows():
        ID_values = row["Unit"]
        if not pd.isna(ID_values):
            unit.append(ID_values)

    # Extract the result values from column G and add it to a list
    for index, row in df_inputs.iterrows():
        ID_values = row["Result"]
        if not pd.isna(ID_values):
            results.append(ID_values)

    num_input = len(uuid)

    # Reads excel sheet for the outputs of the coal inventory
    df_outputs = pd.read_excel(file_name, "Inventory", usecols="I:N", header=2)

    # Extract the flow UUID values from column I and add it to the previous flow UUID list
    for index, row in df_outputs.iterrows():
        ID_values = row["Flow UUID.1"]
        if not pd.isna(ID_values):
            uuid.append(ID_values)

    # Extract the flow values from column J and add it to the previous flow list
    for index, row in df_outputs.iterrows():
        ID_values = row["Flow.1"]
        if not pd.isna(ID_values):
            fname.append(ID_values)

    # Extract the category values from column K and add it to the previous category list
    for index, row in df_outputs.iterrows():
        ID_values = row["Category.1"]
        if not pd.isna(ID_values):
            cat.append(ID_values)

    # Extract the sub-category values from column L and add it to the previous list
    for index, row in df_outputs.iterrows():
        ID_values = row["Sub-category.1"]
        if not pd.isna(ID_values):
            subcat.append(ID_values)

    # Extract the unit values from column M and add it to the previous unit list
    for index, row in df_outputs.iterrows():
        ID_values = row["Unit.1"]
        if not pd.isna(ID_values):
            unit.append(ID_values)

    # Extract the result values from column N and add it to the previous result list
    for index, row in df_outputs.iterrows():
        ID_values = row["Result.1"]
        if not pd.isna(ID_values):
            results.append(ID_values)

    # Combine the catergory and sub-category
    for path in zip(cat, subcat):
        cp = "/".join(path)
        compartment.append(cp)

    # Determine flow type from name and category.
    # ELEMENTARY_FLOW
    # - resource/
    # - emission/
    # - ground/
    # - air/
    # - human-dominated/
    # - Elementary Flows/
    ftype = ["ELEMENTARY_FLOW" for i in range(len(cat))]

    # PRODUCT_FLOW ([\dIC].*)
    # - 31-*
    # - 56*
    # - Crude oil *
    # - Intermediate *
    p_idx = [i for i in range(len(cat)) if p.match(cat[i])]
    for _idx in p_idx:
        ftype[_idx] = "PRODUCT_FLOW"

    # WASTE_FLOW
    # - "Waste, solid" in 56: *
    w_idx = [i for i in range(len(fname)) if fname[i] == 'Waste, solid']
    for _idx in w_idx:
        ftype[_idx] = "WASTE_FLOW"

    # Create a new data frame file
    new_df = pd.DataFrame(
        columns=[
            'Results',
            'Coal Code',
            'FlowUUID',
            'Compartment',
            'FlowName',
            'FlowType',
            'Unit',
            'input']
    )
    new_df['Results'] = results
    new_df.loc[:, 'Coal Code'] = coal_code
    new_df['FlowUUID'] = uuid
    new_df['Compartment'] = compartment
    new_df['FlowName'] = fname
    new_df['FlowType'] = ftype
    new_df['Unit'] = unit
    new_df.loc[0:num_input, 'input'] = 'TRUE'
    new_df.loc[num_input:, 'input'] = 'FALSE'

    # Add the new data frame into master list
    dfs.append(new_df)

# Concatenate data frames together and write to CSV
LCI_df = pd.concat(dfs)
LCI_df.to_csv('coal_mining_lci_2024.csv', index=False)

dt-woods commented 1 month ago

The coal basin inventories are publicly available:

https://www.netl.doe.gov/energy-analysis/details?id=0f290eed-5e4b-4b5f-bec0-36b0ea89c6e5

The coal transportation inventories are publicly available:

https://www.netl.doe.gov/energy-analysis/details?id=27ea1ba4-6ea9-4ee5-8b32-d7fce7f4e1e0
- The 'Information' sheet includes transportation distance between coal basins and NERC region
- There are five modes of transportation (conveyor belt, truck, barge, ocean vessel, and train) with input/output flow tables in kg/kg-km.

In the current inventory (Coal_model_transportation_inventory.xlsx), there are two worksheets: "transportation" and "flowmapping." In both worksheets, 41 air emissions are examined. In the former worksheet, alternative flow names are listed as column headers (e.g., PM2.5 for 'Particulate matter, ≤2.5 µm') and the latter includes a mapping align with UUIDs and compartments.

There is a path forward where the 41 flows from the transportation inventory are queried against the new inventory workbook to update their values. Note the units are different (i.e., ton*miles are now kg*km), which will require careful attention.
The eLCI documentation suggests that coal transportation distances, which are based on static 2016 data (from ABB Velocity Suite), are unlikely to change unless a facility begins to receive coal of a different type or from a different type of mine, or from a different basin.

EIA923 Schedules 2–5, Page 5 includes mine name and mine state, which could be used to query its exact location (e.g., https://www.gem.wiki/Caballo_Coal_Mine, or imported from overseas).

dt-woods commented 1 week ago

Comparing the 2020 to the 2023 coal mining inventories.

>>> COAL_MINING_LCI_VINTAGE = 2023
>>> coal_2023 = read_coal_mining()
>>> COAL_MINING_LCI_VINTAGE = 2020
>>> coal_2020 = read_coal_mining()
>>> coal_2020 = coal_2020.drop(columns=['Mean', 'p05', 'p10', 'p2.5', 'p90', 'p95', 'p97.5'])
>>> coal_2023 = coal_2023.rename(columns={'Results': 'Results23'})
>>> coal_2020 = coal_2020.rename(columns={'Results': 'Results20'})
>>> coal_merge = pd.merge(left=coal_2023, right=coal_2020, on=['Coal Code', 'Compartment', 'FlowUUID', 'input'], how='inner')
>>> coal_merge['diffPCT'] = coal_merge['Results20']/coal_merge["Results23"]
>>> coal_merge.describe() 
          Results23     Results20          diff       diffPCT
count  4.624400e+04  4.624400e+04  4.624400e+04  46244.000000
mean   1.507712e+08  8.357991e+07  6.719128e+07     16.176887
std    1.327405e+10  5.612758e+09  8.976665e+09     85.728028
min   -5.032724e-08 -5.040000e-08 -3.180356e+07    -54.396335
25%    4.936494e-16  4.730000e-16  1.127417e-19      0.972069
50%    2.507873e-14  2.720000e-14  2.809026e-17      0.993597
75%    8.170908e-12  1.070000e-11  7.172098e-15      0.997570
max    1.891626e+12  5.950000e+11  1.355626e+12   1436.650658
>>> coal_merge.query("diffPCT < -5")[['Results20', 'Results23', 'diffPCT', 'FlowName_y', 'Compartment']]
          Results20     Results23    diffPCT FlowName_y      Compartment
6404   4.590000e-11 -8.438068e-13 -54.396335    Silicon  resource/ground
16914  9.750000e-13 -1.521012e-13  -6.410204    Silicon  resource/ground
23220  5.330000e-12 -9.804868e-14 -54.360754    Silicon  resource/ground
25322  9.750000e-13 -1.521012e-13  -6.410204    Silicon  resource/ground
27424  5.330000e-12 -9.804868e-14 -54.360754    Silicon  resource/ground
31628  9.750000e-13 -1.521012e-13  -6.410204    Silicon  resource/ground
37934  8.390000e-12 -1.308983e-12  -6.409555    Silicon  resource/ground
40036  4.590000e-11 -8.438068e-13 -54.396335    Silicon  resource/ground
>>> coal_merge.query("diffPCT > 900")[['Results20', 'Results23', 'diffPCT', 'FlowName_y', 'Compartment']]
          Results20     Results23      diffPCT                                  FlowName_y     Compartment
2458   1.890000e-14  1.697044e-17  1113.701002       ACETALDEHYDE, TRICHLORO- (OR) CHLORAL  emission/waste
3369   2.370000e-16  2.128037e-19  1113.702677                                    M-CRESOL  emission/waste
3450   7.710000e-14  6.919674e-17  1114.214411  METHANE, DICHLORO- (OR) METHYLENE CHLORIDE  emission/waste
3722   1.130000e-15  1.012406e-18  1116.153325                                      PHENOL  emission/waste
6267   5.651222e+02  3.933609e-01  1436.650658                                       Water  emission/water
12968  2.320000e-15  2.084817e-18  1112.807379       ACETALDEHYDE, TRICHLORO- (OR) CHLORAL  emission/waste
13879  2.910000e-17  2.614291e-20  1113.112635                                    M-CRESOL  emission/waste
13960  9.480000e-15  8.500812e-18  1115.187611  METHANE, DICHLORO- (OR) METHYLENE CHLORIDE  emission/waste
14232  1.380000e-16  1.243739e-19  1109.557225                                      PHENOL  emission/waste

Silicon resource has strangely become a negative resource in the 2023 and the biggest change is water (emission/water), which went from hundreds down to tenths. I believe that water-to-water emissions in 2023 are more refined than in 2020 (i.e., not just lumped under 'Water', but found in 'Water, fresh', 'Water, saline', 'Water, brackish', and 'Water, reclaimed'), which would account for the large difference in flows. Otherwise, differences are not too dramatic between inventories.

dt-woods commented 1 week ago

Having trouble determining why we need this unit conversion:

https://github.com/USEPA/ElectricityLCI/blob/ab4e98dca965870035dba6ac7a04b3fd66be29fd/electricitylci/coal_upstream.py#L761

The coal mining inventory results are based on the functional unit "per short ton of coal, processes, at mine" and the quantity (from EIA coal receipts) is in units of short tons. This equation is then (units/short ton) x (short tons) = (units). So why convert tons to kg?

m-jamieson commented 1 week ago

The inventory results at least in the previous version were per kg of coal, so the conversion is needed to convert the short tons to kg to arrive at kg emission/yr. If the new inventory is per short ton, then this conversion could go away. It's also worth noting that the unit processes that are created end up being kg emission/short ton, which might be the source of confusion.

dt-woods commented 1 week ago

I examined the openLCA model again and found that the product systems that were run to generate the results inventories was set to 1 kg of coal processed at the mine. This matches what you said and solves the mystery of the unit conversion. I'll update the notes accordingly.

USEPA / ElectricityLCI

Update inventories | Coal #243