brightway-lca / bw_hybrid

Hybrid (Input-Output/Process-based) Life-Cycle Assessment
https://docs.brightway.dev/projects/hybrid/
BSD 3-Clause "New" or "Revised" License

`pylcaio.DatabaseLoader.combine_ecoinvent_exiobase` #9

Closed michaelweinold closed 5 months ago

michaelweinold commented 2 years ago

Documentation:

Refactoring:

michaelweinold commented 2 years ago

Pending discussion with @cmutel on best way to load ecoinvent data into the new hybridization module (considering also the Brightway strategic development plan).

cmutel commented 2 years ago

ecoinvent

Load ecoinvent from ecospold2 files using bw2io.SingleOutputEcospold2Importer. Needs bw2io.bw2setup to be run first.
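A minimal sketch of that import step, for reference. The function name `import_ecoinvent`, the project name `hybrid-demo` and the `datasets_dir` argument are placeholders, not part of the module; the Brightway calls follow the usual bw2io import pattern. The imports sit inside the function so that the snippet loads even without Brightway installed.

```python
def import_ecoinvent(datasets_dir, db_name="ecoinvent 3.8 cutoff", project="hybrid-demo"):
    """Import ecoinvent ecospold2 files into a Brightway project.

    `datasets_dir` is assumed to point at the folder of .spold files
    from an ecoinvent release; all default names here are placeholders.
    """
    # Imported lazily so this module can be loaded without Brightway
    import bw2data as bd
    import bw2io as bi

    bd.projects.set_current(project)

    # Must run before the import: sets up the biosphere database
    # that the ecoinvent exchanges are linked against
    bi.bw2setup()

    # Skip the (slow) import if the database already exists
    if db_name in bd.databases:
        return bd.Database(db_name)

    importer = bi.SingleOutputEcospold2Importer(datasets_dir, db_name)
    importer.apply_strategies()
    importer.statistics()  # reports unlinked exchanges, if any
    importer.write_database()
    return bd.Database(db_name)
```

Calling `import_ecoinvent("<path to ecoinvent datasets>")` would then return the database object used in the snippets below.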

Matrices

import bw2data as bd
import matrix_utils as mu

db = bd.Database('ecoinvent 3.8 cutoff')
mapped_technosphere_matrix = mu.MappedMatrix(packages=[db.datapackage()], matrix="technosphere_matrix")
raw_technosphere_matrix = mapped_technosphere_matrix.matrix

mapped_biosphere_matrix = mu.MappedMatrix(packages=[db.datapackage()], matrix="biosphere_matrix")

Flow metadata

import bw2data as bd

df = bd.Database('biosphere3').nodes_to_dataframe()

Activity metadata

The generic nodes_to_dataframe method is missing a few things we need, e.g. prices. There is no price for activities, only for the reference products. Therefore, we can modify this function to get this additional information:

import bw2data as bd
import pandas as pd
from typing import Optional, List

def extended_nodes_to_dataframe(
    database: bd.Database, columns: Optional[List[str]] = None, return_sorted: bool = True
) -> pd.DataFrame:
    """Modified function to get attributes of reference product"""
    def add_price(exc):
        try:
            return exc['properties']['price']['amount']
        except KeyError:
            return None

    activities = [node for node in database]

    for node in activities:
        # Price and production volume live on the reference product exchange
        rp = node.rp_exchange()
        node['price'] = add_price(rp)
        node['production volume'] = rp.get("production volume")

    # Code not modified below this line
    # ---------------------------------
    if columns is None:
        # Feels like magic
        df = pd.DataFrame(activities)
    else:
        df = pd.DataFrame(
            [{field: obj.get(field) for field in columns} for obj in activities]
        )
    if return_sorted:
        sort_columns = ["name", "reference product", "location", "unit"]
        df = df.sort_values(
            by=[column for column in sort_columns if column in df.columns]
        )
    return df

df = extended_nodes_to_dataframe(bd.Database("<ecoinvent label>"))
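
The price lookup used above can be checked without a Brightway project. The sketch below runs the same nested lookup on plain dictionaries shaped like a reference product exchange. The exchange contents (names, amount, unit) are made up for illustration, and `get_price` is a hypothetical standalone variant of `add_price` that also tolerates `properties` being `None`.

```python
def get_price(exchange):
    """Return the price amount of an exchange, or None if it has no price."""
    try:
        return exchange["properties"]["price"]["amount"]
    except (KeyError, TypeError):
        # KeyError: "properties", "price" or "amount" is missing
        # TypeError: "properties" or "price" is None
        return None

# Illustrative exchanges; the values are invented, not taken from ecoinvent
with_price = {
    "name": "electricity, high voltage",
    "properties": {"price": {"amount": 0.0977, "unit": "EUR2005"}},
}
without_price = {"name": "heat", "properties": {}}
no_properties = {"name": "waste"}

prices = [get_price(exc) for exc in (with_price, without_price, no_properties)]
print(prices)  # [0.0977, None, None]
```

Activities whose reference product has no price end up with `None` in the `price` column, so they stay in the dataframe and can be filtered out later.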