US LCI database linking

aleksandra-kim commented 6 years ago

Original report by Shubhankar Upasani (Bitbucket: Shubh1995, GitHub: Shubh1995).

I have imported the US LCI database downloaded from NREL into bw2 by following the instructions given [here](http://nbviewer.jupyter.org/urls/bitbucket.org/cmutel/brightway2/raw/default/notebooks/IO%20-%20Importing%20the%20US%20LCI%20database.ipynb).

I have several unlinked exchanges. What should be the next step?

If I go ahead with the unlinked exchanges, how do I access the database object?

bw.databases only returns biosphere3 and does not return the US LCI database that I defined earlier.

aleksandra-kim commented 6 years ago

Original comment by Shubhankar Upasani (Bitbucket: Shubh1995, GitHub: Shubh1995).

I have the Excel file that lists all the unlinked exchanges, generated using

#!python

sp.write_excel(only_unlinked=True)

Can I perform a migration by reading from this Excel file? Any suggestions on how to do that?

aleksandra-kim commented 6 years ago

Original comment by Chris Mutel (Bitbucket: cmutel, GitHub: cmutel).

I assume you were linking to the US LCI notebook. As you read at the end: "For every unmatched exchange, there is a reason the computer couldn't match it exactly. The next step is to figure out the problem for each exchange, and then write a migration to fix the input data to match what is expected."

A migration is a set of structured data that changes attributes based on filters. For example, "carbon dioxide" could be changed into "carbon dioxide, fossil" if the associated activity was a combustion activity. You can see examples of migrations here and here.
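
To make the idea concrete, here is a minimal sketch of what migration data can look like and what applying it does to an exchange. The dictionary follows the `fields`/`data` layout used for bw2io migrations as far as I know; the `apply_migration` helper is hypothetical, a simplified stand-in written for illustration and not bw2io's own code, and the flow names are only examples.

```python
# Migration data: the fields to match on, then pairs of
# (values to match, fields to overwrite). Flow names are illustrative.
biosphere_migration_data = {
    "fields": ["name", "categories"],
    "data": [
        (
            ("carbon dioxide", ("air",)),
            {"name": "Carbon dioxide, fossil"},
        ),
    ],
}


def apply_migration(exchanges, migration):
    """Hypothetical helper: overwrite fields on exchanges that match a migration entry."""
    fields = migration["fields"]
    # Build a lookup from the matched values to the replacement fields.
    lookup = {tuple(key): new for key, new in migration["data"]}
    for exc in exchanges:
        key = tuple(exc.get(field) for field in fields)
        if key in lookup:
            exc.update(lookup[key])
    return exchanges


exchanges = [
    {"name": "carbon dioxide", "categories": ("air",), "amount": 1.0},
    {"name": "Methane", "categories": ("air",), "amount": 0.1},
]
migrated = apply_migration(exchanges, biosphere_migration_data)
```

Only the first exchange matches both fields, so only its name is rewritten; the second is left untouched and would still need its own migration entry if it cannot be linked.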

To link the US LCI completely, you will need to write migrations to change the provided values into ones that can be matched exactly by other biosphere flows or activity names, units, and so on. In other words, you will have to find the existing values, find the "correct" values, and write a migration from one to the other. The Excel file can help, but you will also have to search for the "correct" values. The other importing notebooks have specific examples of how to build and use migrations.

Importing with missing biosphere flows just means you will miss out on some (potentially important) biosphere flows; importing with missing links in activities isn't possible unless you delete activities willy-nilly, which is almost certainly not what you want.

aleksandra-kim commented 6 years ago

Original comment by Shubhankar Upasani (Bitbucket: Shubh1995, GitHub: Shubh1995).

Hi Chris,

Thanks for replying. Finding the "correct values" seems too tedious. For each exchange, I need to find an associated process in ecoinvent?

Can I change my default biosphere3 (or biosphere4) to contain US LCI biosphere flows? Would that be possible and make things easier?

aleksandra-kim commented 6 years ago

Original comment by Chris Mutel (Bitbucket: cmutel, GitHub: cmutel).

It is tedious; that's why it hasn't been added to the default library so far. For each value that doesn't fit (could be activity name, but also unit, location, reference product), one needs to find the "correct" value.

You could use the biosphere flows from the US LCI, but then you wouldn't have any LCIA methods, and also would have the reverse problem linking e.g. ecoinvent activities to the US LCI biosphere flows.

So, there is no easy answer. This is the unfortunate state of LCA data exchange, and one of the reasons behind efforts like BONSAI.

aleksandra-kim commented 6 years ago

Original comment by Shubhankar Upasani (Bitbucket: Shubh1995, GitHub: Shubh1995).

Can you elaborate on how to apply

#!python

link_iterable_by_fields

after writing a migration? My question is specifically about linking the US LCI. I haven't imported ecoinvent 2.2, so I guess my code should look like

#!python

import functools
c = functools.partial(link_iterable_by_fields(unlinked = biosphere_flow_migration ,
    other=Database(config.biosphere)),
    kind='biosphere'
)
sp.apply_strategy(c)

where

#!python

biosphere_flow_migration

comes from the following migration definition

#!python

biosphere_flow_migration = Migration("US-LCI-biosphere").write(
    biosphere_migration_data, 
    description="unlinked biosphere exchanges in US LCI "
)

The error looks like this

#!python
TypeError                                 Traceback (most recent call last)
<ipython-input-93-e8a1bd027374> in <module>()
      1 import functools
      2 c = functools.partial(link_iterable_by_fields(unlinked = biosphere_flow_migration ,
----> 3     other=Database(config.biosphere)),
      4     kind='biosphere'
      5 )

~\AppData\Local\conda\conda\envs\tes\lib\site-packages\bw2io\strategies\generic.py in link_iterable_by_fields(unlinked, other, fields, kind, internal, relink)
     61                             "``database`` or ``code`` attributes")
     62 
---> 63     for container in unlinked:
     64         for obj in filter(filter_func, container.get('exchanges', [])):
     65             key = activity_hash(obj, fields)

TypeError: 'NoneType' object is not iterable

Any changes or suggestions? I don't think

#!python

sp.match_database

should work here.

aleksandra-kim commented 6 years ago

Original comment by Chris Mutel (Bitbucket: cmutel, GitHub: cmutel).

You should basically never be running link_iterable_by_fields yourself - it is normally wrapped by another call. In this case, you can call importer_object.migrate('name of migration'). See example, example.
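
For context on the error above: the traceback shows that `unlinked` was `None`, which is consistent with `Migration(...).write(...)` returning nothing, so `biosphere_flow_migration` never held the migration data. A rough sketch of the suggested route follows. The function name `migrate_and_count_unlinked` is hypothetical, `importer` is assumed to be the importer object (`sp` in this thread), and re-running `apply_strategies()` after the migration is an assumption about how the changed fields get picked up for linking.

```python
def migrate_and_count_unlinked(importer, migration_name):
    """Apply a stored migration by name, then count exchanges that are still unlinked.

    Hypothetical helper; `importer` is assumed to behave like a bw2io importer.
    """
    # Pass the migration's name (a string), not the return value of
    # Migration(...).write(...), which the traceback suggests is None.
    importer.migrate(migration_name)
    # Assumption: re-running the importer's strategies retries the linking
    # now that names, units, etc. have been changed by the migration.
    importer.apply_strategies()
    # Linked exchanges carry an "input" key; count the ones that do not.
    return sum(
        1
        for dataset in importer.data
        for exchange in dataset.get("exchanges", [])
        if "input" not in exchange
    )
```

With the migration from the earlier comment already written, this would be called as `migrate_and_count_unlinked(sp, "US-LCI-biosphere")`; a result above zero means more migration entries are still needed.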

brightway-lca / brightway2-io

US LCI database linking #47