fjuniorr / flowmapper-ci

Bot for running flowmapper

Map ecoinvent 3.9 to 3.10 #13

Closed fjuniorr closed 10 months ago

fjuniorr commented 10 months ago

Some basic statistics:

flowmapper map --fields config/ecoinvent-ecoinvent.py --output-dir mappings data/ecoinvent-3.6-biosphere.json data/ecoinvent-3.7-biosphere.json
# 4318 source flows...
# 4329 target flows...
# 5242 mappings (99.95% of total).
# Mappings cardinalities: {'1:1': 4055, 'N:M': 1170, 'N:1': 8, '1:N': 9}

flowmapper map --fields config/ecoinvent-ecoinvent.py --output-dir mappings data/ecoinvent-3.7-biosphere.json data/ecoinvent-3.8-biosphere.json
# 4329 source flows...
# 4421 target flows...
# 5386 mappings (99.91% of total).
# Mappings cardinalities: {'1:1': 3911, 'N:1': 114, 'N:M': 1300, '1:N': 61}

flowmapper map --fields config/ecoinvent-ecoinvent.py --output-dir mappings data/ecoinvent-3.8-biosphere.json data/ecoinvent-3.9-biosphere.json
# 4421 source flows...
# 4718 target flows...
# 5463 mappings (98.42% of total).
# Mappings cardinalities: {'1:1': 3934, 'N:M': 1370, 'N:1': 106, '1:N': 53}

flowmapper map --fields config/ecoinvent-ecoinvent.py --output-dir mappings data/ecoinvent-3.9-biosphere.json data/ecoinvent-3.10-biosphere.json
# 4718 source flows...
# 4362 target flows...
# 4629 mappings (92.92% of total).
# Mappings cardinalities: {'1:1': 4036, 'N:M': 330, '1:N': 41, 'N:1': 222}
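
In each run above the cardinality counts add up to the number of mappings (for example 4036 + 330 + 41 + 222 = 4629), so every mapping appears to receive exactly one label. A minimal sketch of one way to produce such labels, assuming a mapping is just a `(source_id, target_id)` pair; the function name `cardinalities` and the pair representation are hypothetical and not taken from flowmapper:

```python
from collections import Counter


def cardinalities(pairs):
    """Label each (source_id, target_id) pair as 1:1, 1:N, N:1 or N:M."""
    # How many targets each source maps to, and how many sources hit each target.
    targets_per_source = Counter(source for source, _ in pairs)
    sources_per_target = Counter(target for _, target in pairs)

    labels = Counter()
    for source, target in pairs:
        many_targets = targets_per_source[source] > 1
        many_sources = sources_per_target[target] > 1
        if many_targets and many_sources:
            labels["N:M"] += 1
        elif many_targets:
            labels["1:N"] += 1
        elif many_sources:
            labels["N:1"] += 1
        else:
            labels["1:1"] += 1
    return dict(labels)


# Illustrative data only: "b" and "c" share a target, "d" has two targets.
pairs = [("a", "x"), ("b", "y"), ("c", "y"), ("d", "z"), ("d", "w")]
print(cardinalities(pairs))
```

Under this reading, a drop in `N:M` between runs means fewer mappings where both the source and the target are shared with other mappings.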

Looking only at the 3.9 -> 3.10 mapping, 428 flows were deleted and 72 were added (by uuid). Of the remaining 4290 flows, only 418 had no change whatsoever in their properties.
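
A comparison like this can be sketched with set operations on the identifiers. The sketch assumes each flow is a plain dict with a `uuid` key, which may not match the field names in the actual ecoinvent JSON files; `diff_by_uuid` is a hypothetical helper, not part of flowmapper:

```python
def diff_by_uuid(source_flows, target_flows):
    """Count deleted, added, remaining and unchanged flows between two lists."""
    # Assumption: every flow is a dict carrying its identifier under "uuid".
    source = {flow["uuid"]: flow for flow in source_flows}
    target = {flow["uuid"]: flow for flow in target_flows}

    deleted = source.keys() - target.keys()
    added = target.keys() - source.keys()
    remaining = source.keys() & target.keys()
    # "Unchanged" here means every property compares equal.
    unchanged = {uuid for uuid in remaining if source[uuid] == target[uuid]}

    return {
        "deleted": len(deleted),
        "added": len(added),
        "remaining": len(remaining),
        "unchanged": len(unchanged),
    }


# Illustrative data only.
old = [
    {"uuid": "1", "name": "A"},
    {"uuid": "2", "name": "B"},
    {"uuid": "3", "name": "C"},
]
new = [
    {"uuid": "2", "name": "B"},
    {"uuid": "3", "name": "C, renamed"},
    {"uuid": "4", "name": "D"},
]
print(diff_by_uuid(old, new))
```

The numbers reported above are consistent with this kind of accounting: 4718 - 428 = 4290 remaining, and 4290 + 72 = 4362 target flows.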

fjuniorr commented 10 months ago

Mapping becomes straightforward if the system has a unique ID, even when properties change. This is under the assumption that any significant changes in properties would also change the flow's unique ID. If that's not the case, which is quite likely, we're looking at a different, but related, problem.

[...]

I'm not sure if the best approach is to save a state of certified mappings across specific systems and versions, or if we should assume the need to update the match rules to accommodate changes in the flow lists.

I don't see any way around this. The lists per system change, and we can't rely on things like uuid to be stable or map 1-to-1. For example, the flow Copper, 0.52% in sulfide, Cu 0.27% and Mo 8.2E-3% in crude ore, and all copper resource flows are now just copper in version 3.10 (with a different uuid).

I'm quoting this from https://github.com/fjuniorr/flowmapper/issues/1#issuecomment-1855189675 because it's relevant to the approach that I took.

To achieve 100% matching between ecoinvent 3.9 and 3.10, I've created two new match rules:

Both rules match without checking whether the flow contexts are also equal, and both can be problematic if the value provided as a UUID in one of the flow lists is not actually a UUID.
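
One way to reduce the second risk is to compare identifiers only when both parse as UUIDs. This is a minimal sketch using the standard library `uuid` module; `parse_uuid` and `same_uuid` are hypothetical helpers and do not describe how the flowmapper rules are implemented:

```python
import uuid


def parse_uuid(value):
    """Return a uuid.UUID for value, or None if it is not a well-formed UUID."""
    try:
        return uuid.UUID(str(value))
    except ValueError:
        return None


def same_uuid(source_id, target_id):
    """True only when both identifiers are valid UUIDs and are equal."""
    source = parse_uuid(source_id)
    target = parse_uuid(target_id)
    return source is not None and source == target


print(same_uuid("91861063-1826-4860-9957-7c5bde5817a6",
                "91861063-1826-4860-9957-7C5BDE5817A6"))
print(same_uuid("n/a", "n/a"))
```

Note that `uuid.UUID` is lenient about formatting (it accepts upper case and strings without hyphens), so placeholder strings are rejected while differently formatted copies of the same UUID still compare equal.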

It was also an opportunity to use https://github.com/fjuniorr/flowmapper/issues/50 so that the statistics print 100%. The Python code is below (you can't do this from the CLI):

from flowmapper.utils import read_field_mapping, read_flowlist
from flowmapper.flowmap import Flowmap
from flowmapper.flow import Flow

fields = read_field_mapping('config/ecoinvent-ecoinvent.py')
source_flows = [Flow.from_dict(flow, fields['source']) for flow in read_flowlist('data/ecoinvent-3.9-biosphere.json')]
target_flows = [Flow.from_dict(flow, fields['target']) for flow in read_flowlist('data/ecoinvent-3.10-biosphere.json')]

nomatch = lambda flow: flow.uuid == "91861063-1826-4860-9957-7c5bde5817a6" # There is no salt water flow in ecoinvent

flowmap = Flowmap(source_flows, target_flows, nomatch_rules=[nomatch])

flowmap.mappings
flowmap.statistics()
# 4717 source flows (1 excluded)...
# 4362 target flows...
# 4971 mappings (100.00% of total).
# Mappings cardinalities: {'1:1': 3760, 'N:M': 342, '1:N': 42, 'N:1': 827}

cmutel commented 10 months ago

@tngtudor Check this shit out! I feel like we are on the verge of a big leap in functionality.

fjuniorr commented 10 months ago

After https://github.com/fjuniorr/flowmapper/pull/70 the Python snippet is:

from flowmapper.utils import read_field_mapping, read_flowlist, read_migration_files
from flowmapper.flowmap import Flowmap
from flowmapper.flow import Flow

fields = read_field_mapping('config/ecoinvent-ecoinvent.py')
transformations = read_migration_files('config/ei3.9-ei3.10.json')
source_flows = [Flow(flow, fields['source'], transformations) for flow in read_flowlist('data/ecoinvent-3.9-biosphere.json')]
target_flows = [Flow(flow, fields['target']) for flow in read_flowlist('data/ecoinvent-3.10-biosphere.json')]

nomatch = lambda flow: flow.uuid_raw_value == "91861063-1826-4860-9957-7c5bde5817a6" # There is no salt water flow in ecoinvent

flowmap = Flowmap(source_flows, target_flows, nomatch_rules=[nomatch])

flowmap.mappings
flowmap.statistics()
# 4717 source flows (1 excluded)...
# 4362 target flows...
# 4717 mappings (100.00% of total).
# Mappings cardinalities: {'1:1': 3927, 'N:1': 790}

To run this we need config/ei3.9-ei3.10.json.
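
For readers unfamiliar with the `nomatch_rules` argument used in the snippets above: each rule is a predicate on a flow, and flows for which a rule returns true are left out of the source count (hence "1 excluded"). The sketch below illustrates that idea with plain dicts; `split_excluded` and the dict layout are hypothetical, and this is not flowmapper's implementation:

```python
def split_excluded(flows, nomatch_rules):
    """Separate flows into (kept, excluded) using a list of predicates."""
    kept, excluded = [], []
    for flow in flows:
        # A flow is excluded as soon as any rule returns True for it.
        if any(rule(flow) for rule in nomatch_rules):
            excluded.append(flow)
        else:
            kept.append(flow)
    return kept, excluded


# Assumption: flows are dicts with a "uuid" key; names are illustrative only.
flows = [
    {"uuid": "91861063-1826-4860-9957-7c5bde5817a6", "name": "example flow 1"},
    {"uuid": "00000000-0000-0000-0000-000000000001", "name": "example flow 2"},
]


def no_salt_water(flow):
    return flow["uuid"] == "91861063-1826-4860-9957-7c5bde5817a6"


kept, excluded = split_excluded(flows, [no_salt_water])
print(f"{len(kept)} source flows ({len(excluded)} excluded)...")
```

Keeping the exclusions as explicit predicates means the percentage is computed against the flows that are expected to match, which is consistent with 4718 source flows becoming 4717 once the one rule applies.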