Closed fjuniorr closed 10 months ago
Mapping becomes straightforward if the system has a unique ID, even when properties change. This is under the assumption that any significant changes in properties would also change the flow's unique ID. If that's not the case, which is quite likely, we're looking at a different, but related, problem.
[...]
I'm not sure if the best approach is to save a state of certified mappings across specific systems and versions, or if we should assume the need to update the match rules to accommodate changes in the flow lists.
I don't see any way around this. The lists per system change, and we can't rely on things like uuid to be stable or map 1-to-1. For example, the flow Copper, 0.52% in sulfide, Cu 0.27% and Mo 8.2E-3% in crude ore, and all copper resource flows are now just copper in version 3.10 (with a different uuid).
I'm quoting this from https://github.com/fjuniorr/flowmapper/issues/1#issuecomment-1855189675 because it's relevant to the approach that I took.
To achieve 100% matching between ecoinvent 3.9 and 3.10, I've created two new match rules:
match_identical_uuid: match identical UUIDs
match_mapped_uuid: match mapped UUIDs that come from flow list providers' releases

Both rules match without comparing if the flow context is also equal, and both can be problematic if in one of the flow lists, the provided UUID is not actually a UUID.
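To make the idea concrete, here is a minimal sketch of what two such rules could look like. It is not flowmapper's actual implementation: flows are plain dicts, `UUID_MIGRATION` is a hypothetical lookup table standing in for a provider's release mapping, and `is_valid_uuid` is an illustrative guard for the "not actually a UUID" case. The UUIDs are made up.

```python
import uuid


def is_valid_uuid(value):
    """Return True if value parses as a UUID (guards against free-text IDs)."""
    try:
        uuid.UUID(str(value))
        return True
    except ValueError:
        return False


def match_identical_uuid(source, target):
    """Match when both flows carry the same, well-formed UUID.

    The flow context is deliberately not compared, as in the rule described above.
    """
    return is_valid_uuid(source["uuid"]) and source["uuid"] == target["uuid"]


def match_mapped_uuid(source, target, migration):
    """Match when the migration table maps the source UUID to the target UUID."""
    return (
        is_valid_uuid(source["uuid"])
        and migration.get(source["uuid"]) == target["uuid"]
    )


# Hypothetical data for illustration only; these are not real ecoinvent UUIDs.
UUID_MIGRATION = {
    "11111111-1111-1111-1111-111111111111": "22222222-2222-2222-2222-222222222222",
}
old_flow = {"uuid": "11111111-1111-1111-1111-111111111111", "name": "Copper, in ore"}
new_flow = {"uuid": "22222222-2222-2222-2222-222222222222", "name": "Copper"}

identical = match_identical_uuid(old_flow, new_flow)            # UUID changed
mapped = match_mapped_uuid(old_flow, new_flow, UUID_MIGRATION)  # found via table
bogus = match_identical_uuid({"uuid": "n/a"}, {"uuid": "n/a"})  # equal but not UUIDs
```

Without the validity guard, two lists that both use a placeholder such as "n/a" in the UUID field would match every placeholder flow against every other one, which is the failure mode mentioned above.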
It was also an opportunity to use https://github.com/fjuniorr/flowmapper/issues/50 so that the statistics print 100%. The Python code is (you can't do this from the CLI):
from flowmapper.utils import read_field_mapping, read_flowlist
from flowmapper.flowmap import Flowmap
from flowmapper.flow import Flow
fields = read_field_mapping('config/ecoinvent-ecoinvent.py')
source_flows = [Flow.from_dict(flow, fields['source']) for flow in read_flowlist('data/ecoinvent-3.9-biosphere.json')]
target_flows = [Flow.from_dict(flow, fields['target']) for flow in read_flowlist('data/ecoinvent-3.10-biosphere.json')]
nomatch = lambda flow: flow.uuid == "91861063-1826-4860-9957-7c5bde5817a6" # There is no salt water flow in ecoinvent
flowmap = Flowmap(source_flows, target_flows, nomatch_rules=[nomatch])
flowmap.mappings
flowmap.statistics()
4717 source flows (1 excluded)...
4362 target flows...
4971 mappings (100.00% of total).
Mappings cardinalities: {'1:1': 3760, 'N:M': 342, '1:N': 42, 'N:1': 827}
@tngtudor Check this shit out! I feel like we are on the verge of a big leap in functionality.
After https://github.com/fjuniorr/flowmapper/pull/70 the Python snippet is:
from flowmapper.utils import read_field_mapping, read_flowlist, read_migration_files
from flowmapper.flowmap import Flowmap
from flowmapper.flow import Flow
fields = read_field_mapping('config/ecoinvent-ecoinvent.py')
transformations = read_migration_files('config/ei3.9-ei3.10.json')
source_flows = [Flow(flow, fields['source'], transformations) for flow in read_flowlist('data/ecoinvent-3.9-biosphere.json')]
target_flows = [Flow(flow, fields['target']) for flow in read_flowlist('data/ecoinvent-3.10-biosphere.json')]
nomatch = lambda flow: flow.uuid_raw_value == "91861063-1826-4860-9957-7c5bde5817a6" # There is no salt water flow in ecoinvent
flowmap = Flowmap(source_flows, target_flows, nomatch_rules=[nomatch])
flowmap.mappings
flowmap.statistics()
4717 source flows (1 excluded)...
4362 target flows...
4717 mappings (100.00% of total).
Mappings cardinalities: {'1:1': 3927, 'N:1': 790}
To run this we need config/ei3.9-ei3.10.json.
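As a side note on reading the cardinality line in the two outputs: in both runs the counts add up to the number of mappings (3760 + 342 + 42 + 827 = 4971 and 3927 + 790 = 4717), which suggests each mapping is classified individually. The sketch below shows one way such a classification could be computed. It is an assumption about the approach, not flowmapper's code, and `cardinalities` and the sample pairs are hypothetical.

```python
from collections import Counter


def cardinalities(pairs):
    """Classify each (source_id, target_id) pair as 1:1, 1:N, N:1 or N:M."""
    source_counts = Counter(s for s, _ in pairs)
    target_counts = Counter(t for _, t in pairs)
    result = Counter()
    for s, t in pairs:
        many_targets = source_counts[s] > 1  # this source maps to several targets
        many_sources = target_counts[t] > 1  # this target receives several sources
        if many_targets and many_sources:
            result["N:M"] += 1
        elif many_targets:
            result["1:N"] += 1
        elif many_sources:
            result["N:1"] += 1
        else:
            result["1:1"] += 1
    return dict(result)


# Hypothetical mapping pairs: b and c collapse into y, d splits into z and w.
pairs = [("a", "x"), ("b", "y"), ("c", "y"), ("d", "z"), ("d", "w")]
summary = cardinalities(pairs)
```

Under this reading, the second run having only 1:1 and N:1 entries means every source flow ends up with exactly one target, which fits the earlier example of several copper resource flows collapsing into a single copper flow.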
Some basic statistics:
Looking only at the 3.9 -> 3.10 map, 428 flows were deleted and 72 were added (by uuid). Of the remaining 4290, only 418 had no change whatsoever in their properties.
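These numbers are internally consistent: 4290 retained plus 72 added gives the 4362 target flows reported above. A comparison like this can be sketched with set operations on the UUIDs. The snippet below is an illustration under assumptions: flows are plain dicts keyed by a `uuid` field, `compare_by_uuid` is a hypothetical helper, and the sample data is invented.

```python
def compare_by_uuid(old_flows, new_flows):
    """Count deleted, added, retained and fully unchanged flows, keyed by UUID."""
    old = {flow["uuid"]: flow for flow in old_flows}
    new = {flow["uuid"]: flow for flow in new_flows}
    deleted = old.keys() - new.keys()  # present only in the old list
    added = new.keys() - old.keys()    # present only in the new list
    kept = old.keys() & new.keys()     # same UUID in both lists
    # A retained flow is unchanged only if every property is identical.
    unchanged = {u for u in kept if old[u] == new[u]}
    return {
        "deleted": len(deleted),
        "added": len(added),
        "kept": len(kept),
        "unchanged": len(unchanged),
    }


# Hypothetical flow lists for illustration only.
old_list = [
    {"uuid": "u1", "name": "Water", "unit": "m3"},
    {"uuid": "u2", "name": "Copper, in ore", "unit": "kg"},
    {"uuid": "u3", "name": "Methane", "unit": "kg"},
]
new_list = [
    {"uuid": "u1", "name": "Water", "unit": "m3"},             # untouched
    {"uuid": "u3", "name": "Methane, fossil", "unit": "kg"},   # renamed
    {"uuid": "u4", "name": "Copper", "unit": "kg"},            # new UUID
]
stats = compare_by_uuid(old_list, new_list)
```

The gap between "kept" and "unchanged" is the interesting part: a stable UUID does not guarantee stable properties, which is why matching on UUID alone needs care.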