fjuniorr / flowmapper

Mappings between elementary flows
MIT License

Mapping from SimaPro flow lists with only base context values generates `1:N`, `N:1` and `N:M` mappings #13

Closed fjuniorr closed 7 months ago

fjuniorr commented 7 months ago

Here's an extract from simapro_ecoinvent_elementary_flows:

> To the best of our understanding, the elementary flow list provided by PRé uses a different model for the Context value than what is used in ecoinvent. For one thing, they have one UUID for flows, regardless of the subcategory Context values. Indeed, the master data list includes only the base Category values.

The impact is that mappings from SimaPro flow lists with only base context values are mostly not 1:1. Of the 1646 unique SimaPro flows mapped in SimaProv94-ecoinventEFv3.7.csv, the breakdown is:

From the description of FlowMap and FlowMapEntry I couldn't tell whether the specs were written with this case in mind, but the issue https://github.com/GreenDelta/olca-app/issues/102 seems to indicate that they were not[^20231119T173534].

[^20231119T173534]: Although this wouldn't necessarily change the openLCA schema spec.
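
To make the `1:N`, `N:1` and `N:M` labels concrete, here is a minimal sketch of how such a breakdown could be computed from a list of (source, target) pairs. The function name and the flow identifiers are hypothetical and are not taken from the mapping file or from flowmapper itself.

```python
from collections import Counter, defaultdict


def classify_cardinality(pairs):
    """Label each (source, target) pair as 1:1, 1:N, N:1 or N:M."""
    targets_of = defaultdict(set)  # source -> all targets it maps to
    sources_of = defaultdict(set)  # target -> all sources mapped onto it
    for source, target in pairs:
        targets_of[source].add(target)
        sources_of[target].add(source)

    labels = {}
    for source, target in set(pairs):
        many_targets = len(targets_of[source]) > 1
        many_sources = len(sources_of[target]) > 1
        if many_targets and many_sources:
            labels[(source, target)] = "N:M"
        elif many_targets:
            labels[(source, target)] = "1:N"
        elif many_sources:
            labels[(source, target)] = "N:1"
        else:
            labels[(source, target)] = "1:1"
    return labels


# Hypothetical identifiers, for illustration only.
pairs = [
    ("sp-ammonia", "ei-ammonia-urban"),
    ("sp-ammonia", "ei-ammonia-rural"),
    ("sp-co2", "ei-co2"),
    ("sp-water-a", "ei-water"),
    ("sp-water-b", "ei-water"),
]
labels = classify_cardinality(pairs)
breakdown = Counter(labels.values())
```

Here `breakdown` counts mapping rows per label; counting unique source flows instead would only require deduplicating on the source identifier first.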

cmutel commented 7 months ago

I think we might be going down the wrong path here - the provided SimaPro flow list is not what we would see in the real world. Here is an extract of an actual flow list, including full contexts:

simapro-flows.json.zip

(Sorry, should have communicated on this more clearly earlier)

cmutel commented 7 months ago

@fjuniorr You will notice that there will be some flows like Ammonia, NL, which are squishing a name and a location code together. We need to separate these in a separate transformation, and store the location (as either location or geography).
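
A minimal sketch of such a transformation, assuming the location is always a trailing two-letter uppercase code after the last comma. That assumption, the function name and the output keys are illustrative only; a real implementation would probably check the code against a list of known locations.

```python
import re

# Assumption: the location is a trailing, comma-separated code made of two
# uppercase letters (e.g. "NL"). Anything else is treated as part of the name.
LOCATION_SUFFIX = re.compile(r"^(?P<name>.+),\s*(?P<location>[A-Z]{2})$")


def split_name_and_location(flow_name):
    """Split 'Ammonia, NL' into a name and a location; leave other names alone."""
    match = LOCATION_SUFFIX.match(flow_name)
    if match is None:
        return {"name": flow_name, "location": None}
    return {"name": match.group("name"), "location": match.group("location")}


with_location = split_name_and_location("Ammonia, NL")
without_location = split_name_and_location("Carbon dioxide, in air")
```

Names such as `Carbon dioxide, in air` are left untouched because the text after the last comma is not a two-letter uppercase code.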

The pattern I have seen is that:

fjuniorr commented 7 months ago

> I think we might be going down the wrong path here - the provided SimaPro flow list is not what we would see in the real world. Here is an extract of an actual flow list, including full contexts:

@cmutel do you know if the SimaPro flow list will always have full contexts? I was operating under the assumption that sometimes it would have, and other times it would not.

> You will notice that there will be some flows like Ammonia, NL, which are squishing a name and a location code together. We need to separate these in a separate transformation, and store the location (as either location or geography).

I will track this in https://github.com/fjuniorr/flowmapper/issues/14.

But IMHO this is a good example of a mapping that could be defined as an N:1 transformation, since ecoinvent doesn't appear to define a location as a property of a flow:

<elementaryExchange id="9990b51b-7023-4700-bca0-1a32ef921f74" unitId="487df68b-4994-4027-8fdc-a4dc298257b7" casNumber="007664-41-7">
  <name xml:lang="en">Ammonia</name>
  <unitName xml:lang="en">kg</unitName>
  <compartment subcompartmentId="e8d7772c-55ca-4dd7-b605-fee5ae764578">
    <compartment xml:lang="en">air</compartment>
    <subcompartment xml:lang="en">urban air close to ground</subcompartment>
  </compartment>
  <property propertyId="6393c14b-db78-445d-a47b-c0cb866a1b25" amount="0"/>
  <property propertyId="6d9e1462-80e3-4f10-b3f4-71febd6f1168" amount="0"/>
  <property propertyId="a9358458-9724-4f03-b622-106eda248916" amount="0">
    <comment xml:lang="en">water mass/dry mass</comment>
  </property>
  <property propertyId="c74c3729-e577-4081-b572-a283d2561a75" amount="0"/>
  <property propertyId="67f102e2-9cb6-4d20-aa16-bf74d8a03326" amount="1"/>
  <property propertyId="3a0af1d6-04c3-41c6-a3da-92c4f61e0eaa" amount="1"/>
</elementaryExchange>

cmutel commented 7 months ago

> @cmutel do you know if the SimaPro flow list will always have full contexts? I was operating under the assumption that sometimes it would have, and other times it would not.

It always will. The final substance list you linked to was prepared for me by special request; the data we get from SimaPro exports always have the full context.

> But IMHO this is a good example of a mapping that could be defined as an N:1 transformation, since ecoinvent doesn't appear to define a location as a property of a flow

Ecoinvent doesn't because it is wrong (I will die on this particular hill). Elementary flows - the things we are talking about - are not located in time and space. That information comes from the processes which they interact with.

The inclusion of locations in elementary flow lists is a horrible hack to overcome poor data model choices or an inability to update legacy software.

Which is a long way of saying that you are correct 😛

fjuniorr commented 7 months ago

> > @cmutel do you know if the SimaPro flow list will always have full contexts? I was operating under the assumption that sometimes it would have, and other times it would not.
>
> It always will. The final substance list you linked to was prepared for me by special request; the data we get from SimaPro exports always have the full context.

In one list we have:

  {
    "name": "Carbon dioxide, in air",
    "context": [
      "Resources",
      ""
    ],
    "unit": "kg"
  }

and in the one you linked above we have:

  {
    "name": "Carbon dioxide, in air",
    "categories": [
      "Raw",
      "(unspecified)"
    ],
    "unit": "kg",
    "CAS": "000124-38-9"
  }

From what you wrote here, I think the difference arises because exports from the master flow list, an LCI file, and an LCIA file can be slightly different, correct?

Also, is it safe to assume that an empty secondary context is always equivalent to `(unspecified)`, and that both flows are the same?

cmutel commented 7 months ago

> the data we get from SimaPro exports always have the full context.

Well, you learn something new every day. I thought that this would always be ["Resources", "(unspecified)"]. I wonder if this was a data entry mistake, not from the "official" flow list.

> From what you wrote here I think the difference is because exports from master flow list / LCI file / LCIA file can be slightly different, correct?

I guess so? But I think we won't be able to capture the full creativity of SimaPro users, and if we fail under unexpected circumstances that is OK.

> Also, is it safe to assume that empty secondary contexts should always be equal to unspecified and both flows are the same?

Yes - and Brightway even removes (unspecified).
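
As a rough sketch of that rule, a context could be normalized by dropping empty and `(unspecified)` parts before comparison. The function name is hypothetical, and this is not Brightway's or flowmapper's actual implementation; it also does not reconcile different top-level names such as `Resources` and `Raw`.

```python
# Assumption: only "" and "(unspecified)" count as "no secondary context".
UNSPECIFIED = {"", "(unspecified)"}


def normalize_context(context):
    """Drop empty or '(unspecified)' parts so equivalent contexts compare equal."""
    return tuple(
        part.strip()
        for part in context
        if part.strip().lower() not in UNSPECIFIED
    )


empty_secondary = normalize_context(["Resources", ""])
explicit_unspecified = normalize_context(["Resources", "(unspecified)"])
```

With this normalization, `["Resources", ""]` and `["Resources", "(unspecified)"]` both reduce to `("Resources",)`.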

fjuniorr commented 7 months ago

I'm closing this since the actionable items have their own issues.