Closed cmutel closed 4 years ago
The problem with the metals is that ReCiPe input data is a lying bastard:
It uses the wrong CAS numbers for metal ions, and the same wrong CAS number for multiple ions. You can see an example of correct CAS numbers here.
Our code does the right thing - it matches each flow individually. The reason we sometimes get the same number is that the input CFs are the same for different metal ions.
I guess the easiest way to fix this is to manually fix the CAS numbers themselves - although in this case we also need to make sure that ecoinvent uses the metal ion CAS numbers as well, which could also be a problem. Yeah!
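Before fixing CAS numbers by hand, it may help to list which ones are shared by several substances in the input data. A minimal sketch, assuming the input can be reduced to `(name, cas)` pairs; the function name `cas_collisions` and the records are hypothetical, and the CAS values are placeholders rather than real registry numbers:

```python
from collections import defaultdict


def cas_collisions(records):
    """Return {cas: sorted names} for every CAS number used by more than one name."""
    names_by_cas = defaultdict(set)
    for name, cas in records:
        names_by_cas[cas].add(name)
    return {
        cas: sorted(names)
        for cas, names in names_by_cas.items()
        if len(names) > 1
    }


# Made-up records for illustration only; the CAS strings are placeholders.
records = [
    ("Metal ion A(II)", "CAS-PLACEHOLDER-1"),
    ("Metal ion A(III)", "CAS-PLACEHOLDER-1"),
    ("Substance B", "CAS-PLACEHOLDER-2"),
]

print(cas_collisions(records))
```

Each reported CAS number is a candidate for manual correction, since matching on it cannot tell the listed substances apart.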
As for duplicates in general, the issue seems to be mostly solved:
Mineral Resource Scarcity Individualist: number of CFs: 119, number of unique CFs: 117
Mineral Resource Scarcity Hierarchist: number of CFs: 119, number of unique CFs: 117
Mineral Resource Scarcity Egalitarian: number of CFs: 119, number of unique CFs: 117
Resources Mineral Resource Scarcity Individualist: number of CFs: 119, number of unique CFs: 117
Resources Mineral Resource Scarcity Hierarchist: number of CFs: 119, number of unique CFs: 117
Resources Mineral Resource Scarcity Egalitarian: number of CFs: 119, number of unique CFs: 117
But given your message above, the underlying issue is not solved.
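A check like the one above could be sketched as follows. This assumes a method's CFs are available as a list of `(flow, value)` pairs and that "unique" means distinct flows; both are assumptions, and `duplicate_report` and the sample data are hypothetical:

```python
from collections import Counter


def duplicate_report(cfs):
    """Return (number of CFs, number of unique flows, sorted list of duplicated flows)."""
    counts = Counter(flow for flow, _ in cfs)
    duplicates = sorted(flow for flow, n in counts.items() if n > 1)
    return len(cfs), len(counts), duplicates


# Made-up CFs; the flow keys and values are illustrative only.
cfs = [
    (("biosphere", "flow-a"), 1.0),
    (("biosphere", "flow-b"), 2.0),
    (("biosphere", "flow-a"), 1.0),
]

total, unique, duplicates = duplicate_report(cfs)
print(f"number of CFs: {total} number of unique CFs: {unique}")
print("duplicated flows:", duplicates)
```

Listing the duplicated flows, not just the two counts, points directly at the flows that were matched more than once.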
@romainsacchi Thanks. This was a problem with "rhodium, in ground" and "platinum, in ground". They were defined separately, but were then also copied into "Platinum-group metals". Fixed in 04254f158cdffb607e476a91b2f88bb018d36b7f.
In addition to the metal ions, the following agricultural chemicals (I think) have duplicate CFs (with different numbers): Fenpropathrin, mecoprop, FENOXYCARB, 3-METHYLPYRIDINE.
OK, this is some serious bullshit. The reason we have duplicates on these other four chemicals is that sometimes ecoinvent has just completely wrong CAS registry numbers. For example, we match both ReCiPe flows "3-METHYLPYRIDINE" and "Acrolein" to the ecoinvent flow "Acrolein" (UUID fa8bd05b-015d-5a82-878c-bde991551695, line 60654 in ElementaryExchanges.xml), but only in water. Why? Because, and again only for the flow to water, ecoinvent gives the CAS registry number of 3-Methylpyridine, though 3-Methylpyridine and Acrolein are completely different substances.
The flow of acrolein to water was added in 3.6. It's probable that the other new flows have similar CAS registry number problems.
Fenoxycarb is a different story (just to keep things spicy, I guess) - there are actually two forms of Fenoxycarb, with slightly different chemical structures and molecular weights (and therefore CAS registry numbers as well). Ecoinvent doesn't tell us which one we want, so I guess we will special case this one substance and choose the higher CF to be conservative.
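The special case described here could look roughly like this. It is only a sketch under stated assumptions: CFs are `(flow, value)` pairs, the special-cased flows are passed in explicitly, and the name `keep_highest_for` and all values are made up for illustration:

```python
def keep_highest_for(cfs, special_flows):
    """Collapse duplicates to the highest CF, but only for flows in special_flows.

    All other entries are passed through unchanged, so remaining duplicates
    stay visible instead of being silently merged.
    """
    best = {}
    untouched = []
    for flow, value in cfs:
        if flow in special_flows:
            best[flow] = max(value, best.get(flow, value))
        else:
            untouched.append((flow, value))
    return untouched + sorted(best.items())


# Made-up CF values; only the selection logic matters here.
cfs = [
    ("fenoxycarb", 1.0),
    ("other", 3.0),
    ("fenoxycarb", 2.5),
]

print(keep_highest_for(cfs, {"fenoxycarb"}))
```

Restricting the rule to an explicit set keeps the conservative choice from hiding duplicates that come from wrong CAS numbers elsewhere.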
@cmutel @romainsacchi I'm following these issues like a crunchy (but somewhat dry) soap opera. Are you going to be reporting these issues to ecoinvent and ReCiPe maintainers? It would be great to deal with these issues at the source, rather than developing patches that other method importers (e.g. future me) will have to develop in parallel...
@PascalLesage Indeed! Ecoinvent is relatively easy; for ReCiPe I want to do a more thorough analysis of the differences between the various versions of 1.1 (yes, really), as some important tox factors increased by a factor of 10,000...!
Here is another thing we are thinking about, just posted to our internal Slack:
OK, some good progress here. One non-technical question: how should we handle CFs for specific metal ions, when such ions are not included in the ecoinvent flow master data? For example, Hg(II) isn’t in ecoinvent, but has a CF. SimaPro just applies this to all cations of mercury (and similarly for many other metals), but this feels wrong. On the other hand, not assigning a CF at all also feels wrong. Opinions? Obviously we should also contact the ReCiPe team and see what they think.
From @romainsacchi. Some flows are matched twice, resulting in too high CFs. For example:
This is because there are multiple objects in ReCiPe that correspond to the same ecoinvent flow, or because we mistakenly think that this is the case.