Closed da-ekchajzer closed 9 months ago
After exchanging with @ggael and @EtienneLeesPerasso, here is the group's progress on error margins.
Notes (in French): https://wiki.boavizta.org/share/edc299e1-9d5c-4045-942d-29c8a1243883
In our case, the input data are the technical characteristics and the usage assumptions. The uncertainty depends on the strategy used:
Rarely taken into account
In the case of secondary data (impact factors), margins of error are generally given. However, the impact factors we use are not reported with uncertainties.
Rarely taken into account
The impact criteria also carry margins of error from standardization. In the case of GWP, for example, the IPCC produces confidence intervals for the normalization of greenhouse gases.
Here are several possible implementations. Note that these possibilities can be combined.
The first possibility is to propose minimum and maximum values for each input and to propagate them through the model. We can then provide the user with min & max values for all technical characteristics and impacts.
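To make the min/max option concrete, here is a minimal sketch of interval propagation. The `Interval` class, the attribute names and all numbers are hypothetical and only illustrate the idea; they are not taken from the Boavizta code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A value only known to lie between lo and hi (hypothetical helper)."""
    lo: float
    hi: float

    def __post_init__(self):
        if self.lo > self.hi:
            raise ValueError("lo must not exceed hi")

    def __add__(self, other):
        # Sum of two intervals: bounds add up.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        # Product of two intervals: take the extremes of the four corner products.
        corners = [
            self.lo * other.lo, self.lo * other.hi,
            self.hi * other.lo, self.hi * other.hi,
        ]
        return Interval(min(corners), max(corners))


# Illustrative numbers only (assumptions, not real impact factors).
ram_capacity_gb = Interval(8.0, 32.0)
impact_per_gb = Interval(1.0, 3.0)
fixed_impact = Interval(5.0, 10.0)

ram_impact = ram_capacity_gb * impact_per_gb + fixed_impact
print(ram_impact)
```

This approach is cheap and easy to explain, but the resulting range is a worst case: it says nothing about how likely values near the bounds are.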
Another possibility is to assign each value a representative probability distribution. Error propagation requires either establishing the correlations between values or assuming that there are none. We can then obtain a confidence interval at X% (typically 95%). X can be given by the user.
This is what was done by @ggael for ecodiag with log-normal distributions based on manufacturers' data.
We could infer normal distributions from the min and max bounds, assuming a standard deviation.
Note that for Energizta (github.com/boavizta/energizta/), building a collaborative dataset will, by construction, make it possible to derive probability distributions for the electrical consumption models.
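As a rough sketch of the probabilistic option, the snippet below infers a normal distribution from min/max bounds and propagates it by Monte Carlo sampling. It rests on two assumptions that would need to be discussed: the bounds are read as the central 95% interval of the distribution, and the inputs are treated as independent. The model function, parameter names and numbers are hypothetical.

```python
import random
import statistics


def normal_from_bounds(lo, hi, z=1.96):
    """Assumption: [lo, hi] is the central 95% interval of a normal distribution."""
    mean = (lo + hi) / 2
    std = (hi - lo) / (2 * z)
    return mean, std


def propagate(model, bounds, n=20_000, level=0.95, seed=0):
    """Sample every input independently and return (low, median, high) at `level`."""
    rng = random.Random(seed)
    params = {name: normal_from_bounds(lo, hi) for name, (lo, hi) in bounds.items()}
    samples = sorted(
        model(**{name: rng.gauss(mu, sigma) for name, (mu, sigma) in params.items()})
        for _ in range(n)
    )
    tail = (1 - level) / 2
    low = samples[int(tail * n)]
    high = samples[int((1 - tail) * n) - 1]
    return low, statistics.median(samples), high


# Hypothetical usage model: power (W) * duration (h) * electricity impact factor.
def usage_impact(power_w, hours, factor_kg_per_kwh):
    return power_w * hours / 1000 * factor_kg_per_kwh


bounds = {
    "power_w": (80, 120),
    "hours": (7000, 8760),
    "factor_kg_per_kwh": (0.05, 0.15),
}
low, median, high = propagate(usage_impact, bounds, level=0.95)
print(f"{low:.1f} <= {median:.1f} <= {high:.1f}")
```

The `level` argument plays the role of the user-supplied X. Switching to log-normal distributions, as in ecodiag, would only change the sampling step.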
Seems too heavy for our purposes
Pedigree matrices are used in LCA to determine a confidence interval based on a qualitative analysis of the data. They are often available for secondary data.
When they are not available (which is our case), some criteria are difficult to assess without access to the LCA details.
It could be interesting to use this method for impact factors from secondary sources, with the risk of having to assume certain criteria.
A Python implementation of the pedigree matrix: https://github.com/brightway-lca/pedigree_matrix
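For reference, here is a small sketch of how pedigree scores are commonly turned into an interval. To my understanding it follows the aggregation used in the ecoinvent methodology, where the squared geometric standard deviation is exp(sqrt(sum of ln(U_i)^2)) over the basic uncertainty factor and the indicator factors. It does not use the linked library, and the factor values are placeholders, not values from a published table.

```python
import math


def pedigree_gsd2(basic_uncertainty, indicator_factors):
    """Squared geometric standard deviation from uncertainty factors (all >= 1)."""
    factors = [basic_uncertainty, *indicator_factors]
    if any(f < 1.0 for f in factors):
        raise ValueError("uncertainty factors must be >= 1")
    return math.exp(math.sqrt(sum(math.log(f) ** 2 for f in factors)))


def interval_95(median, gsd2):
    """Approximate 95% interval of a log-normal value with the given median."""
    return median / gsd2, median * gsd2


# Placeholder factors for the five indicators (reliability, completeness,
# temporal, geographical and technological correlation). These are assumptions.
placeholder_factors = [1.2, 1.1, 1.05, 1.0, 1.5]
gsd2 = pedigree_gsd2(1.05, placeholder_factors)
low, high = interval_95(100.0, gsd2)
print(f"GSD^2 = {gsd2:.2f}, interval = [{low:.1f}, {high:.1f}]")
```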
As you can see, we have many possibilities. I have my own opinion on the matter, but I'd like to hear your opinions.
A first implementation has been made for the upcoming (#v0.3). This first approach uses a min/max implementation that takes into account the min & max values of the archetype (i.e., the category of the device/components) when an attribute is completed. The error margins on the impact factors are not taken into account for now. I will document this new behaviour in the documentation.
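A minimal sketch of that behaviour, as I understand it from the description above: a value given by the user carries no spread, while a value completed from the archetype carries the archetype's min & max. The names (`Bounded`, `SERVER_ARCHETYPE`, `complete_attribute`) and the numbers are hypothetical, not the actual v0.3 code.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Bounded:
    """A value with the bounds it may take (hypothetical structure)."""
    value: float
    lower: float
    upper: float


# Hypothetical archetype: default value plus min & max per attribute.
SERVER_ARCHETYPE: Dict[str, Bounded] = {
    "ram_capacity_gb": Bounded(value=32.0, lower=8.0, upper=128.0),
    "cpu_units": Bounded(value=2.0, lower=1.0, upper=4.0),
}


def complete_attribute(
    name: str, user_value: Optional[float], archetype: Dict[str, Bounded]
) -> Bounded:
    """Use the user's value if given, else fall back to the archetype's range."""
    if user_value is not None:
        # Known input: min and max collapse onto the value.
        return Bounded(user_value, user_value, user_value)
    # Completed input: inherit default, min and max from the archetype.
    return archetype[name]


known = complete_attribute("ram_capacity_gb", 64.0, SERVER_ARCHETYPE)
completed = complete_attribute("cpu_units", None, SERVER_ARCHETYPE)
print(known, completed)
```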
Problem
Our impact modeling is subject to very large margins of error on two types of values:
To account for the difficulty and imprecision of the assessments and to provide transparency to users, we should report margins of error on the returned impacts.
Evaluate error margin
How should we evaluate the error margins from both the technical characteristics and the impact factors?
We should think more about how to implement this feature. Any suggestions? @AirLoren, @samuelrince, @benjaminlebigot?
@ThibaultPirson should send us some interesting data.
Implementation
We should add an error_margin_% attribute, expressed as a percentage.
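One possible way to compute such an attribute, sketched under an assumption the issue does not settle: the margin is taken as the largest deviation from the returned value, relative to that value. The function name and the response shape are hypothetical.

```python
def with_error_margin(value: float, lower: float, upper: float) -> dict:
    """Attach an error margin (in percent) to an impact value.

    Assumption: the margin is the largest deviation from `value`,
    expressed relative to `value`.
    """
    if value <= 0:
        raise ValueError("value must be strictly positive")
    if not lower <= value <= upper:
        raise ValueError("value must lie within [lower, upper]")
    margin = max(value - lower, upper - value) / value * 100
    return {"value": value, "error_margin_%": round(margin, 1)}


result = with_error_margin(100.0, 80.0, 150.0)
print(result)
```

A single percentage hides asymmetric ranges, so returning the min & max alongside it may still be useful.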