Boavizta / boaviztapi

🛠 Giving access to BOAVIZTA reference data and methodologies through a RESTful API
GNU Affero General Public License v3.0

Implementation of the concepts of warning and margin of error #149

Closed da-ekchajzer closed 1 year ago

da-ekchajzer commented 1 year ago

This PR proposes a technical implementation of error margins and warnings at the level of impacts and boattributes. It adds the impact object, which allows tracing the metadata related to the environmental impacts.

It assumes that impact functions return: value, significant_figure, error_margin, warnings.
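
As a rough illustration of what such an impact object could look like, here is a minimal sketch. The class name `Impact`, the `to_json` method and the function `manufacture_gwp_example` are hypothetical names chosen for this example, not the PR's actual code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Impact:
    """Hypothetical container for one impact value and its metadata."""
    value: float
    significant_figures: int
    error_margin: float = 0.0
    warnings: List[str] = field(default_factory=list)

    def to_json(self) -> dict:
        # Emit "warnings" only when there is something to report.
        out = {
            "value": self.value,
            "significant_figures": self.significant_figures,
            "error_margin": self.error_margin,
        }
        if self.warnings:
            out["warnings"] = list(self.warnings)
        return out


def manufacture_gwp_example():
    # Hypothetical impact function returning the four elements listed above.
    return 66.1, 3, 0, ["example warning"]


impact = Impact(*manufacture_gwp_example())
print(impact.to_json())
```

Keeping the metadata next to the value means every level of the output (component or device) can carry its own warnings without a separate structure.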

Related issues : https://github.com/Boavizta/boaviztapi/issues/147 https://github.com/Boavizta/boaviztapi/issues/129

Note that this is a proposal. If we agree on it, I will update the tests accordingly.

List of modifications:

The JSON output would look something like this:

{
  "impacts": {
    "gwp": {
      "manufacture": {
        "value": 1900,
        "significant_figures": 2,
        "error_margin": 0
      },
      "use": {
        "value": 260,
        "significant_figures": 2,
        "error_margin": 0
      },
      "unit": "kgCO2eq",
      "description": "Effects on global warming"
    },
    "adp": {
      "manufacture": {
        "value": 0.17,
        "significant_figures": 2,
        "error_margin": 0
      },
      "use": {
        "value": 0.000128,
        "significant_figures": 3,
        "error_margin": 0
      },
      "unit": "kgSbeq",
      "description": "Use of minerals and fossil ressources"
    },
    "pe": {
      "manufacture": {
        "value": 24000,
        "significant_figures": 2,
        "error_margin": 0
      },
      "use": {
        "value": 29800,
        "significant_figures": 3,
        "error_margin": 0
      },
      "unit": "MJ",
      "description": "Consumption of primary energy"
    }
  },
  "verbose": {
    "ASSEMBLY-1": {
      "units": 1,
      "manufacture": {
        "gwp": {
          "value": 6.68,
          "significant_figures": 3,
          "error_margin": 0,
          "unit": "kgCO2eq",
          "description": "Effects on global warming"
        },
        "adp": {
          "value": 0.00000141,
          "significant_figures": 3,
          "error_margin": 0,
          "unit": "kgSbeq",
          "description": "Use of minerals and fossil ressources"
        },
        "pe": {
          "value": 68.6,
          "significant_figures": 3,
          "error_margin": 0,
          "unit": "MJ",
          "description": "Consumption of primary energy"
        }
      }
    },
    "CPU-1": {
      "units": 2,
      "manufacture": {
        "gwp": {
          "value": 34,
          "significant_figures": 2,
          "error_margin": 0,
          "unit": "kgCO2eq",
          "description": "Effects on global warming"
        },
        "adp": {
          "value": 0.02,
          "significant_figures": 2,
          "error_margin": 0,
          "unit": "kgSbeq",
          "description": "Use of minerals and fossil ressources"
        },
        "pe": {
          "value": 490,
          "significant_figures": 2,
          "error_margin": 0,
          "unit": "MJ",
          "description": "Consumption of primary energy"
        }
      },
      "core_units": {
        "value": 24,
        "status": "DEFAULT"
      },
      "die_size_per_core": {
        "value": 0.51,
        "status": "COMPLETED",
        "unit": "mm2",
        "source": "https://en.wikichip.org/wiki/intel/microarchitectures/skylake_(client)#Dual-core",
        "warnings": [
          "Maximizing value without cpu.core_unit given"
        ]
      },
      "model_range": {
        "value": "xeon gold",
        "status": "COMPLETED",
        "source": "from name"
      },
      "manufacturer": {
        "value": "intel",
        "status": "COMPLETED",
        "source": "from name"
      },
      "family": {
        "value": "skylake",
        "status": "COMPLETED",
        "source": "from name"
      }
    },
    "RAM-1": {
      "units": 12,
      "manufacture": {
        "gwp": {
          "value": 120,
          "significant_figures": 2,
          "error_margin": 0,
          "unit": "kgCO2eq",
          "description": "Effects on global warming"
        },
        "adp": {
          "value": 0.0049,
          "significant_figures": 2,
          "error_margin": 0,
          "unit": "kgSbeq",
          "description": "Use of minerals and fossil ressources"
        },
        "pe": {
          "value": 1500,
          "significant_figures": 2,
          "error_margin": 0,
          "unit": "MJ",
          "description": "Consumption of primary energy"
        }
      },
      "capacity": {
        "value": 32,
        "status": "INPUT",
        "unit": "GB"
      },
      "density": {
        "value": 0.625,
        "status": "DEFAULT",
        "unit": "GB/cm2"
      }
    },
    "SSD-1": {
      "units": 1,
      "manufacture": {
        "gwp": {
          "value": 24,
          "significant_figures": 2,
          "error_margin": 0,
          "unit": "kgCO2eq",
          "description": "Effects on global warming"
        },
        "adp": {
          "value": 0.0011,
          "significant_figures": 2,
          "error_margin": 0,
          "unit": "kgSbeq",
          "description": "Use of minerals and fossil ressources"
        },
        "pe": {
          "value": 293,
          "significant_figures": 3,
          "error_margin": 0,
          "unit": "MJ",
          "description": "Consumption of primary energy"
        }
      },
      "capacity": {
        "value": 400,
        "status": "INPUT",
        "unit": "GB"
      },
      "density": {
        "value": 50.6,
        "status": "INPUT",
        "unit": "GB/cm2"
      }
    },
    "POWER_SUPPLY-1": {
      "units": 2,
      "manufacture": {
        "gwp": {
          "value": 72.7,
          "significant_figures": 3,
          "error_margin": 0,
          "unit": "kgCO2eq",
          "description": "Effects on global warming"
        },
        "adp": {
          "value": 0.025,
          "significant_figures": 2,
          "error_margin": 0,
          "unit": "kgSbeq",
          "description": "Use of minerals and fossil ressources"
        },
        "pe": {
          "value": 1050,
          "significant_figures": 3,
          "error_margin": 0,
          "unit": "MJ",
          "description": "Consumption of primary energy"
        }
      },
      "unit_weight": {
        "value": 2.99,
        "status": "DEFAULT",
        "unit": "kg"
      }
    },
    "CASE-1": {
      "units": 1,
      "manufacture": {
        "gwp": {
          "value": 150,
          "significant_figures": 3,
          "error_margin": 0,
          "unit": "kgCO2eq",
          "description": "Effects on global warming"
        },
        "adp": {
          "value": 0.0202,
          "significant_figures": 3,
          "error_margin": 0,
          "unit": "kgSbeq",
          "description": "Use of minerals and fossil ressources"
        },
        "pe": {
          "value": 2200,
          "significant_figures": 4,
          "error_margin": 0,
          "unit": "MJ",
          "description": "Consumption of primary energy"
        }
      },
      "case_type": {
        "value": "rack",
        "status": "INPUT"
      }
    },
    "MOTHERBOARD-1": {
      "units": 1,
      "manufacture": {
        "gwp": {
          "value": 66.1,
          "significant_figures": 3,
          "error_margin": 0,
          "unit": "kgCO2eq",
          "description": "Effects on global warming"
        },
        "adp": {
          "value": 0.00369,
          "significant_figures": 3,
          "error_margin": 0,
          "unit": "kgSbeq",
          "description": "Use of minerals and fossil ressources"
        },
        "pe": {
          "value": 836,
          "significant_figures": 3,
          "error_margin": 0,
          "unit": "MJ",
          "description": "Consumption of primary energy"
        }
      }
    },
    "USAGE": {
      "use": {
        "gwp": {
          "value": 260,
          "significant_figures": 2,
          "error_margin": 0,
          "unit": "kgCO2eq",
          "description": "Effects on global warming"
        },
        "adp": {
          "value": 260,
          "significant_figures": 2,
          "error_margin": 0,
          "unit": "kgSbeq",
          "description": "Use of minerals and fossil ressources"
        },
        "pe": {
          "value": 260,
          "significant_figures": 2,
          "error_margin": 0,
          "unit": "MJ",
          "description": "Consumption of primary energy"
        }
      },
      "hours_electrical_consumption": {
        "value": 300,
        "status": "INPUT",
        "unit": "W"
      },
      "usage_location": {
        "value": "FRA",
        "status": "INPUT",
        "unit": "CodSP3 - NCS Country Codes - NATO"
      },
      "adp_factor": {
        "value": 4.86e-8,
        "status": "COMPLETED",
        "unit": "KgSbeq/kWh",
        "source": "ADEME BASE IMPACT"
      },
      "gwp_factor": {
        "value": 0.098,
        "status": "COMPLETED",
        "unit": "kgCO2e/kWh",
        "source": "https://www.sciencedirect.com/science/article/pii/S0306261921012149"
      },
      "pe_factor": {
        "value": 11.289,
        "status": "COMPLETED",
        "unit": "MJ/kWh",
        "source": "ADPf / (1-%renewable_energy)"
      },
      "use_time": {
        "value": 8785,
        "status": "INPUT",
        "unit": "hours"
      }
    }
  }
}
AirLoren commented 1 year ago

I'm not sure I understand the purpose of the Warning field. I guess this field should explain error margins, but you only provided one example, which is not easy to understand, and I am still not sure what we want to use this field for. Anyway, I understand that we will not be able to have a warning field for the server and that we will only have them for the components. This doesn't seem to be a problem, but I'd rather check. I also have a question regarding the error_margin formula used to evaluate the global (server) error_margin from the components' error_margin. I understand that the components' error_margin values are defined in our dataset, but the server error_margin should be calculated and therefore coded in the API.

da-ekchajzer commented 1 year ago

Thank you for your review.

> I'm not sure I understand the purpose of the Warning field.

It's a more general field where we can put several types of messages: inconsistencies in user inputs, the use of unreliable methods, the absence of an element in the calculation, and so on.

> Anyway, I understand that we will not be able to have a warning field for the server and that we will only have them for the components. This doesn't seem to be a problem, but I'd rather check.

We can set warnings for each attribute (Boattribute) used in the evaluation and for each impact (at component or device level). I think that if we want to set general warnings on a server (or a component), we could set them on its impacts.

"gwp": {
  "value": 66.1,
  "significant_figures": 3,
  "error_margin": 0,
  "unit": "kgCO2eq",
  "description": "Effects on global warming",
  "warnings": ["lorem ipsum"]
}

> I also have a question regarding the error_margin formula used to evaluate the global (server) error_margin from the components' error_margin. I understand that the components' error_margin values are defined in our dataset, but the server error_margin should be calculated and therefore coded in the API.

Yes, you're right. This PR was just meant to introduce the concept in the code, but we need to make methodological choices, and the way we aggregate the margin of error is one of them.

da-ekchajzer commented 1 year ago

Here are the elements I worked on following our exchanges:

Externalize component configuration

All default, min and max data are now externalized in a config file, data/config.yml. It can easily be modified by us or by people who want to implement the API.

Implementation of min & max attributes in Boattributes and Impacts

Loading min & max in Boattributes at component creation

The default, min & max values are loaded into the Boattributes from the config file when the components are created. It is possible to override the config according to the context (the CPU of a server does not have the same min/max/default values as that of a PC).
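
To make the override mechanism concrete, here is a minimal sketch. The structure of the configuration, the context name `LAPTOP`, the helper `load_attribute_conf` and all numbers are assumptions made for illustration; the real data/config.yml may be organized differently. A plain dict stands in for the parsed YAML file.

```python
from copy import deepcopy

# Hypothetical parsed content of data/config.yml (illustrative values only).
CONFIG = {
    "DEFAULT": {
        "CPU": {"core_units": {"default": 24, "min": 1, "max": 64}},
    },
    "LAPTOP": {
        # A context only lists the keys it wants to override.
        "CPU": {"core_units": {"default": 4, "max": 16}},
    },
}


def load_attribute_conf(component, attribute, context=None):
    """Return default/min/max for an attribute, with optional context override."""
    # Copy so that callers never mutate the shared configuration.
    conf = deepcopy(CONFIG["DEFAULT"][component][attribute])
    if context is not None:
        override = CONFIG.get(context, {}).get(component, {}).get(attribute, {})
        conf.update(override)
    return conf


server_cpu = load_attribute_conf("CPU", "core_units")
laptop_cpu = load_attribute_conf("CPU", "core_units", context="LAPTOP")
print(server_cpu, laptop_cpu)
```

Keys missing from the context (here `min`) fall back to the general values, so a context only needs to state what differs.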

Implementation of min/max strategy for completed data

The min/max values of the completed data are modified according to several completion strategies.

In the case where we search for a value in a filtered data frame (RAM and SSD)

In the case where we search for the closest value in a data frame (CPU)

In the case of a calculation (hour_electrical_consumption)

Min/max values are computed with the same calculation, using the min and max values of each attribute.

Propagation of min/max in impact assessments

The min and max values for each impact are computed using the min and max values of each attribute used.
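
The propagation described above can be sketched as follows. The formula (units × W / 1000 × hours × factor) is an assumption that happens to be consistent with the "Empty SSD" sample output further down; the names `Bounded` and `use_impact` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bounded:
    """A value together with its lower and upper bounds."""
    value: float
    min: float
    max: float


def use_impact(units, power_w, use_time_h, factor_per_kwh):
    """Apply the same calculation to the value, the min and the max of each input."""
    def compute(u, p, t, f):
        # W -> kW, then kWh over the use time, then impact per kWh.
        return u * (p / 1000) * t * f

    # Valid because the formula increases with every (non-negative) input,
    # so the smallest inputs give the smallest result and vice versa.
    return Bounded(
        value=compute(units.value, power_w.value, use_time_h.value, factor_per_kwh.value),
        min=compute(units.min, power_w.min, use_time_h.min, factor_per_kwh.min),
        max=compute(units.max, power_w.max, use_time_h.max, factor_per_kwh.max),
    )


# Attribute values taken from the "Empty SSD" example.
gwp_use = use_impact(
    units=Bounded(2, 1, 8),
    power_w=Bounded(0, 1, 250),
    use_time_h=Bounded(8760, 1, 87600),
    factor_per_kwh=Bounded(0.38, 0.023, 0.9),
)
print(gwp_use)  # min is about 2.3e-05 and max about 157680, before rounding
```

After rounding to one significant figure these bounds give 0.00002 and 200000, which matches the sample output. The very wide interval also shows why combining every attribute's extreme at once yields bounds that are hard to interpret.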

Factorization of the Boattribute values assignment

The assignment of Boattribute values is now done with the set_{Status} methods. Boattributes with the status "INPUT" have, by default, min = max = input.
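
A minimal sketch of what set_{Status} methods could look like is given below. The statuses INPUT, COMPLETED and DEFAULT come from the sample outputs; the method signatures and attribute names are assumptions, not the project's actual Boattribute class.

```python
from enum import Enum


class Status(Enum):
    NONE = "NONE"
    INPUT = "INPUT"
    COMPLETED = "COMPLETED"
    DEFAULT = "DEFAULT"


class Boattribute:
    """Hypothetical attribute holding a value, its status and its bounds."""

    def __init__(self, unit=None):
        self.unit = unit
        self.value = None
        self.status = Status.NONE
        self.source = None
        self.min = None
        self.max = None

    def set_input(self, value):
        # A user-provided value is taken as exact: min = max = input.
        self.value = value
        self.status = Status.INPUT
        self.min = self.max = value

    def set_completed(self, value, source=None, min_value=None, max_value=None):
        # A value inferred from other data; bounds depend on the completion strategy.
        self.value = value
        self.status = Status.COMPLETED
        self.source = source
        self.min, self.max = min_value, max_value

    def set_default(self, value, min_value=None, max_value=None):
        # A fallback value, with bounds coming from the configuration.
        self.value = value
        self.status = Status.DEFAULT
        self.min, self.max = min_value, max_value


capacity = Boattribute(unit="GB")
capacity.set_input(24)
density = Boattribute(unit="GB/cm2")
density.set_completed(53.6, source="Samsung", min_value=53.6, max_value=53.6)
print(capacity.status.value, capacity.min, capacity.max)
```

Routing every assignment through one method per status keeps the value, status and bounds consistent with each other.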

TODO

EXAMPLE

Empty SSD

{
  "impacts": {
    "gwp": {
      "manufacture": {
        "value": 100,
        "significant_figures": 2,
        "min": 2200,
        "max": 88000
      },
      "use": {
        "value": 0,
        "significant_figures": 1,
        "min": 0.00002,
        "max": 200000
      },
      "unit": "kgCO2eq",
      "description": "Effects on global warming"
    },
    "adp": {
      "manufacture": {
        "value": 0.0037,
        "significant_figures": 2,
        "min": 0.064,
        "max": 2.5
      },
      "use": {
        "value": 0,
        "significant_figures": 1,
        "min": 1e-11,
        "max": 0.05
      },
      "unit": "kgSbeq",
      "description": "Use of minerals and fossil ressources"
    },
    "pe": {
      "manufacture": {
        "value": 1280,
        "significant_figures": 3,
        "min": 27400,
        "max": 1090000
      },
      "use": {
        "value": 0,
        "significant_figures": 1,
        "min": 0.00001,
        "max": 80000000
      },
      "unit": "MJ",
      "description": "Consumption of primary energy"
    }
  },
  "verbose": {
    "units": {
      "value": 2,
      "status": "DEFAULT",
      "min": 1,
      "max": 8
    },
    "manufacture": {
      "gwp": {
        "value": 100,
        "significant_figures": 2,
        "min": 2200,
        "max": 88000,
        "unit": "kgCO2eq",
        "description": "Effects on global warming"
      },
      "adp": {
        "value": 0.0037,
        "significant_figures": 2,
        "min": 0.064,
        "max": 2.5,
        "unit": "kgSbeq",
        "description": "Use of minerals and fossil ressources"
      },
      "pe": {
        "value": 1280,
        "significant_figures": 3,
        "min": 27400,
        "max": 1090000,
        "unit": "MJ",
        "description": "Consumption of primary energy"
      }
    },
    "capacity": {
      "value": 1000,
      "status": "DEFAULT",
      "unit": "GB",
      "min": 100,
      "max": 5000
    },
    "density": {
      "value": 48.5,
      "status": "DEFAULT",
      "unit": "GB/cm2",
      "min": 0.1,
      "max": 1
    },
    "USAGE": {
      "use": {
        "gwp": {
          "value": 0,
          "significant_figures": 1,
          "min": 0.00002,
          "max": 200000,
          "unit": "kgCO2eq",
          "description": "Effects on global warming"
        },
        "adp": {
          "value": 0,
          "significant_figures": 1,
          "min": 1e-11,
          "max": 0.05,
          "unit": "kgSbeq",
          "description": "Use of minerals and fossil ressources"
        },
        "pe": {
          "value": 0,
          "significant_figures": 1,
          "min": 0.00001,
          "max": 80000000,
          "unit": "MJ",
          "description": "Consumption of primary energy"
        }
      },
      "hours_electrical_consumption": {
        "value": 0,
        "status": "DEFAULT",
        "unit": "W",
        "min": 1,
        "max": 250
      },
      "usage_location": {
        "value": "EEE",
        "status": "DEFAULT",
        "unit": "CodSP3 - NCS Country Codes - NATO"
      },
      "adp_factor": {
        "value": 6.42e-8,
        "status": "DEFAULT",
        "unit": "KgSbeq/kWh",
        "source": "ADEME BASE IMPACT",
        "min": 1.32e-8,
        "max": 2.656e-7
      },
      "gwp_factor": {
        "value": 0.38,
        "status": "DEFAULT",
        "unit": "kgCO2e/kWh",
        "source": "https://www.sciencedirect.com/science/article/pii/S0306261921012149 : \nAverage of 27 european countries",
        "min": 0.023,
        "max": 0.9
      },
      "pe_factor": {
        "value": 12.874,
        "status": "DEFAULT",
        "unit": "MJ/kWh",
        "source": "ADPf / (1-%renewable_energy)",
        "min": 0.013,
        "max": 468.15
      },
      "use_time": {
        "value": 8760,
        "status": "DEFAULT",
        "unit": "hours",
        "min": 1,
        "max": 87600
      }
    }
  }
}

Complete SSD

INPUT

{
  "units":1,
  "capacity": 24,
  "manufacturer": "Samsung",
  "usage":{
     "usage_location":"FRA",
     "hours_electrical_consumption":10,
     "hours_use_time":1
   }
}

OUTPUT

{
  "impacts": {
    "gwp": {
      "manufacture": {
        "value": 7.3,
        "significant_figures": 2,
        "min": 7.3,
        "max": 7.3
      },
      "use": {
        "value": 0.001,
        "significant_figures": 1,
        "min": 0.001,
        "max": 0.001
      },
      "unit": "kgCO2eq",
      "description": "Effects on global warming"
    },
    "adp": {
      "manufacture": {
        "value": 0.00059,
        "significant_figures": 2,
        "min": 0.00059,
        "max": 0.00059
      },
      "use": {
        "value": 5e-10,
        "significant_figures": 1,
        "min": 5e-10,
        "max": 5e-10
      },
      "unit": "kgSbeq",
      "description": "Use of minerals and fossil ressources"
    },
    "pe": {
      "manufacture": {
        "value": 89.1,
        "significant_figures": 3,
        "min": 89.1,
        "max": 89.1
      },
      "use": {
        "value": 0.1,
        "significant_figures": 1,
        "min": 0.1,
        "max": 0.1
      },
      "unit": "MJ",
      "description": "Consumption of primary energy"
    }
  },
  "verbose": {
    "units": {
      "value": 1,
      "status": "INPUT"
    },
    "manufacture": {
      "gwp": {
        "value": 7.3,
        "significant_figures": 2,
        "min": 7.3,
        "max": 7.3,
        "unit": "kgCO2eq",
        "description": "Effects on global warming"
      },
      "adp": {
        "value": 0.00059,
        "significant_figures": 2,
        "min": 0.00059,
        "max": 0.00059,
        "unit": "kgSbeq",
        "description": "Use of minerals and fossil ressources"
      },
      "pe": {
        "value": 89.1,
        "significant_figures": 3,
        "min": 89.1,
        "max": 89.1,
        "unit": "MJ",
        "description": "Consumption of primary energy"
      }
    },
    "manufacturer": {
      "value": "Samsung",
      "status": "INPUT",
      "unit": "none"
    },
    "capacity": {
      "value": 24,
      "status": "INPUT",
      "unit": "GB"
    },
    "density": {
      "value": 53.6,
      "status": "COMPLETED",
      "unit": "GB/cm2",
      "source": "Samsung",
      "min": 53.6,
      "max": 53.6
    },
    "USAGE": {
      "use": {
        "gwp": {
          "value": 0.001,
          "significant_figures": 1,
          "min": 0.001,
          "max": 0.001,
          "unit": "kgCO2eq",
          "description": "Effects on global warming"
        },
        "adp": {
          "value": 5e-10,
          "significant_figures": 1,
          "min": 5e-10,
          "max": 5e-10,
          "unit": "kgSbeq",
          "description": "Use of minerals and fossil ressources"
        },
        "pe": {
          "value": 0.1,
          "significant_figures": 1,
          "min": 0.1,
          "max": 0.1,
          "unit": "MJ",
          "description": "Consumption of primary energy"
        }
      },
      "hours_electrical_consumption": {
        "value": 10,
        "status": "INPUT",
        "unit": "W"
      },
      "usage_location": {
        "value": "FRA",
        "status": "INPUT",
        "unit": "CodSP3 - NCS Country Codes - NATO"
      },
      "adp_factor": {
        "value": 4.86e-8,
        "status": "COMPLETED",
        "unit": "KgSbeq/kWh",
        "source": "ADEME BASE IMPACT",
        "min": 4.86e-8,
        "max": 4.86e-8
      },
      "gwp_factor": {
        "value": 0.098,
        "status": "COMPLETED",
        "unit": "kgCO2e/kWh",
        "source": "https://www.sciencedirect.com/science/article/pii/S0306261921012149",
        "min": 0.098,
        "max": 0.098
      },
      "pe_factor": {
        "value": 11.289,
        "status": "COMPLETED",
        "unit": "MJ/kWh",
        "source": "ADPf / (1-%renewable_energy)",
        "min": 11.289,
        "max": 11.289
      },
      "use_time": {
        "value": 1,
        "status": "INPUT",
        "unit": "hours"
      }
    }
  }
}
Amael-PE commented 1 year ago

Hi @da-ekchajzer. First, I find it quite strange to get the significant_figures but not the number adapted accordingly. Let's take 0.000128 with 3 significant figures... It is basically 0. So either adapting the result or not providing the significant figures would be less confusing options, in my opinion. Regarding the error margin, I believe the best approach is to choose one convention (+/-, % deviation, variance, min-max, empirical variances/metrics) and stick to it. My only point about min/max is that they are easy to implement on the coding side, but they lack information on the probability distribution in the case of a non-normal law.

da-ekchajzer commented 1 year ago

Thanks for your comment.

> First, I find it quite strange to get the significant_figures but not the number adapted accordingly. Let's take 0.000128 with 3 significant figures... It is basically 0. So either adapting the result or not providing the significant figures would be less confusing options, in my opinion.

Since leading zeros are not significant figures, 0.000128 has 3 significant figures. We could have returned 0, but the rules for returning 0 would be very tricky since they depend on the unit we use. We would rather let users round the numbers on the client side if they want to.
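
To illustrate the point about leading zeros, here is a small sketch of rounding to a given number of significant figures. The function name `round_to_sig_figs` is hypothetical and this is not necessarily how the API rounds its results.

```python
import math


def round_to_sig_figs(x: float, sig: int) -> float:
    """Round x to `sig` significant figures; leading zeros do not count."""
    if x == 0:
        return 0.0
    # Position of the first significant digit, e.g. -4 for 0.000128.
    exponent = math.floor(math.log10(abs(x)))
    return round(x, sig - 1 - exponent)


print(round_to_sig_figs(0.00012849, 3))
print(round_to_sig_figs(29812.0, 3))
```

With this rule a small number such as 0.000128 keeps its three significant digits instead of collapsing to 0, whatever unit it is expressed in.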

> Regarding the error margin, I believe the best approach is to choose one convention (+/-, % deviation, variance, min-max, empirical variances/metrics) and stick to it. My only point about min/max is that they are easy to implement on the coding side, but they lack information on the probability distribution in the case of a non-normal law.

Yes! You make an important point. In this version, the min/max are meaningless because they do not correspond to entities that exist in the real world. Without knowledge of the statistical distribution of the technical configurations (which we may have one day with crowdsourcing), we can't recover them with a probabilistic approach either. As mentioned in the issue, we could use Monte Carlo, but that seems far too heavy.

One solution we thought of (and that I think I will implement) is to have a min/max relative to the user-given archetypes:

These different configurations would correspond to real entities.