USEPA / ElectricityLCI

Creative Commons Zero v1.0 Universal

eLCI standard process UUID #209

Open dt-woods opened 8 months ago

dt-woods commented 8 months ago

Setting aside the mix of version 3 and version 4 UUIDs used throughout this model, it is unclear how process UUIDs are meant to be standardized.

In version 1.0.1, the following snippet from olca_jsonld_writer.py is used to create the standard UUID, of the form ModelType.PROCESS/[category]/[location]/[name]:

process.id = _uid( # version 3 UUID creator
  olca.ModelType.PROCESS,
  _val(d_vals, 'category', default=''),  # category
  _val(d_vals, 'location', 'name', default=''), # location
  _val(d_vals, 'name') # name
)

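Because of the way _val is written, each additional 'path' argument is treated as a key into the value found by the previous one (a nested lookup). Whenever the value stored under 'location' is missing or is not itself a dict, the call returns None, and because that early return bypasses the 'default' keyword, the result is None rather than ''. In practice, "location" therefore always comes back as None (the call looks for a 'name' key inside the 'location' value).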

def _val(dict_d: dict, *path, **kvargs):
    if dict_d is None or path is None:
        return None
    v = dict_d
    for p in path:
        if not isinstance(v, dict): # (2) for 'b', finds v not a dict, returns none
            return None
        if p not in v:
            v = None
        else:
            v = v[p]  # (1) finds 'a', v = 1
    if v is None and 'default' in kvargs:
        return kvargs['default']
    return v

Demonstrated example:

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> _val(d, 'a')
1
>>> print(_val(d, 'a', 'b'))
None
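
The same behavior can be reproduced against dictionaries shaped more like a process record. This is only a sketch: _val is copied from the snippet above so the block runs on its own, and the flat and nested dictionaries are made-up examples, since the real structure of d_vals is not shown in this thread.

```python
def _val(dict_d, *path, **kvargs):
    """Copy of the helper quoted above, so this sketch is self-contained."""
    if dict_d is None or path is None:
        return None
    v = dict_d
    for p in path:
        if not isinstance(v, dict):
            return None  # early exit: the 'default' keyword is never consulted
        if p not in v:
            v = None
        else:
            v = v[p]
    if v is None and 'default' in kvargs:
        return kvargs['default']
    return v


# Hypothetical inputs (assumptions for illustration only).
flat = {'name': 'Electricity - COAL', 'location': 'AZPS'}
nested = {'name': 'Electricity - COAL', 'location': {'name': 'AZPS'}}

# 'location' holds a string, so the second key hits the early exit.
flat_result = _val(flat, 'location', 'name', default='')
# 'location' holds a dict with a 'name' key, so the nested lookup succeeds.
nested_result = _val(nested, 'location', 'name', default='')
# 'location' is absent; v becomes None and the next key hits the early exit.
missing_result = _val({}, 'location', 'name', default='')

print(repr(flat_result))
print(repr(nested_result))
print(repr(missing_result))
```

Only the nested case yields a location. In the other two cases the function returns None instead of the requested default of ''.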

My question is, what attributes are intended for the standardized string for process UUID creation (and what are their proxy values)?

https://github.com/USEPA/ElectricityLCI/blob/2232c41f2cb4fd333ad59c8710aa55906e6a7ed3/electricitylci/olca_jsonld_writer.py#L28C17-L28C17

m-jamieson commented 7 months ago

The ultimate intent here is to have UUIDs generated using the namespace of the process. In this instance, the namespace would be the full "path" to that unit process within an openLCA database. So for an electricity generation process that would be something like, "modeltype.process/22: utilities/2211: electric power generation, transmission and distribution/coal/azps/Electricity - COAL - Arizona Public Service Company". But due to this bug, all process UUIDs are generated from a path like this instead: "modeltype.process/22: utilities/2211: electric power generation, transmission and distribution/coal/none/Electricity - COAL - Arizona Public Service Company". I don't think this is the biggest concern: once the bug is fixed, the UUIDs will just change. I'm not aware of any work that relies on our process UUIDs being consistent between v1.0.1 and the next version.
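
To make the effect on the identifier concrete, here is a minimal sketch of a path-based version 3 UUID. The actual _uid helper is not shown in this thread, so make_path and make_uid are hypothetical stand-ins, and both the lowercasing of the joined path and the choice of uuid.NAMESPACE_OID are assumptions.

```python
import uuid


def make_path(*parts):
    """Hypothetical: join the parts with '/' and lowercase the result."""
    return '/'.join(str(p).strip() for p in parts).lower()


def make_uid(*parts):
    """Hypothetical stand-in for _uid: name-based (version 3) UUID of the path."""
    # NAMESPACE_OID is an assumed namespace, not confirmed by the source.
    return str(uuid.uuid3(uuid.NAMESPACE_OID, make_path(*parts)))


MODEL_TYPE = 'ModelType.PROCESS'  # placeholder for str(olca.ModelType.PROCESS)
CATEGORY = ('22: utilities/2211: electric power generation, '
            'transmission and distribution/coal')
NAME = 'Electricity - COAL - Arizona Public Service Company'

# Intended path: the location segment carries the real value.
intended_path = make_path(MODEL_TYPE, CATEGORY, 'azps', NAME)
# Path under the bug: location is None, which str() renders as 'None'.
buggy_path = make_path(MODEL_TYPE, CATEGORY, None, NAME)

intended_uid = make_uid(MODEL_TYPE, CATEGORY, 'azps', NAME)
buggy_uid = make_uid(MODEL_TYPE, CATEGORY, None, NAME)

print(intended_path)
print(buggy_path)
print(intended_uid != buggy_uid)
```

Because a version 3 UUID is a deterministic hash of the namespace and the name, each path always maps to the same identifier, and replacing "none" with the real location changes the identifier of every affected process.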