Flow mapping - splitting one flow to two

bl-young commented 4 years ago

The flow mapping tool does not allow converting one flow into two (e.g. splitting a flow like "dioxins and furans" into two target flows, each with a user-supplied conversion factor such as 0.5/0.5). This would be a helpful enhancement.

msrocka commented 4 years ago

+1 Do you have an idea how this could be expressed in the mapping table (how would such a split-mapping differ from another row in the table)?

When applying such a mapping on a database it would mean that we have to split LCIA characterization factors or process inputs and outputs where such flows are used (including formulas, uncertainty distributions, etc.). What if the target flows have different units? The conversion factors are then a mix of split and unit-conversion factors and do not need to add up to 1?

The tool also supports the mapping and replacement of product and waste flows. This would not work with such split-mappings when the product or waste flow is linked in the product system. Unlikely that someone would create such a mapping but we probably need to add another check for this.

I like the idea but there are some tricky details that need to be solved.

WesIngwersen commented 4 years ago

Currently this is expressed as two or more rows mapping the same source flow information to two or more different target flows, with the ConversionFactor in these records summing to 1. If this information was not sufficient, then a binary 'Split' field could be added accepting a '1' to indicate a split flow or '0' to indicate no split.
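
A minimal sketch of how such rows could be detected and checked. Only the `ConversionFactor` column name comes from this thread; the `source` and `target` keys, the flow names, and both function names are illustrative assumptions, not the actual mapping format or tool API.

```python
from collections import defaultdict

# Hypothetical mapping rows; only "ConversionFactor" is a name from the thread.
rows = [
    {"source": "dioxins and furans", "target": "dioxins", "ConversionFactor": 0.5},
    {"source": "dioxins and furans", "target": "furans", "ConversionFactor": 0.5},
    {"source": "methane", "target": "methane, fossil", "ConversionFactor": 1.0},
]


def find_splits(rows):
    """Group rows by source flow; keep sources mapped to more than one target."""
    by_source = defaultdict(list)
    for row in rows:
        by_source[row["source"]].append(row)
    return {src: group for src, group in by_source.items() if len(group) > 1}


def factor_sum_warnings(rows, tol=1e-9):
    """Return (source, total) for split sources whose factors do not sum to 1."""
    warnings = []
    for source, group in find_splits(rows).items():
        total = sum(r["ConversionFactor"] for r in group)
        if abs(total - 1.0) > tol:
            warnings.append((source, total))
    return warnings
```

With this approach no extra 'Split' field is needed to find split flows, since they follow from the repeated source; the sum check is reported as a warning rather than enforced.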

Yes, some of those application details would be tricky. To simplify it, it could be limited to elementary flows, and only to flows where the units are the same, so the factors would have to add up to 1. The assumption when applying it to characterization factors would be that they do not change, since they are on a unit basis, and the split flows would be assumed to be characterized the same as the original.
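
As a worked illustration of that assumption (same units, factors adding up to 1, unchanged characterization factors), the sketch below splits one exchange and shows that the characterized result stays the same. All names and numbers are made up for the example.

```python
def split_exchange(amount, targets):
    """Split an exchange amount over target flows using their conversion factors."""
    return {flow: amount * factor for flow, factor in targets.items()}


def impact(amounts, cfs):
    """Characterized result: sum of amount times characterization factor."""
    return sum(amounts[flow] * cfs[flow] for flow in amounts)


# Illustrative values only (not from any real database).
source_amount = 2.0   # amount of the original flow in one exchange
source_cf = 3.0       # characterization factor of the original flow
targets = {"dioxins": 0.5, "furans": 0.5}

split_amounts = split_exchange(source_amount, targets)
# Assumption from the comment above: each split flow keeps the original CF.
split_cfs = {flow: source_cf for flow in targets}

before = source_amount * source_cf
after = impact(split_amounts, split_cfs)
```

Because the factors add up to 1 and every target keeps the original factor, `before` and `after` are equal; this would no longer hold if the targets had different units or different characterization factors.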

msrocka commented 4 years ago

Maybe n:1 mappings could be also useful then? We could have a separate column in the mapping table that describes the mapping type: 1:1 (default), 1:n, and n:1. When updating the mapping tool it might make sense to cover all of these mapping types (as we need to split inputs, outputs, and characterization factors we can directly implement the inverse operation of merging them).
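
A rough sketch of how such a mapping-type column could be derived from plain (source, target) rows instead of being entered by hand. The function name and the extra "n:m" label for rows that are both split and merged are assumptions for illustration.

```python
from collections import Counter


def classify_rows(rows):
    """Label each unique (source, target) pair as 1:1, 1:n, n:1, or n:m."""
    targets_per_source = Counter(source for source, _ in rows)
    sources_per_target = Counter(target for _, target in rows)
    labels = []
    for source, target in rows:
        fan_out = targets_per_source[source] > 1  # source is split
        fan_in = sources_per_target[target] > 1   # target is a merge
        if fan_out and fan_in:
            labels.append("n:m")
        elif fan_out:
            labels.append("1:n")
        elif fan_in:
            labels.append("n:1")
        else:
            labels.append("1:1")
    return labels


# Hypothetical example rows.
example = [
    ("dioxins and furans", "dioxins"),
    ("dioxins and furans", "furans"),
    ("alkenes", "aliphatic organics"),
    ("alkynes", "aliphatic organics"),
    ("methane", "methane, fossil"),
]
labels = classify_rows(example)
```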

cmutel commented 9 months ago

A few comments from my side after wading into these waters...

randonneur has an explicit disaggregate verb. I'm not sure why one would want a check in the data format about the conversion factors adding to one - as @msrocka states, this is not always a true constraint. We don't put such a constraint on 1:1 matches. The user interface for generating the data files can show a warning if you think that would help prevent mistakes.
Probably our mental models are quite different, as in my head there is no such thing as an n:1 mapping: these files are directional, so this would just be separate rows that have different sources but the same target. These files can't generally be reversed; instead, one would need to run the generation software again, possibly with a different configuration or set of input data.

> The tool also supports the mapping and replacement of product and waste flows. This would not work with such split-mappings when the product or waste flow is linked in the product system.

I also didn't really understand this. Why can't edges in the graph be disaggregated as well? I also think this might be a case where the specifics of an implementation are slipping into the data format itself, and that isn't a great sign. Databases or flow lists being updated to include more detail is not an optional use case, but rather a fact about the real world. Alkenes and alkynes are aliphatic organics. So when our background database replaces aliphatic organic chemicals with alkenes and alkynes, we need to be able to replace existing edges with two links, using the given production volume ratios in the absence of additional information.
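
A minimal sketch of that edge disaggregation, assuming an edge is a simple record with a provider, a consumer, and an amount. The field names, the function name, and the production volumes are hypothetical and do not reflect any particular tool's data model.

```python
def disaggregate_edge(edge, production_volumes):
    """Replace one edge with one edge per new provider, weighted by production volume share."""
    total = sum(production_volumes.values())
    if total <= 0:
        raise ValueError("production volumes must sum to a positive number")
    return [
        {
            "provider": provider,
            "consumer": edge["consumer"],
            "amount": edge["amount"] * volume / total,
        }
        for provider, volume in production_volumes.items()
    ]


# Illustrative numbers only.
old_edge = {
    "provider": "aliphatic organic chemicals",
    "consumer": "some process",
    "amount": 10.0,
}
new_edges = disaggregate_edge(old_edge, {"alkenes": 3.0, "alkynes": 1.0})
```

Here the original 10.0 units are split 7.5 to 2.5, following the 3:1 ratio of the assumed production volumes.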

msrocka commented 9 months ago

> randonneur has an explicit disaggregate verb

Yes, a better format for describing such rules is required. The openLCA mapping format currently has a simple format for 1:1 mappings, each mapping as a row in the CSV file or UI table. For 1:n (and n:1) mappings it would be better to have a more structured format that cleanly describes such rules.

> Probably our mental models are quite different, as in my head there is no such thing as a n:1

Probably. In general I agree, and this is why our simple format also works for n:1 mappings in most cases. But there is one thing: in impact categories it could generate the same characterization factor for the same flow multiple times, and in processes it could generate multiple exchanges with the same flow but different amounts. The tool that applies such mappings/replacements could be smart enough to handle such cases (removing duplicate CFs, adding comments to the exchanges), but I think when there is a structured format for the 1:n case, there should also be a structured format for the n:1 case that describes this explicitly.
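
One way such clean-up could look, as a sketch only: summing the amounts of duplicate exchanges is an assumption (the thread only mentions adding comments to them), and both function names are hypothetical.

```python
def merge_exchanges(exchanges):
    """Sum amounts of exchanges that end up on the same flow after an n:1 mapping."""
    totals = {}
    for flow, amount in exchanges:
        totals[flow] = totals.get(flow, 0.0) + amount
    return totals


def dedupe_factors(factors):
    """Keep one characterization factor per flow; report flows with conflicting values."""
    merged, conflicts = {}, set()
    for flow, value in factors:
        if flow in merged and merged[flow] != value:
            conflicts.add(flow)  # same flow, different CF: needs a decision
        merged.setdefault(flow, value)  # the first value seen is kept
    return merged, conflicts


# Illustrative data: two source flows were mapped onto "aliphatic organics".
exchanges = [("aliphatic organics", 1.5), ("aliphatic organics", 2.5), ("methane", 1.0)]
factors = [("aliphatic organics", 2.0), ("aliphatic organics", 2.0), ("methane", 28.0)]

merged_exchanges = merge_exchanges(exchanges)
merged_factors, conflicts = dedupe_factors(factors)
```

Identical duplicate factors collapse silently, while differing factors for the same flow are reported, which is the kind of case a structured n:1 format could resolve explicitly.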

> Why can't edges in the graph be disaggregated as well? I also think this might be a case where the specifics of an implementation are slipping into the data format itself, and that isn't a great sign.

You are of course right: links can be split when a rule is provided that contains all the required information (what should be linked afterwards, which factors should be applied, etc.). I think it is more the current format that hinders the implementation.

GreenDelta / olca-app

Flow mapping - splitting one flow to two #102