wacl-york / mcm-web

Code for the MCM web application
1 stars 2 forks source link

Enforce breadth-first ordered mechanism extract #175

Open stulacy opened 12 months ago

stulacy commented 12 months ago

Ideally extracted sub-mechanisms would be in breadth first order, i.e. for a CH4 extract it first shows all reactions where CH4 is a reactant, then all reactions where products of CH4 are reactants etc... The way in which this is currently achieved doesn't guarantee this order and could break in future versions of SQLite.

What currently is done is:

  1. Traverse down the mechanism recursively finding all products starting at the root VOC
  2. Find all reactions where these products are reactants
  3. Then find all species that are involved in these reactions

The reason for step 3 is that there are 2 edge cases where Step 1 doesn't find all the species involved in the mechanism. The first is that any species that are reactants alongside the root VOC won't be included, and also any other species that are reactants further down without being products themselves. The only example I've seen so far is CL so it is likely to impact inorganics only, but this cannot be guaranteed. Step 1 seems to return a breadth-first search, but this isn't explicitly ordered.

Instead it would be preferable to recursively navigate down by reactions instead to skip out step 3, as well as doing an explicit breadth-first ordering. This is possible with a recursive CTE, but it only works in its default depth-first order. Following the suggested way to do a breadth first search in the SQLite docs (3.4) doesn't work, as the UNION no longer limits duplicate reactions filling up the queue as it's now only looking for duplicate reactions at the same depth. This seems to recurse infinitely until it hits a limit and is terminated.

I think this is arising because the mechanism isn't a tree like in that example, but a directed graph. I'm not sure if there's an easy way of getting around this in SQLite, except maybe to make an Edge table (Reactant | Product | ReactionID) and use a traversal method more suited to graphs.

Otherwise an option is to iteratively traverse the mechanism at each step pulling the reactions into Ruby and doing the ordering and enforcing uniqueness there, although I'm not sure how to do this elegantly in Ruby.

stulacy commented 10 months ago

Steps 1 & 2 were combined into a single step in 6ff5bb, resolving that issue. However, the search is still implicitly rather than explicitly ordered. Explicitly ordering would involve adding the depth to the recursive CTE (3.4), but as mentioned in the above comment, this recurses infinitely.