IM3S / LDAR_Sim


Thinking about emissions data #106

Open tarcadius opened 3 years ago

tarcadius commented 3 years ago

The purpose of this issue is to document a discussion around how the virtual world in LDAR-Sim produces emissions. A number of factors motivate the need for this issue:

  1. Emissions from the real world are represented in LDAR-Sim using empirical data. All real-world measurement campaigns are incomplete in some way. For example, OGI surveys may miss large sources, while screening technologies may miss small sources. Raw emissions distributions from measurement campaigns are therefore not representative of reality, so using them as the basis for the virtual world creates a biased model.
  2. Measurements that go into LDAR-Sim can work in two ways: bottom-up measurements of individual leaks/events, or top-down site-level measurements. Bottom-up measurements are usually incomplete. Top-down measurements are more complete, but miss anything below the detection limit and can't distinguish between fugitives and vents. Right now LDAR-Sim uses bottom-up measurements. Top-down measurements are probably better (they include every source at a site), but would need to be disaggregated into individual sources.
  3. LPR is (must be) tied to the method used to collect the empirical bottom-up data. If the detection limit is high, the measured LPR will underestimate the true LPR. Also, LPR conceptually doesn't apply to top-down measurements, which include regular by-design sources that are not "produced" according to an LPR but instead persist.
  4. Efforts have been made to develop extremely comprehensive bottom-up inventories (e.g., Rutherford paper) that could be built into LDAR-Sim. It's going to be some work for someone to explore how this might play out/work. It would likely require important structural changes to LDAR-Sim to incorporate individual components, the right equipment, and standard facility types.
  5. A significant amount of top-down site-level data also exists (e.g., Alvarez). Should we focus on using this data and disaggregating it?
  6. The best solution might be to combine 4 and 5, doing what Rutherford et al. did but working backwards to disaggregate site-level emissions distributions into individual sources.
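
To make points 1 and 3 concrete, here is a minimal sketch of how a lower detection limit biases both the empirical leak-size distribution and the apparent LPR. Everything in it is an assumption for illustration: the lognormal "true" population, the threshold of 5.0, the number of site-days, and the function names are hypothetical and are not taken from LDAR-Sim.

```python
import random
import statistics

def simulate_true_leaks(n_leaks, seed=0):
    """Draw a hypothetical heavy-tailed 'true' leak population (arbitrary units)."""
    rng = random.Random(seed)
    return [rng.lognormvariate(0.0, 1.5) for _ in range(n_leaks)]

def observe(leaks, detection_limit):
    """Keep only leaks at or above the detection limit (a hard cutoff, for simplicity)."""
    return [q for q in leaks if q >= detection_limit]

# Assumed campaign: 10,000 true leaks appeared over 1,000,000 site-days.
site_days = 1_000_000
true_leaks = simulate_true_leaks(10_000)
seen = observe(true_leaks, detection_limit=5.0)

true_lpr = len(true_leaks) / site_days   # leaks per site-day
apparent_lpr = len(seen) / site_days     # what the campaign would report

true_mean = statistics.mean(true_leaks)
apparent_mean = statistics.mean(seen)

print(f"LPR:  true={true_lpr:.5f}  apparent={apparent_lpr:.5f}")
print(f"mean: true={true_mean:.2f}  apparent={apparent_mean:.2f}")
```

Under these assumptions the campaign reports fewer leaks per site-day than actually occurred and a larger mean leak size, so feeding the raw observed sample back into the virtual world would understate leak frequency and overstate typical leak size.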

As more of a broader, philosophical question, should we focus on trying to reproduce reality or simply what's relevant in a policy context (i.e., OGI)? Trying to reproduce reality, especially anything below detection limits for OGI, is really difficult and likely not very important.

Other thoughts/considerations? @tbarchyn @KeeganShaw-GIS @soroushojagh @AkaMrmuffin @tybob-gough @c-vollrath

tbarchyn commented 3 years ago

Sure I'll bite. I think LDAR Sim faces several challenges (some alluded to above):

  1. LDAR is only applicable to one component of emissions - a full emissions management tool may have broader use cases.
  2. Empirical data could (should) play a larger role. This circles out to the accuracy of the tool, both for evaluation (e.g., is it right?) and for improvement (e.g., can empirical data parameterize the model better?). Evaluation is pretty important for survival of the whole approach (LDAR Sim / FEAST, etc.) because models that are more accurate are more valuable. And of course missing out on new empirical data to make it better (e.g., at different scales) is a lost opportunity.

To revise LDAR Sim into a tool that is about emissions management, it may be worthwhile to brainstorm what such a tool should look like before adapting code. I think an emissions management toolbench has roughly 3 different components:

  1. A digital twin of emissions, with variable fidelity. This is a full statistical representation of emissions from sites, incorporating data at various scales. For full disclosure, we're slowly working on this for pomelo - I'm really keen to build the statistics tools to mix scales of data because data at different scales is what we have a lot of and I'm so nervous about selective data curation as a tool of narrative control.
  2. An evolution model for emissions at sites. What happens to emissions at these sites through time? This is probably mostly LPR, but the scope needs to be expanded if changing to emissions in general. I think the idea applies at site scale too, but probably can't call it LPR.
  3. Treatment models, this is LDAR basically. Things that change the digital twin, but from the perspective of emissions in general, there is more than just LDAR that changes emissions. This feels like the most mature component. There has been so much effort on producing pretty detailed and deterministic treatment models for things like OGI, but really 1. and 2. have not received the same brainpower, and the volume of empirical data beginning to emerge probably suggests 1. and 2. should be the focus.
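
A toy sketch of how these three components could fit together, assuming a single site, a fixed per-step probability of a new leak, and a hard detection limit. All class names, function names, and parameter values are hypothetical; this is not how LDAR Sim is structured today.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Source:
    rate: float        # emission rate, arbitrary units
    repairable: bool   # True for fugitive leaks, False for by-design vents

@dataclass
class Site:
    """Component 1: a (very low fidelity) digital twin of one site."""
    sources: list = field(default_factory=list)

    def total(self):
        return sum(s.rate for s in self.sources)

def evolve(site, rng, new_leak_prob=0.05):
    """Component 2: one time step of untreated change (an LPR-like process)."""
    if rng.random() < new_leak_prob:
        site.sources.append(Source(rate=rng.lognormvariate(0.0, 1.0), repairable=True))

def treat(site, detection_limit=1.0):
    """Component 3: an LDAR-like survey that removes detected fugitive leaks."""
    site.sources = [
        s for s in site.sources
        if not (s.repairable and s.rate >= detection_limit)
    ]

def run(treated, steps=365, survey_every=90, seed=1):
    """Simulate one site and return cumulative emissions over all steps."""
    rng = random.Random(seed)
    site = Site(sources=[Source(rate=2.0, repairable=False)])  # one permanent vent
    cumulative = 0.0
    for t in range(1, steps + 1):
        evolve(site, rng)
        if treated and t % survey_every == 0:
            treat(site)
        cumulative += site.total()
    return cumulative

# Same seed for both arms, so they see the same sequence of new leaks.
control = run(treated=False)
treatment = run(treated=True)
print(f"control={control:.1f}  treatment={treatment:.1f}")
```

Because the treatment step draws no random numbers, both arms see the same leaks and differ only in what the survey removes. The vent is never repaired, which is one reason the evolution model cannot be described by LPR alone.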

The same model applies to things like drug trials: 1 is all about measuring the people in the trial, 2 is all about modeling normal change over time, and 3 is all about the treatment (compared to a control). Drug trials differ from LDAR in that they are 100% empirical, but same idea.

But to circle out, my general thinking here is that we should work out precisely how to adapt LDAR Sim to (i) be more about emissions in general, and (ii) incorporate more empirical data.

The empirical data situation in methane is improving, but from a science perspective it generally remains super sketch. There is a real need for a serious and surly 'data czar' to understand precisely what any data means; it is not easy to use.