eco4cast / unconf-2023

Brainstorming repo to propose and discuss unconference project ideas!

Enhancing decision relevance of ocean forecasting efforts through ML emulation of process-based models. #22

Open nonlinearnature opened 1 year ago

nonlinearnature commented 1 year ago

Background

There is a rising need and opportunity for ocean management and planning that is ecologically based, in the broadest sense of fully integrated sociocultural and biophysical systems. Marine protected area network design and offshore wind lease planning are two examples. Ecosystem insights often boil down to questions about alternative scenarios, both in the management decisions (e.g. which polygon to draw for a protected area or lease block) and in the ecological context that will determine their success or failure in achieving just and sustainable stewardship (e.g. complex regional oceanographic consequences of broad climate warming scenarios).

(Additional background to come).

Problem Statement

Sophisticated ocean forecasting tools face major barriers to implementation in current management contexts in U.S. territorial waters. Specific causes include:

Mismatches ranging from:

Computational burden in:

Proposed Approach

Machine learning offers vast promise in bridging gaps between data and applications; however, its immediate use to solve this problem might seem exactly contrary. The most enticing way forward would perhaps be a combination of ANN and equation learning algorithms (i.e. learning a dimensionality reduction that can be expressed with algebraic/differential expressions and equations). Moreover, by running the tailoring process as a forecasting exercise, it should be possible to have an end product adapted to the particular needs of the end user/application/decision context.
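To make the equation-learning half of this idea concrete, here is a minimal sketch under strong assumptions. A toy logistic-growth simulator stands in for the expensive process-based model (it is not FVCOM, NECOFS, or any real ocean model), and a two-term least-squares fit stands in for a full equation-learning algorithm; the ANN component is left out entirely. All function names and parameter values are hypothetical.

```python
def simulate_process_model(p0, r, k, dt, n_steps, substeps=100):
    """Toy stand-in for an expensive process-based model: logistic growth
    integrated with many small Euler substeps per output step."""
    h = dt / substeps
    p, out = p0, [p0]
    for _ in range(n_steps):
        for _ in range(substeps):
            p += h * r * p * (1.0 - p / k)
        out.append(p)
    return out


def fit_growth_equation(series, dt):
    """Least-squares fit of dp/dt ~ a*p + b*p**2 to a simulated series."""
    # Central-difference estimate of dp/dt at the interior points.
    p = series[1:-1]
    dpdt = [(series[i + 1] - series[i - 1]) / (2.0 * dt)
            for i in range(1, len(series) - 1)]
    # Normal equations for the two candidate terms p and p**2.
    s11 = sum(x ** 2 for x in p)
    s12 = sum(x ** 3 for x in p)
    s22 = sum(x ** 4 for x in p)
    t1 = sum(x * y for x, y in zip(p, dpdt))
    t2 = sum(x * x * y for x, y in zip(p, dpdt))
    det = s11 * s22 - s12 * s12
    a = (t1 * s22 - s12 * t2) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b


def emulate(p0, a, b, dt, n_steps):
    """Cheap emulator: one RK4 step of the learned equation per output step."""
    def f(p):
        return a * p + b * p * p
    p, out = p0, [p0]
    for _ in range(n_steps):
        k1 = f(p)
        k2 = f(p + 0.5 * dt * k1)
        k3 = f(p + 0.5 * dt * k2)
        k4 = f(p + dt * k3)
        p += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        out.append(p)
    return out


# Hypothetical parameter values, chosen only for illustration.
series = simulate_process_model(p0=0.05, r=0.8, k=1.0, dt=0.1, n_steps=150)
a, b = fit_growth_equation(series, dt=0.1)
emulated = emulate(0.05, a, b, dt=0.1, n_steps=150)
max_error = max(abs(x - y) for x, y in zip(series, emulated))
print(f"learned dp/dt ~ {a:.3f}*p + {b:.3f}*p^2, max emulation error {max_error:.4f}")
```

The emulator takes one step where the simulator takes 100, which is the kind of cost saving the proposal is after. A real application would need a much richer candidate library and high-dimensional spatial state, which is where a learned dimensionality reduction would come in.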

nonlinearnature commented 1 year ago

Additional Background

In the Gulf of Maine, scientific observation and analysis have a comparatively long history relative to many other areas of the ocean. Nevertheless, the space of things we don’t know dwarfs that which we have even begun to understand. Powerful ocean prediction and forecasting approaches have been steadily rolling out over the past decade-plus (e.g. FVCOMs, NECOFS, biogeochemically integrated ROMs). However, these systems are typically designed and executed within a specific project context, and carrying those insights into subsequent quantitative work is still very hard.

In constructing an end-to-end dynamic and spatially explicit ecosystem model, the CHANS team at BU has been wrestling with these issues. One of the biggest gaps in understanding is at the very bottom of the trophic web. The best case would be to implement a realistic, process-based model of upwelling and primary production that can fit within the MIMES model structure, predict observed behavior, and replicate expert predictions of future change over the coming three decades. Cutting-edge research using ROMs and other physical downscaling frameworks has begun to deliver on the latter two criteria, but these are vast and complicated models that produce petabytes of data at a time. This is all to say, they do not readily fit in as a single box in a trophic model that is complicated in its own right.

cboettig commented 1 year ago

@nonlinearnature Thanks, this is really interesting, but seems pretty broad. If I'm following, the goal is to improve existing sophisticated ocean forecasting tools using ML emulation of process-based models? Is the basic goal that the ML emulation would be used to accelerate existing accurate but computationally intensive simulation-based forecasts?

Can you provide some links or citations to a specific model whose forecast performance one would compare to as a baseline? Are there some tasks you could imagine doing here with participants who are not already experts in FVCOMS/NECOFs/MIMES?

nonlinearnature commented 1 year ago

Thanks @cboettig. I think the largest barrier to productive work on this idea at the un-conference would be having the right data in hand, already sculpted and wrangled. The biggest outstanding need in ecosystem modeling on the management end is at the bottom of the food web. There have been efforts, e.g. with COBALT, to do biogeochemistry coupled to regional ocean physics. A product like that could be the focus of a self-contained emulation study. Nothing like that has been implemented as a live updating forecast model as far as I know, though.

The things I have on hand would also work for a project, I think, but are less than ideal. The NECOFS GOM3 model likely resolves a lot of the physics that are immediately relevant, and also has a forward-facing forecasting product through NERACOOS (http://134.88.228.119:8080/fvcomwms/). The nutrient concentrations themselves are only measured in a few places, but should in turn be coupled very closely to surface chlorophyll. (Speaking of which, I could possibly repackage all this as a spatial forecasting exercise for surface chlorophyll, per your project suggestion that started the thread.)

Is the basic goal that the ML emulation would be used to accelerate existing accurate but computationally intensive simulation-based forecasts?

The immediate goal would be a model that has comparable forecast skill to the much more intensive process-based model it emulates but that can be "plugged in" as a module in a spatio-temporally dynamic ecosystem model like MIMES or ATLANTIS.
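As a rough illustration of what "plugged in" and "comparable forecast skill" could mean in practice, the sketch below defines a small interface that any primary-production module would satisfy, a trivial emulator that implements it, and an RMSE skill score for comparing an emulator's output against a reference run. Everything here is hypothetical: the class and function names, the use of chlorophyll and temperature as state and forcing, and the linear form of the emulator are assumptions for illustration, not part of MIMES or ATLANTIS.

```python
import math
from typing import List, Protocol, Sequence


class PrimaryProductionModule(Protocol):
    """Interface a host ecosystem model could call once per daily time step."""

    def step(self, chlorophyll: float, temperature: float) -> float:
        ...


class LinearEmulator:
    """Hypothetical emulator: next-day value is linear in state and forcing."""

    def __init__(self, persistence: float, temp_sensitivity: float) -> None:
        self.persistence = persistence
        self.temp_sensitivity = temp_sensitivity

    def step(self, chlorophyll: float, temperature: float) -> float:
        return self.persistence * chlorophyll + self.temp_sensitivity * temperature


def run_daily(module: PrimaryProductionModule, initial: float,
              temperatures: Sequence[float]) -> List[float]:
    """Drive any module that satisfies the interface through daily forcing."""
    state, series = initial, []
    for temperature in temperatures:
        state = module.step(state, temperature)
        series.append(state)
    return series


def rmse(forecast: Sequence[float], reference: Sequence[float]) -> float:
    """Root-mean-square error of a forecast against a reference series."""
    if len(forecast) != len(reference) or not forecast:
        raise ValueError("series must be non-empty and of equal length")
    squared = sum((f - r) ** 2 for f, r in zip(forecast, reference))
    return math.sqrt(squared / len(forecast))


# Made-up coefficients and forcing, for illustration only.
emulator = LinearEmulator(persistence=0.5, temp_sensitivity=0.1)
emulated_series = run_daily(emulator, initial=2.0, temperatures=[10.0, 10.0])
```

Because the host loop depends only on the `step` signature, the process-based model and its emulator could be swapped without touching the rest of the ecosystem model, and `rmse` applied to both outputs gives one simple measure of whether skill is comparable.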

Here's an ATLANTIS example, albeit at a much coarser spatial scale than MPA and OSW planning: https://www.sciencedirect.com/science/article/pii/S030438002200148X.

Here's an overview of MIMES: https://doi.org/10.1016/j.ecoser.2015.01.004

The current MIMES modeling for OSW planning in the Gulf of Maine is still in progress. It extends across the U.S. waters of the Gulf of Maine with roughly 100,000 m^2 hexagons, a daily time-step, and 15 food-web components.

rqthomas commented 1 year ago

As the discussion continues, I want to encourage you to envision how this can apply to or benefit forecasting using NEON data in some capacity.

nonlinearnature commented 1 year ago

After some offline discussion to address Quinn's prompt above, the project would probably be better focused as "Enhancing ecological forecasting flexibility in aquatic systems through ML emulation of process-based physical models."